data protection

EU privacy experts push a decentralized approach to COVID-19 contacts tracing

A group of European privacy experts has proposed a decentralized system for Bluetooth-based COVID-19 contacts tracing which they argue offers greater protection against abuse and misuse of people’s data than apps which pull data into centralized pots.

The protocol — which they’re calling Decentralized Privacy-Preserving Proximity Tracing (DP-PPT) — has been designed by around 25 academics from at least seven research institutions across Europe, including the Swiss Federal Institute of Technology (ETH Zurich) and KU Leuven in Belgium.

They’ve published a White Paper detailing their approach here.

The key element is that the design entails local processing of contacts tracing and risk calculation on the user’s device, based on devices generating and sharing ephemeral Bluetooth identifiers (referred to as EphIDs in the paper).

A backend server is used to push data out to devices — i.e. when an infected person is diagnosed with COVID-19 a health authority would sanction the upload from the person’s device of a compact representation of EphIDs over the infectious period which would be sent to other devices so they could locally compute whether there is a risk and notify the user accordingly.
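The flow described above — devices broadcasting rotating ephemeral IDs, an infected person’s device uploading a compact representation of its IDs, and other devices checking for matches locally — can be sketched roughly as follows. This is an illustrative sketch only, not the actual DP-PPT specification; the epoch count, key sizes and derivation function here are assumptions:

```python
import hashlib
import hmac
import os

EPOCHS_PER_DAY = 96  # assumed granularity: one EphID per 15-minute window


def daily_seed() -> bytes:
    """Each device draws a fresh random seed per day."""
    return os.urandom(32)


def ephemeral_ids(seed: bytes) -> list[bytes]:
    """Derive the day's rotating EphIDs from the seed with a keyed PRF,
    so the compact seed stands in for the whole day's identifiers."""
    return [
        hmac.new(seed, f"epoch-{i}".encode(), hashlib.sha256).digest()[:16]
        for i in range(EPOCHS_PER_DAY)
    ]


def at_risk(observed: set[bytes], infected_seeds: list[bytes]) -> bool:
    """Locally re-derive infected users' EphIDs from the published seeds and
    intersect with what this device overheard over Bluetooth -- the risk
    computation never leaves the phone."""
    for seed in infected_seeds:
        if observed & set(ephemeral_ids(seed)):
            return True
    return False
```

Under this kind of design the backend only ever redistributes the seeds sanctioned for upload by a health authority; it never learns who met whom.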

Under this design there’s no requirement for pseudonymized IDs to be centralized, where the pooled data would pose a privacy risk. Which in turn should make it easier to persuade EU citizens to trust the system — and voluntarily download a contacts-tracing app using this protocol — given it’s architected to resist being repurposed for individual-level state surveillance.

The group does discuss some other potential threats — such as those posed by tech-savvy users who could eavesdrop on data exchanged locally, and decompile/recompile the app to modify elements — but the overarching contention is that such risks are small and more manageable vs creating centralized pots of data that risk paving the way for ‘surveillance creep’, i.e. if states use a public health crisis as an opportunity to establish and retain citizen-level tracking infrastructure.

The DP-PPT has been designed with its own purpose-limited dismantling in mind, once the public health crisis is over.

“Our protocol is demonstrative of the fact that privacy-preserving approaches to proximity tracing are possible, and that countries or organisations do not need to accept methods that support risk and misuse,” writes professor Carmela Troncoso, of EPFL. “Where the law requires strict necessity and proportionality, and societal support is behind proximity tracing, this decentralized design provides an abuse-resistant way to carry it out.”

In recent weeks governments all over Europe have been leaning on data controllers to hand over user data for a variety of coronavirus tracking purposes. Apps are also being scrambled to market by the private sector — including symptom reporting apps that claim to help researchers fight the disease. While tech giants spy PR opportunities to repackage persistent tracking of Internet users for a claimed public healthcare cause, however vague the actual utility.

The next big coronavirus tech push looks likely to be contacts-tracing apps: Aka apps that use proximity-tracking Bluetooth technology to map contacts between infected individuals and others.

This is because without some form of contacts tracing there’s a risk that hard-won gains to reduce the rate of infections by curtailing people’s movements will be reversed, i.e. once economic and social activity is opened up again. Although whether contacts tracing apps can be as effective at helping to contain COVID-19 as policymakers and technologists hope remains an open question.

What’s crystal clear right now, though, is that without a thoughtfully designed protocol that bakes in privacy by design contacts-tracing apps present a real risk to privacy — and, where they exist, to hard-won human rights. 

Torching rights in the name of combating COVID-19 is neither good nor necessary — that’s the message from the group backing the DP-PPT protocol.

“One of the major concerns around centralisation is that the system can be expanded, that states can reconstruct a social graph of who-has-been-close-to-who, and may then expand profiling and other provisions on that basis. The data can be co-opted and used by law enforcement and intelligence for non-public health purposes,” explains University College London’s Dr Michael Veale, another backer of the decentralized design.

“While some countries may be able to put in place effective legal safeguards against this, by setting up a centralised protocol in Europe, neighbouring countries become forced to interoperate with it, and use centralised rather than decentralised systems too. The inverse is true: A decentralised system puts hard technical limits on surveillance abuses from COVID-19 bluetooth tracking across the world, by ensuring other countries use privacy-protective approaches.”

“It is also simply not necessary,” he adds of centralizing proximity data. “Data protection by design obliges the minimisation of data to that which is necessary for the purpose. Collecting and centralising data is simply not technically necessary for Bluetooth contact tracing.”

Last week we reported on another EU effort — by a different coalition of technologists and scientists, led by Germany’s Fraunhofer Heinrich Hertz Institute for telecoms (HHI) — which has said it’s working on a “privacy preserving” standard for COVID-19 contacts tracing which they’ve dubbed: Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT).

At the time it wasn’t clear whether or not the approach was locked to a centralized model of handling the pseudonymized IDs. Speaking to TechCrunch today, Hans-Christian Boos, one of the PEPP-PT project’s co-initiators, confirmed the standardization effort will support both centralized and decentralized approaches to handling contacts tracing.

The effort had faced criticism from some in the EU privacy community for appearing to favor a centralized rather than decentralized approach — thereby, its critics contend, undermining the core claim to preserve user privacy. But, per Boos, it will in fact support both approaches — in a bid to maximize uptake around the world.

He also said it will be interoperable regardless of whether data is centralized or decentralized. (In the centralized scenario, he said the hope is that the not-for-profit that’s being set up to oversee PEPP-PT will be able to manage the centralized servers itself, pending proper financing — a step intended to further shrink the risk of data centralization in regions that lack a human rights framework, for example.)

“We will have both options — centralized and decentralized,” Boos told TechCrunch. “We will offer both solutions, depending on who wants to use what, and we’ll make them operable. But I’m telling you that both solutions have their merits. I know that in the crypto community there is a lot of people who want decentralization — and I can tell you that in the health community there’s a lot of people who hate decentralization because they’re afraid that too many people have information about infected people.”

“In a decentralized system you have the simple problem that you would broadcast the anonymous IDs of infected people to everybody — so some countries’ health legislation will absolutely forbid that. Even though you have a cryptographic method, you’re broadcasting the IDs to all over the place — that’s the only way your local phone can find out have I been in contact or no,” Boos went on.

“That’s the drawback of a decentralized solution. Other than that it’s a very good thing. On a centralized solution you have the drawback that there is a single operator, whom you can choose to trust or not to trust — has access to anonymized IDs, just the same as if they were broadcast. So the question is you can have one party with access to anonymized IDs or do you have everybody with access to anonymized IDs because in the end you’re broadcasting them over the network [because] it’s spoofable.”

“If your assumption is that someone could hack the centralized service… then you have to also assume that someone could hack a router, which stuff goes through,” he added. “Same problem.

“That’s why we offer both solutions. We’re not religious. Both solutions offer good privacy. Your question is who would you trust more and who would you un-trust more? Would you trust more a lot of users that you broadcast something to or would you trust more someone who operates a server? Or would you trust more that someone can hack a router or that someone can hack the server? Both is possible, right. Both of these options are totally valid options — and it’s a religious discussion between crypto people… but we have to balance it between what crypto wants and what healthcare wants. And because we can’t make that decision we will end up offering both solutions.

“I think there has to be choice because if we are trying to build an international standard we should try and not be part of a religious war.”

Boos also said the project aims to conduct research into the respective protocols (centralized vs decentralized) to compare and conduct risk assessments based on access to the respective data.

“From a data protection point of view that data is completely anonymized because there’s no attachment to location, there’s no attachment to time, there’s no attachment to phone number, MAC address, SIM number, any of those. The only thing you know there is a contact — a relevant contact between two anonymous IDs. That’s the only thing you have,” he said. “The question that we gave the computer scientists and the hackers is if we give you this list — or if we give you this graph, what could you derive from it? In the graph they are just numbers connected to each other, the question is how can you derive anything from it? They are trying — let’s see what’s coming out.”

“There are lots of people trying to be right about this discussion. It’s not about being right; it’s about doing the right thing — and we will supply, from the initiative, whatever good options there are. And if each of them have drawbacks we will make those drawbacks public and we will try to get as much confirmation and research in on these as we can. And we will put this out so people can make their choices which type of the system they want in their geography,” he added.

“If it turns out that one is doable and one is completely not doable then we will drop one — but so far both look doable, in terms of ‘privacy preserving’, so we will offer both. If one turns out to be not doable because it’s hackable or you could derive meta-information at an unacceptable risk then we would drop it completely and stop offering the option.”

On the interoperability point Boos described it as “a challenge” which he said boils down to how the systems calculate their respective IDs — but he emphasized it’s being worked on and is an essential piece.

“Without that the whole thing doesn’t make sense,” he told us. “It’s a challenge why the option isn’t out yet but we’re solving that challenge and it’ll definitely work… There’s multiple ideas how to make that work.”

“If every country does this by itself we won’t have open borders again,” he added. “And if in a country there’s multiple applications that don’t share data then we won’t have a large enough set of people participating who can actually make infection tracing possible — and if there’s not a single place where we can have discussions about what’s the right thing to do about privacy well then probably everybody will do something else and half of them will use phone numbers and location information.”

The PEPP-PT coalition has not yet published its protocol or any code. Which means external experts wanting to chip in with informed feedback on specific design choices related to the proposed standard haven’t been able to get their hands on the necessary data to carry out a review.

Boos said they intend to open source the code this week, under a Mozilla licence. He also said the project is willing to take on “any good suggestions” as contributions.

“Currently only beta members have access to it because those have committed to us that they will update to the newest version,” he said. “We want to make sure that when we publish the first release of code it should have gone through data privacy validation and security validation — so we are as sure as we can be that there’s no major change that someone on an open source system might skip.”

The lack of transparency around the protocol had caused concern among privacy experts — and led to calls for developers to withhold support pending more detail. And even to speculation that European governments may be intervening to push the effort towards a centralized model — and away from core EU principles of data protection by design and default.

As it stands, the EU’s long-standing data protection law bakes in principles such as data minimization. Transparency is another core requirement. And just last week the bloc’s lead privacy regulator, the EDPS, told us it’s monitoring developments around COVID-19 contacts tracing apps.

“The EDPS supports the development of technology and digital applications for the fight against the coronavirus pandemic and is monitoring these developments closely in cooperation with other national Data Protection Supervisory Authorities. It is firmly of the view that the GDPR is not an obstacle for the processing of personal data which is considered necessary by the Health Authorities to fight the pandemic,” a spokesman told us.

“All technology developers currently working on effective measures in the fight against the coronavirus pandemic should ensure data protection from the start, e.g. by applying data protection by design principles. The EDPS and the data protection community stand ready to assist technology developers in this collective endeavour. Guidance from data protection authorities is available here: EDPB Guidelines 4/2019 on Article 25 Data Protection by Design and by Default; and EDPS Preliminary Opinion on Privacy by Design.”

We also understand the European Commission is paying attention to the sudden crop of coronavirus apps and tools — with effectiveness and compliance with European data standards on its radar.

However, at the same time, the Commission has been pushing a big data agenda as part of a reboot of the bloc’s industrial strategy that puts digitization, data and AI at the core. And just today Euractiv reported on leaked documents from the EU Council which say EU Member States and the Commission should “thoroughly analyse the experiences gained from the COVID-19 pandemic” in order to inform future policies across the entire spectrum of the digital domain.

So even in the EU there is a high level appetite for data that risks intersecting with the coronavirus crisis to drive developments in a direction that might undermine individual privacy rights. Hence the fierce push back from certain pro-privacy quarters for contacts tracing to be decentralized — to guard against any state data grabs.

For his part Boos argues that what counts as best practice ‘data minimization’ boils down to a point of view on who you trust more. “You could make an argument [for] both [decentralized and centralized approaches] that they’re data minimizing — just because there’s data minimization at one point doesn’t mean you have data minimization overall in a decentralized system,” he suggests.

“It’s a question who do you trust? It’s who would you trust more — that’s the real question. I see the critical point of data as not the list of anonymized contacts — the critical data is the confirmed infected.

“A lot of this is an old, religious discussion between centralization and decentralization,” he added. “Generally IT oscillates between those tools; total distribution, total centralization… Because none of those is a perfect solution. But here in this case I think both offer valid security options, and then they have both different implications on what you’re willing to do or not willing to do with medical data. And then you’ve got to make a decision.

“What we have to do is we’ve got to make sure that the options are available. And we’ve got to make sure there’s sound research, not just conjecture, in heavyweight discussions: How does what work, how do they compare, and what are the risks?”

In terms of who’s involved in PEPP-PT discussions, beyond direct project participants, Boos said governments and health ministries are involved for the practical reason that they “have to include this in their health processes”. “A lot of countries now create their official tracing apps and of course those should be connected to the PEPP-PT,” he said.

“We also talk to the people in the health systems — whatever is the health system in the respective countries — because this needs to in the end interface with the health system, it needs to interface with testing… it should interface with infectious disease laws so people could get in touch with the local CDCs without revealing their privacy to us or their contact information to us, so that’s the conversation we’re also having.”

Developers with early (beta) access are kicking the tyres of the system already. Asked when the first apps making use of PEPP-PT technologies might be in general circulation Boos suggested it could be as soon as a couple of weeks.

“Most of them just have to put this into their tracing layer and we’ve already given them enough information so that they know how they can connect this to their health processes. I don’t think this will take long,” he said, noting the project is also providing a tracing reference app to help countries that haven’t got developer resource on tap.

“For user engagement you’ll have to do more than just tracing — you’ll have to include, for example, the information from the CDC… but we will offer the skeletal implementation of an app to make starting this as a project [easier],” he said.

“If all the people that have emailed us since last week put it in their apps [we’ll get widespread uptake],” Boos added. “Let’s say 50% do I think we get a very good start. I would say that the influx from countries and I would say companies especially who want their workforce back — there’s a high pressure especially to go on a system that allows international exchange and interoperability.”

On the wider point of whether contacts-tracing apps are a useful tool to help control the spread of this novel coronavirus — which has shown itself to be highly infectious, more so than flu, for example — Boos said: “I don’t think there’s much argument that isolating infection is important, the problem with this disease is there’s zero symptoms while you’re already contagious. Which means that you can’t just go and measure the temperature of people and be fine. You actually need that look into the past. And I don’t think that can be done accurately without digital help.

“So if the theory that you need to isolate infection chains is true at all, which many diseases have shown that it is — but each disease is different, so there’s no 100% guarantee, but all the data speaks for it — then that is definitely something that we need to do… The argument [boils down to] if we have so many infected as we currently have, does this make sense — do we not end up very quickly, because the world is so interconnected, with the same type of lockdown mechanism?

“This is why it only makes sense to come out with an app like this when you have broken these R0 values [i.e how many other people one infected person can infect] — once you’ve got it under 1 and got the number of cases in your country down to a good level. And I think that in the language of an infectious disease person this means going back to the approach of containing the disease, rather than mitigating the disease — what we’re doing now.”

“The approach of contact chain evaluation allows you to put better priorities on testing — but currently people don’t have the real priority question, they have a resource question on testing,” he added. “Testing and tracing are independent of each other. You need both; because if you’re tracing contacts and you can’t get tested what’s that good for? So yes you definitely [also] need the testing infrastructure for sure.”

Zoom admits some calls were routed through China by mistake

Hours after security researchers at Citizen Lab reported that some Zoom calls were routed through China, the video conferencing platform has offered an apology and a partial explanation.

To recap, Zoom has faced a barrage of headlines this week over its security policies and privacy practices, as hundreds of millions forced to work from home during the coronavirus pandemic still need to communicate with each other.

The latest findings landed earlier today when Citizen Lab researchers said that some calls made in North America were routed through China — as were the encryption keys used to secure those calls. But as was noted this week, Zoom isn’t end-to-end encrypted at all, despite the company’s earlier claims, meaning that Zoom controls the encryption keys and can therefore access the contents of its customers’ calls. Zoom said in an earlier blog post that it has “implemented robust and validated internal controls to prevent unauthorized access to any content that users share during meetings.” Those controls offer no protection against Chinese authorities, however, who could demand Zoom turn over any encryption keys on its servers in China to facilitate decryption of the contents of encrypted calls.

Zoom now says that during its efforts to ramp up its server capacity to accommodate the massive influx of users over the past few weeks, it “mistakenly” allowed two of its Chinese datacenters to accept calls as a backup in the event of network congestion.

From Zoom’s CEO Eric Yuan:

During normal operations, Zoom clients attempt to connect to a series of primary datacenters in or near a user’s region, and if those multiple connection attempts fail due to network congestion or other issues, clients will reach out to two secondary datacenters off of a list of several secondary datacenters as a potential backup bridge to the Zoom platform. In all instances, Zoom clients are provided with a list of datacenters appropriate to their region. This system is critical to Zoom’s trademark reliability, particularly during times of massive internet stress.

In other words, North American calls are supposed to stay in North America, just as European calls are supposed to stay in Europe. This is what Zoom calls its datacenter “geofencing.” But when traffic spikes, the network shifts traffic to the nearest datacenter with the most available capacity.
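The failover logic Yuan describes can be sketched as follows — a hypothetical illustration only, since Zoom’s real topology and datacenter names are not public:

```python
# Hypothetical region -> datacenter maps; Zoom's actual topology is not public.
PRIMARY = {
    "north_america": ["us-east", "us-west"],
    "europe": ["eu-central", "eu-west"],
}
SECONDARY = {
    "north_america": ["us-backup-1", "us-backup-2", "us-backup-3"],
    "europe": ["eu-backup-1", "eu-backup-2"],
}


def pick_datacenter(region: str, is_reachable) -> str:
    """Geofenced selection: try the region's primary datacenters first; on
    congestion, fall back to two secondaries from the same region's backup
    list -- never crossing the fence. Zoom's reported bug was effectively a
    backup list that wrongly included its Chinese datacenters."""
    for dc in PRIMARY[region]:
        if is_reachable(dc):
            return dc
    for dc in SECONDARY[region][:2]:  # only two secondaries are attempted
        if is_reachable(dc):
            return dc
    raise ConnectionError(f"no datacenter reachable for {region}")
```

The point of the sketch is that the privacy guarantee lives entirely in the contents of the backup list: one wrong entry, and traffic quietly leaves its region.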

China, however, is supposed to be an exception, largely due to privacy concerns among Western companies. But China’s own laws and regulations mandate that companies operating on the mainland must keep citizens’ data within its borders.

Zoom said capacity it “rapidly added” to its Chinese regions in February to handle demand was also put on an international whitelist of backup datacenters, which meant non-Chinese users were in some cases connected to Chinese servers when datacenters in other regions were unavailable.

Zoom said this happened in “extremely limited circumstances.” When reached, a Zoom spokesperson did not quantify the number of users affected.

Zoom said that it has now reversed that incorrect whitelisting. The company also said users on the company’s dedicated government plan were not affected by the accidental rerouting.

But some questions remain. The blog post only briefly addresses its encryption design. Citizen Lab criticized the company for “rolling its own” encryption — otherwise known as building its own encryption scheme. Experts have long rejected efforts by companies to build their own encryption, because it doesn’t undergo the same scrutiny and peer review as the decades-old encryption standards we all use today.

Zoom said in its defense that it can “do better” on its encryption scheme, which it says covers a “large range of use cases.” Zoom also said it was consulting with outside experts, but when asked a spokesperson declined to name any.

Bill Marczak, one of the Citizen Lab researchers who authored today’s report, told TechCrunch he was “cautiously optimistic” about Zoom’s response.

“The bigger issue here is that Zoom has apparently written their own scheme for encrypting and securing calls,” he said, and that “there are Zoom servers in Beijing that have access to the meeting encryption keys.”

“If you’re a well-resourced entity, obtaining a copy of the Internet traffic containing some particularly high-value encrypted Zoom call is perhaps not that hard,” said Marczak.

“The huge shift to platforms like Zoom during the COVID-19 pandemic makes platforms like Zoom attractive targets for many different types of intelligence agencies, not just China,” he said. “Fortunately, the company has (so far) hit all the right notes in responding to this new wave of scrutiny from security researchers, and have committed themselves to make improvements in their app.”

Zoom’s blog post gets points for transparency. But the company is still facing pressure from New York’s attorney general and from two class-action lawsuits. Just today, several lawmakers demanded to know what it’s doing to protect users’ privacy.

Will Zoom’s mea culpas be enough?

Google is now publishing coronavirus mobility reports, feeding off users’ location history

Google is giving the world a clearer glimpse of exactly how much it knows about people everywhere — using the coronavirus crisis as an opportunity to repackage its persistent tracking of where users go and what they do as a public good in the midst of a pandemic.

In a blog post today the tech giant announced the publication of what it’s branding ‘COVID-19 Community Mobility Reports’. Aka an in-house analysis of the much more granular location data it maps and tracks to fuel its ad-targeting, product development and wider commercial strategy, repackaged to showcase aggregated changes in population movements around the world.

The coronavirus pandemic has generated a worldwide scramble for tools and data to inform government responses. In the EU, for example, the European Commission has been leaning on telcos to hand over anonymized and aggregated location data to model the spread of COVID-19.

Google’s data dump looks intended to dangle a similar idea of public policy utility while providing an eyeball-grabbing public snapshot of mobility shifts via data pulled off of its global user-base.

In terms of actual utility for policymakers, Google’s suggestions are pretty vague. The reports could help government and public health officials “understand changes in essential trips that can shape recommendations on business hours or inform delivery service offerings”, it writes.

“Similarly, persistent visits to transportation hubs might indicate the need to add additional buses or trains in order to allow people who need to travel room to spread out for social distancing,” it goes on. “Ultimately, understanding not only whether people are traveling, but also trends in destinations, can help officials design guidance to protect public health and essential needs of communities.”

The location data Google is making public is similarly fuzzy — to avoid inviting a privacy storm — with the company writing it’s using “the same world-class anonymization technology that we use in our products every day”, as it puts it.

“For these reports, we use differential privacy, which adds artificial noise to our datasets enabling high quality results without identifying any individual person,” Google writes. “The insights are created with aggregated, anonymized sets of data from users who have turned on the Location History setting, which is off by default.”

“In Google Maps, we use aggregated, anonymized data showing how busy certain types of places are—helping identify when a local business tends to be the most crowded. We have heard from public health officials that this same type of aggregated, anonymized data could be helpful as they make critical decisions to combat COVID-19,” it adds, tacitly linking an existing offering in Google Maps to a coronavirus-busting cause.

The reports consist of per country, or per state, downloads (with 131 countries covered initially), further broken down into regions/counties — with Google offering an analysis of how community mobility has changed vs a baseline average before COVID-19 arrived to change everything.

So, for example, a March 29 report for the whole of the US shows a 47% drop in retail and recreation activity vs the pre-COVID-19 baseline; a 22% drop in grocery & pharmacy; and a 19% drop in visits to parks and beaches, per Google’s data.

While the same date report for California shows a considerably greater drop in the latter (down 38% compared to the regional baseline); and slightly bigger decreases in both retail and recreation activity (down 50%) and grocery & pharmacy (-24%).

Google says it’s using “aggregated, anonymized data to chart movement trends over time by geography, across different high-level categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential”. The trends are displayed over several weeks, with the most recent information representing 48-to-72 hours prior, it adds.

The company says it’s not publishing the “absolute number of visits” as a privacy step, adding: “To protect people’s privacy, no personally identifiable information, like an individual’s location, contacts or movement, is made available at any point.”

Google’s location mobility report for Italy, which remains the European country hardest hit by the virus, illustrates the extent of the change from lockdown measures applied to the population — with retail & recreation dropping 94% vs Google’s baseline; grocery & pharmacy down 85%; and a 90% drop in trips to parks and beaches.

The same report shows an 87% drop in activity at transit stations; a 63% drop in activity at workplaces; and an increase of almost a quarter (24%) of activity in residential locations — as many Italians stay at home, instead of commuting to work.

It’s a similar story in Spain — another country hard-hit by COVID-19. Though Google’s data for France suggests instructions to stay at home may not be quite as keenly observed by its users there, with only an 18% increase in activity at residential locations and a 56% drop in activity at workplaces. (Perhaps because the pandemic has so far had a less severe impact on France, although numbers of confirmed cases and deaths continue to rise across the region.)

While policymakers have been scrambling for data and tools to inform their responses to COVID-19, privacy experts and civil liberties campaigners have rushed to voice concerns about the impacts of such data-fuelled efforts on individual rights, while also querying the wider utility of some of this tracking.

Contacts tracing is another area where apps are fast being touted as a potential solution to get the West out of economically crushing population lockdowns — opening up the possibility of people’s mobile devices becoming a tool to enforce lockdowns, as has happened in China.

“Large-scale collection of personal data can quickly lead to mass surveillance,” is the succinct warning of a trio of academics from Imperial College London’s Computational Privacy Group, who have compiled their privacy concerns vis-a-vis COVID-19 contacts-tracing apps into a set of eight questions app developers should be asking.

Discussing Google’s release of mobile location data for a COVID-19 cause, the head of the group, Yves-Alexandre de Montjoye, gave a general thumbs up to the steps it’s taken to shrink privacy risks.

Although he also called for Google to provide more detail about the technical processes it’s using in order that external researchers can better assess the robustness of the claimed privacy protections. Such scrutiny is of pressing importance with so much coronavirus-related data grabbing going on right now, he argues.

“It is all aggregated, they normalize to a specific set of dates, they threshold when there are too few people and on top of this they add noise to make — according to them — the data differentially private. So from a pure anonymization perspective it’s good work,” de Montjoye told TechCrunch, discussing the technical side of Google’s release of location data. “Those are three of the big ‘levers’ that you can use to limit risk. And I think it’s well done.”

“But — especially in times like this when there’s a lot of people using data — I think what we would have liked is more details. There’s a lot of assumptions on thresholding, on how do you apply differential privacy, right?… What kind of assumptions are you making?” he added, querying how much noise Google is adding to the data, for example. “It would be good to have a bit more detail on how they applied [differential privacy]… Especially in times like this it is good to be… overly transparent.”
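The three “levers” de Montjoye describes — aggregation, suppressing small counts, and adding noise calibrated for differential privacy — can be sketched in a minimal Python example. To be clear, the threshold and privacy parameter below are illustrative assumptions, not Google’s actual values (the very details de Montjoye is asking the company to publish):

```python
import random
from collections import Counter

def dp_aggregate(visits, threshold=10, epsilon=0.5):
    """Aggregate raw place visits, suppress rare places, add Laplace noise.

    `threshold` and `epsilon` are hypothetical values for illustration —
    Google has not disclosed the parameters it uses.
    """
    counts = Counter(visits)  # lever 1: aggregate individual records into counts
    released = {}
    for place, n in counts.items():
        if n < threshold:  # lever 2: threshold — drop places with too few people
            continue
        # Lever 3: Laplace noise. The difference of two Exp(epsilon) draws
        # is Laplace with scale 1/epsilon, which gives epsilon-differential
        # privacy for a count query (sensitivity 1).
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        released[place] = max(0, round(n + noise))
    return released
```

A published aggregate would then typically be normalized against a baseline period (e.g. reported as a percentage change), as Google does — the point of de Montjoye’s critique is that without knowing the threshold and noise scale, outsiders cannot verify how strong the protection actually is.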

While Google’s mobility data release might appear to overlap in purpose with the Commission’s call for EU telco metadata for COVID-19 tracking, de Montjoye points out there are likely to be key differences based on the different data sources.

“It’s always a trade off between the two,” he says. “It’s basically telco data would probably be less fine-grained, because GPS is much more precise spatially and you might have more data points per person per day with GPS than what you get with mobile phone but on the other hand the carrier/telco data is much more representative — it’s not only smartphone, and it’s not only people who have latitude on, it’s everyone in the country, including non smartphone.”

There may be country specific questions that could be better addressed by working with a local carrier, he also suggested. (The Commission has said it’s intending to have one carrier per EU Member State providing anonymized and aggregated metadata.)

On the topical question of whether location data can ever be truly anonymized, de Montjoye — an expert in data reidentification — gave a “yes and no” response, arguing that original location data is “probably really, really hard to anonymize”.

“Can you process this data and make the aggregate results anonymous? Probably, probably, probably yes — it always depends. But then it also means that the original data exists… Then it’s mostly a question of the controls you have in place to ensure the process that leads to generating those aggregates does not contain privacy risks,” he added.

Perhaps a bigger question hanging over Google’s location data release is whether it has valid legal consent to track people in the first place.

While the tech giant claims the data is based on opt-ins to location tracking, the company was fined $57M by France’s data watchdog last year for a lack of transparency over how it uses people’s data.

Then, earlier this year, the Irish Data Protection Commission (DPC) — now the lead privacy regulator for Google in Europe — confirmed a formal probe of the company’s location tracking activity, following a 2018 complaint by EU consumer groups that accused Google of using manipulative tactics in order to keep tracking web users’ locations for ad-targeting purposes.

“The issues raised within the concerns relate to the legality of Google’s processing of location data and the transparency surrounding that processing,” said the DPC in a statement in February, announcing the investigation.

The legal questions hanging over Google’s consent to track likely explain the repeat references in its blog post to people choosing to opt in and having the ability to clear their Location History via settings. (“Users who have Location History turned on can choose to turn the setting off at any time from their Google Account, and can always delete Location History data directly from their Timeline,” it writes in one example.)

In addition to offering up coronavirus mobility reports — which Google specifies it will continue to do throughout the crisis — the company says it’s collaborating with “select epidemiologists working on COVID-19 with updates to an existing aggregate, anonymized dataset that can be used to better understand and forecast the pandemic”.

“Data of this type has helped researchers look into predicting epidemics, plan urban and transit infrastructure, and understand people’s mobility and responses to conflict and natural disasters,” it adds.