Feb. 5, 2021, 5:00 a.m. ET
In 2019, a source came to us with a digital file containing the precise locations of more than 12 million individual smartphones for several months in 2016 and 2017. The data is supposed to be anonymous, but it isn’t. We found celebrities, Pentagon officials and average Americans.
It became clear that this data — collected by smartphone apps and then fed into a dizzyingly complex digital advertising ecosystem — was a liability to national security, to free assembly and to citizens living mundane lives. It provided an intimate record of people whether they were visiting drug treatment centers, strip clubs, casinos, abortion clinics or places of worship.
Surrendering our privacy to the government would be foolish enough. But what is more insidious is the Faustian bargain made with the marketing industry, which turns every location ping into currency as it is bought and sold in the marketplace of surveillance advertising.
Now, one year later, we’re in a very similar position. But it’s far worse.
A source has provided another data set, this time following the smartphones of thousands of Trump supporters, rioters and passers-by in Washington, D.C., on January 6, as Donald Trump’s political rally turned into a violent insurrection. At least five people died because of the riot at the Capitol. Key to bringing the mob to justice has been the event’s digital detritus: location data, geotagged photos, facial recognition, surveillance cameras and crowdsourcing.
From Trump’s Rally to Congress
This time-lapse animation shows smartphones as they moved from Donald Trump’s rally to the Capitol.
Satellite Imagery: Microsoft Corporation, Maxar.
The sacking of the Capitol was a shocking assault on the republic and an unwelcome reminder of the fragility of American democracy. But history reminds us that sudden events — Pearl Harbor, the Soviet Union testing an atomic bomb, the Sept. 11 attacks — have led to an overreach in favor of collective security over individual liberty that we’d later regret. And more generally, the data collected on Jan. 6 is a demonstration of the looming threat to our liberties posed by a surveillance economy that monetizes the movements of the righteous and the wicked alike.
The data we were given showed what some in the tech industry might call a God-view vantage of that dark day. It included about 100,000 location pings for thousands of smartphones, revealing around 130 devices inside the Capitol exactly when Trump supporters were storming the building.
About 40 percent of the phones tracked near the rally stage on the National Mall during the speeches were also found in and around the Capitol during the siege — a clear link between those who’d listened to the president and his allies and then marched on the building.
While there were no names or phone numbers in the data, we were once again able to connect dozens of devices to their owners, tying anonymous locations back to names, home addresses, social networks and phone numbers of people in attendance. In one instance, three members of a single family were tracked in the data.
The source shared this information, in part, because the individual was outraged by the events of Jan. 6. The source wanted answers, accountability, justice. The person was also deeply concerned about the privacy implications of this surreptitious data collection. Not just that it happens, but also that most consumers don’t know it is being collected and it is insecure and vulnerable to law enforcement as well as bad actors — or an online mob — who might use it to inflict harm on innocent people. (The source asked to remain anonymous because the person was not authorized to share the data and could face severe penalties for doing so.)
“What if instead of going to you, I wanted to publish it myself?” the source told us. “What if I were vengeful? There’s nothing preventing me from doing that. It’s totally available. If I had different motives, all it would take is a few clicks, and everyone could see it.”
There is an argument to be made that this data could be properly used by law enforcement through courts, warrants and subpoenas. We used it ourselves as a journalistic tool to bring you this article. But to think that the information will be used against individuals only if they’ve broken the law is naïve; such data is collected and remains vulnerable to use and abuse whether people gather in support of an insurrection or they justly protest police violence, as happened in cities across America last summer.
The data presented here is a bird’s-eye view of an event that posed a clear and grave threat to our democracy. But it tells a second story as well: One of a broken, surreptitious industry in desperate need of regulation, and of a tacit agreement we’ve entered into that threatens our individual privacy. None of this data should ever have been collected.
This is Ronnie Vincent.
We traced a phone inside the Capitol to Mr. Vincent’s home in Kentucky. Confirming his identity led us to his Facebook page, where we found a few photos of him standing on the steps of the building during the siege. Another photo shows a crowd standing in front of the Capitol, its doors wide open.
A phone “ping” tied to Ronnie Vincent
“Yes we got inside. One girl was shot by the DC cops as she was knocking on the glass. She probably will die. We stopped the voting in the house,” he wrote.
Shortly after he posted the photos, Mr. Vincent, a pest control business owner in Kentucky who goes by the nickname Ole Woodsman, took them down. When we reached him by phone, he insisted he never entered the Capitol.
“There is no way that my phone shows me in there,” he said. Yet it did.
For all its appearance of omniscience, the data can be imprecise. In a situation such as the Capitol riot, exact locations matter. A few feet can be the difference between a participant who committed a serious crime and an onlooker.
While some location data is accurate to within a few feet, other data is not. Location companies can work with data derived from GPS sensors, Bluetooth signals and other sources. The quality depends on the settings of the phone and whether it is connected to Wi-Fi or a cell tower. Issues like population and building density can sometimes play a role in the quality of the data.
Mr. Vincent told us that when he wrote “we got inside,” he meant “we the people got in.”
He added, “I did not go in.”
Can we say definitively Mr. Vincent was inside the Capitol on Jan. 6? No, and that is one of the problems with this type of data.
The trip to Washington, D.C.
Home location has been obscured.
The day of the protest
Around the Capitol Building at 3:00 p.m.
While the power and scope of this commercial surveillance come into sharp focus when we look at the specific time of the attack on the Capitol, it’s important to remember that it is recording the movements of millions of Americans all day, all night, all year, wherever they are.
The data set Times Opinion examined shows how Trump supporters traveled from South Carolina, Florida, Ohio and Kentucky to the nation’s capital, with pings tracing neatly along major highways, in the days before the attack. Stops at gas stations, restaurants and motels dot the route like bread crumbs, each offering corroborating details.
In many cases, these trails lead from the Capitol right back to their homes.
In the hands of law enforcement, this data could be evidence. But at every other moment, the location data is reviewed by hedge funds, financial institutions and marketers, in an attempt to learn more about where we shop and how we live.
Unlike the data we reviewed in 2019, this new data included a remarkable piece of information: a unique ID for each user that is tied to a smartphone. This made it even easier to find people, since the supposedly anonymous ID could be matched with other databases containing the same ID, allowing us to add real names, addresses, phone numbers, email addresses and other information about smartphone owners in seconds.
The IDs, called mobile advertising identifiers, allow companies to track people across the internet and on apps. They are supposed to be anonymous, and smartphone owners can reset them or disable them entirely. Our findings show the promise of anonymity is a farce. Several companies offer tools to allow anyone with data to match the IDs with other databases.
The “anonymous” mobile advertising ID can be matched across databases, creating a new deanonymized database.
We were quickly able to match more than 2,000 supposedly anonymous devices in the data set with email addresses, birthdays, ethnicities, ages and more.
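To see why a shared identifier undoes the promise of anonymity, consider a minimal sketch in Python. Everything in it is invented for illustration: the records are synthetic, and the field names (`ad_id`, `lat`, `lon`) and the function `match_by_ad_id` are hypothetical. It does not describe any tool used in this investigation or sold by any company. It shows only that once two databases carry the same ID, linking them is a simple lookup.

```python
# Synthetic, invented records. None of this comes from the data set
# described in this article.
location_pings = [
    {"ad_id": "aaaa-1111", "lat": 38.8899, "lon": -77.0091},
    {"ad_id": "bbbb-2222", "lat": 38.8977, "lon": -77.0365},
    {"ad_id": "aaaa-1111", "lat": 38.8895, "lon": -77.0353},
]

# A second, hypothetical database keyed by the same advertising ID.
marketing_profiles = {
    "aaaa-1111": {"name": "Example Person", "email": "person@example.com"},
}


def match_by_ad_id(pings, profiles):
    """Attach profile fields to every ping whose ad ID appears in profiles."""
    matched = []
    for ping in pings:
        profile = profiles.get(ping["ad_id"])
        if profile is not None:
            # Merge the location record with the identifying record.
            matched.append({**ping, **profile})
    return matched


deanonymized = match_by_ad_id(location_pings, marketing_profiles)
print(len(deanonymized), "of", len(location_pings), "pings now carry a name")
```

In this toy example, two of the three pings end up attached to a name and an email address, and the only thing the two databases needed in common was the ID.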
One location data company, Cuebiq, publishes a list of customers that may receive the ID with precise smartphone locations. Companies listed there include household names like Adobe and Google, alongside a litany of lesser-known upstarts, like Hivestack, Mogean, Pelmorex and Ubimo.
In an emailed statement, Cuebiq said it prohibits attempts to merge location data with personally identifiable information and requires customers to undergo yearly third-party audits.
Smartphone users will never know if they are included in the data or whether their precise movements were sold. There are no laws forcing companies to disclose what the data is used for or for how long. There are no legal requirements to ever delete the data. Even if people could figure out where records of their locations were sold, in most states they can’t request that the data be deleted.
Their movements could be bought and sold to innumerable parties for years. And the threat that those movements could be tied back to their identity will never go away.
If the Jan. 6 rioters didn’t know before, they surely know now the cost of leaving a digital footprint. Tip lines at the Federal Bureau of Investigation have been flooded for weeks in an effort to identify participants, and detectives in Miami and other police departments are using facial recognition software. Amateur investigators on TikTok, Instagram and other platforms have launched their own identification efforts.
Law enforcement has used cellphone footage from the siege to identify participants. As of February 4, there were 181 federal cases pending against individuals involved in the Capitol Hill siege, according to an analysis by George Washington University’s program on extremism. Affidavits show that federal investigators were easily able to cross-reference footage with public social media posts.
A leak of data from the social media platform Parler also helped investigators and journalists place rioters in the building, using posts that were geotagged with GPS location data. For some, like 38-year-old Oath Keepers member Jessica Watkins, there was no need for precise location data. Her words tell the story: “Yeah. We stormed the Capitol today. Teargassed, the whole, 9. Pushed our way into the Rotunda. Made it into the Senate even,” she wrote on Parler.
Which is to say that law enforcement may not need this data. But as a recent Wall Street Journal report shows, military agencies use these data sets, and without a warrant, no less. How? They purchase them. Because we have seen what’s in the data, that revelation is deeply troubling.
While some Americans might cheer the use of location databases to identify Trump supporters who converged on the Capitol, the use of commercial databases has worrying implications for civil liberties. The American criminal justice system is set up for a judge or jury to determine whether, in fact, Ronnie Vincent broke any laws on Jan. 6. But the data leads us directly to him, and in the hands of law enforcement officials — or rogue employees of the company that collected the data — it could narrow their search for participants and offer clues about their activity.
To focus attention only on those people present at the deadly sacking of the Capitol is to lose sight of the larger context of the campaign of incitement and lies from Mr. Trump, right-wing media and members of Congress that set the stage for it. Just as focusing on the movements of Mr. Vincent’s cellphone is to lose sight of the larger surveillance ecosystem that he — and all of us — are trapped in.
The location-tracking industry exists because those in power allow it to exist. Plenty of Americans remain oblivious to this collection through no fault of their own. But many others understand what’s happening and allow it anyway. They feel powerless to stop it or have simply been seduced by the conveniences afforded in the trade-off. The dark truth is that, despite genuine concern from those paying attention, there’s little appetite to meaningfully dismantle this advertising infrastructure that undergirds unchecked corporate data collection.
This collection will only grow more sophisticated. This new data set offers proof that not only is there more interest in location data than before, but it is also easier to deanonymize. It gets easier by the day. As the data from Jan. 6 eerily demonstrates, it does not discriminate. It harvests from the phones of MAGA rioters, police officers, lawmakers and passers-by. There is no evidence, from the past or current day, that the power this data collection offers will be used only to good ends. There is no evidence that if we allow it to continue to happen, the country will be safer or fairer.
In our previous investigation, we wrote that Americans deserve the freedom to choose a life without surveillance and the government regulation that would make that possible. While we continue to believe the sentiment, we fear it may soon be obsolete or irrelevant. We deserve that freedom, but the window to achieve it narrows a little more each day. If we don’t act now, with great urgency, it may very well close for good.