AI in Public Safety & Surveillance

The Ledger: The Real Good, the Real Harm, and the Truth About “Accuracy”

The honest case for AI in public safety is a balance sheet, not a sales pitch. One side: abducted children recovered through plate-reader hits, trafficking victims found faster, missing people located in hours. The other: at least 15 Americans known to have been wrongfully arrested on bad facial-recognition matches, and “97% accurate” claims that collapse under NIST testing showing false-positive rates swinging 10–100x across demographic groups. This essay puts both columns on one page, separates verified outcomes from vendor marketing, and argues the strongest defense of the technology is the one that admits its failures and fixes them: a match is a lead, not probable cause.

By Tom Hanks · June 2026 · 11 min read

There are two ways this technology gets sold to you, and both are a kind of lie by omission.

The first is the brochure. A child is taken; a camera reads the plate; an officer is at the scene in minutes; the child comes home. Roll credits. The second is the protest sign. A man is pulled from his family in his own driveway, jailed for days, for a crime committed by someone a face-matching algorithm decided looked enough like him. Both of these things happen. Both are true. And anyone who shows you only one column is selling something — a product or a panic.

So let's do the thing neither the vendor nor the activist wants to do. Let's put both columns on the same page and add them up honestly. Because the honest case for AI in public safety isn't a pitch and it isn't a scandal. It's a ledger.

The credit column: this is not hypothetical

Start with the wins, and resist the urge to wave them away, because they are real and they have names.

In September 2025, a 16-year-old at the center of an AMBER Alert was forced into a truck in Colorado. A Flock license-plate reader caught the vehicle, and officers found her safe. Boulder's police chief, Stephen Redfearn, put it plainly: "Within minutes, having an idea of where that vehicle was recently is so important."¹ Speed is the whole game in an abduction, and a network that turns a plate into a location in minutes is doing something a week of detective work used to do too late.

In November 2025, an 86-year-old man with dementia named Donald Keaton went missing in Florida. K-9 units and a helicopter found nothing across nearly four days. A drone found his heat signature under brush. The sheriff did not hedge: "Without this technology, he would not be alive today."² That same month, in Texas, a drone with a thermal camera located another missing man with dementia in about ten minutes.³ If you have ever waited on news of a wandering parent, you do not need the value of that explained.

These are the cases that survive scrutiny, because each has a named official on the record and a verifiable outcome. I'd ask you to hold them firmly, because the moment we get to the harm column, the temptation will be to forget them — and forgetting them is its own dishonesty.

A necessary footnote on the brochure

Now, the discipline this essay is built on cuts both ways, so here's where the credit column gets oversold. Flock Safety announced that its technology helped recover "six abducted children in five months" in Colorado.⁴ It may well have. But that figure comes from the company's own press release, and only one of those cases — the Boulder recovery above — is independently confirmed in news reporting. A vendor's aggregate is a starting point for verification, not the end of it.

The same company once publicized a study claiming its cameras help solve "10% of reported crime in the U.S."⁵ It's a stunning number. It's also one the academic who oversaw the study walked away from, telling reporters the underlying data was "too varied and incomplete... to do any type of meaningful statistical analysis."⁶ When the researcher you hired distances himself from your headline, the headline is marketing. Keep the recovered child in the credit column. Strike the round, unaudited percentages. That distinction — verified outcome versus vendor math — is the entire skill.

The number that decides everything

Here we reach the turn, and it hangs on a single comforting belief: the vendor says the system is 97% — or 99% — accurate, so a match is reliable enough to act on. It's an entirely reasonable thing to think. We trust a 99% number everywhere else in life; a test that's right 99 times in 100 sounds like a test you can build a case on.

But sit with a few questions before you bank on it. Accurate at what, exactly — matching two clean passport photos, or matching a grainy ATM still to a database of fifteen million faces? Accurate for whom — does the 99% hold for everyone, or does it quietly sag for some faces and not others? And accurate at which setting — because every one of these systems has a dial that trades false alarms against missed matches, and "97% accurate" never tells you where the dial was when they measured?
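The dial in that last question can be made concrete. What follows is a minimal sketch, not any vendor's method: the similarity scores are synthetic draws from two assumed bell curves, and the function name `error_rates` and the three thresholds are hypothetical choices made for illustration. It shows only the shape of the trade: raise the threshold and false matches fall while missed matches rise.

```python
import random


def error_rates(genuine, impostor, threshold):
    """Return (false_match_rate, false_non_match_rate) at a score threshold.

    genuine:  similarity scores for pairs that really are the same person
    impostor: similarity scores for pairs that are different people
    """
    false_matches = sum(score >= threshold for score in impostor)
    false_non_matches = sum(score < threshold for score in genuine)
    return false_matches / len(impostor), false_non_matches / len(genuine)


# Synthetic scores only (an assumption for illustration, not measured data).
rng = random.Random(0)
genuine = [rng.gauss(0.75, 0.10) for _ in range(10_000)]
impostor = [rng.gauss(0.40, 0.10) for _ in range(10_000)]

# Three hypothetical positions of the dial.
thresholds = [0.50, 0.60, 0.70]
rates = [error_rates(genuine, impostor, t) for t in thresholds]

for t, (fmr, fnmr) in zip(thresholds, rates):
    print(f"threshold={t:.2f}  false matches={fmr:.2%}  missed matches={fnmr:.2%}")
```

A single "accuracy" figure corresponds to one row of that output. Without the threshold, and without knowing whose faces produced the scores, the figure cannot be compared with anything.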

When you pull on those questions, the comforting number comes apart in your hands. The U.S. government's own testing lab, NIST, ran the most rigorous evaluation we have and found that across demographic groups, false-positive rates "often vary by factors of 10 to beyond 100 times."⁷ Read that again: not 10 to 100 percent — 10 to 100 times. The same algorithm that almost never falsely matches one kind of face will falsely match another kind of face a hundred times as often, with the highest false-positive rates landing on West and East African and East Asian faces, and on women more than men.⁷ NIST also called out the specific magic trick behind the marketing: vendors report accuracy "at fixed false-positive rates rather than at fixed thresholds, thereby hiding excursions in false positive rates."⁷ In plain English, the headline number is measured under conditions the street never matches.
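A back-of-envelope calculation shows why a 10x or 100x swing matters at database scale. This is a sketch under stated assumptions: the fifteen-million-face gallery is the figure used earlier in this essay, while the baseline of one false match per million comparisons is a hypothetical round number, not a NIST result or a vendor specification. It also uses the common simplification that the expected number of false candidates in a one-to-many search is roughly gallery size times the per-comparison false-match rate.

```python
def expected_false_matches(gallery_size, false_matches_per_million):
    """Approximate count of wrong faces scoring above threshold in one search."""
    return gallery_size * false_matches_per_million / 1_000_000


GALLERY_SIZE = 15_000_000   # database size from the essay's example
BASELINE_PER_MILLION = 1    # assumed baseline: 1 false match per million comparisons

# Apply the 10x and 100x demographic multipliers to the assumed baseline.
for multiplier in (1, 10, 100):
    per_million = BASELINE_PER_MILLION * multiplier
    count = expected_false_matches(GALLERY_SIZE, per_million)
    print(f"{multiplier:>3}x baseline: about {count:,.0f} false candidates per search")
```

Under these assumptions the same search returns roughly 15 wrong candidates for one group and roughly 1,500 for another. Whatever the true baseline is, the multiplier carries straight through to the number of innocent people who land on a candidate list.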

The lab-versus-life gap is the heart of it. Georgetown Law's Center on Privacy & Technology documented police feeding these systems exactly the kind of input that guarantees garbage out — forensic sketches instead of photos, celebrity look-alikes standing in for suspects, digitally edited images — and then treating whatever name came back as a lead worth an arrest.⁸ A facial-recognition score is like a breathalyzer reading taken after the machine's been dropped down the stairs: the device might be excellent, but feed it the conditions of an actual investigation and the number on the screen means far less than the number in the brochure. To be fair — and NIST is — the best algorithms have narrowed these gaps, and a few show no meaningful differential at all.⁷ The technology is not hopeless. The headline is.

The debit column: this also has names

Which brings us to the cost, and the people who paid it.

Robert Williams was arrested in his driveway in front of his daughters in January 2020, after Detroit police matched a blurry surveillance image of a shoplifter to his expired driver's-license photo. He spent roughly thirty hours in a cell for a crime he plainly didn't commit. His case became the first publicly documented wrongful arrest by facial recognition in America — and in June 2024 it ended in a settlement that forced Detroit to adopt some of the strongest police limits on the technology in the country.⁹ Detroit alone produced two more: Michael Oliver, matched for a phone-grab despite full tattoo sleeves the actual suspect didn't have;¹⁰ and Porcha Woodruff, arrested for carjacking while eight months pregnant, who had contractions in a holding cell.¹¹ All three were Black. All three were matched by the same class of system.

It wasn't only Detroit. Randal Quran Reid was jailed for six days in 2022 over a Clearview AI match to a Louisiana crime he could not have committed — he had never set foot in the state — and Jefferson Parish later settled with him for $200,000.¹² Nijeer Parks spent ten days in jail in New Jersey for a crime that happened thirty miles from where he was, with an alibi the whole time.¹³

Your map of this problem is probably out of date, and that's the most important sentence in this essay. As recently as early 2024, the documented count of these wrongful arrests was about six. By June 2026, the ACLU, which tracks them case by case, counted at least 15 known cases spanning ten states. And "known" is doing real work in that sentence, because these are only the ones that surfaced.¹⁴

Figure: "The debit column is growing." Publicly known U.S. wrongful arrests from facial-recognition misidentification rose from about 6 in early 2024 to at least 15 by mid-2026, per ACLU tracking. Known cases are almost certainly an undercount.

The honest reading isn't that the harm is rampant relative to the millions of searches run. It's that it is real, it is growing, it falls hardest on people who look a particular way, and we are almost certainly undercounting it.

And the ledger insists on honesty in this column too. Williams won a settlement that changed Detroit's rules; the system showed it can correct itself. Woodruff's own lawsuit was later dismissed by a federal judge.¹¹ The harm is real and the legal picture is messier than either side's slogan. Both of those belong on the page.

What the ledger actually adds up to

So total the columns and you get something more useful than a verdict. You get a rule.

Every serious body that has looked at this — the International Association of Chiefs of Police among them — has reached the same conclusion and written it down: a facial-recognition result is "used for lead generation purposes only," a clue and nothing more.¹⁵ Detroit's own policy, in capital letters, says a match "IS NOT TO BE CONSIDERED A POSITIVE IDENTIFICATION."¹⁵ The vendors print the same disclaimer on their own reports. On paper, everyone agrees: a match is a lead, not probable cause.

The trouble is what happens after the paper. The ACLU's review of these cases found the warning fails in practice with grim reliability: an officer gets a name from the algorithm, builds a photo lineup around that one face, and a witness — shown a suspect the machine already fingered — points to him. The "lead" launders itself into an identification, and the disclaimer in the manual never slows it down.¹⁶ A lead-only rule that no one is forced to follow is a seatbelt that unbuckles itself in the crash.

Notice what that means, though, because it's the hopeful part. Nearly every wrongful arrest in the debit column is not a story about a camera that was evil. It's a story about a lead that was treated as a conclusion — a discipline failure, not a technology failure. The plate reader that found the abducted teenager and the face-match that jailed Robert Williams are the same kind of tool used with opposite levels of rigor. The credit column and the debit column are separated less by the hardware than by whether a human being did the work the machine was never supposed to replace.

That's the altitude I'd leave you on. The fight worth having isn't "surveillance: yes or no," which is the fight both brochures and protest signs want you to have, because it's loud and it never ends. The fight worth having is over the rule between the match and the cuffs — independent corroboration before an arrest, real consequences when it's skipped, and an honest count of the times it goes wrong. I might be too optimistic that we'll adopt that discipline before the case count climbs further; the record so far is mixed. I'm not at all uncertain about the standard itself. A match is a lead. Treat it as anything more, and you've taken a tool that brings children home and pointed it at the wrong man's driveway.

A disclosure, because this is my field: I'm Chief Product Officer at IREX, a company that builds AI for public safety, and the views here are my own, not the company's. I'll tell you plainly where that leaves me — not as a booster, but as someone with a stake in seeing this done right: the surest way to lose a genuinely useful technology is to let its worst uses go unexamined until someone bans the whole thing in disgust. Keep both columns. Add them up in daylight. That's not the enemy of public safety — it's the only version of it that lasts.

References & Sources

Superscript numbers in the text correspond to the numbered sources below. Figures are original graphics by the author, built from the cited data. Vendor-reported figures are labeled as claims, not audited facts.

1. Denver7 (KMGH), "Boulder police credit Flock license plate readers for helping officers find teen at center of AMBER Alert" (Sept 2025). denver7.com
2. WPTV (NBC), "86-year-old man with dementia rescued after going missing for nearly 4 days" (Nov 2025). Drone thermal detection; quote from Sheriff Eric Flowers. wptv.com
3. KPRC Click2Houston, "Drone thermal technology helps Sugar Land police quickly find missing man with dementia" (Nov 2025). click2houston.com
4. Flock Safety, "Six Abducted Children Recovered in Five Months in Colorado With Flock License Plate Reader Technology," press release via GlobeNewswire (Apr 2026). Vendor claim; only the Boulder case is independently confirmed. globenewswire.com
5. Flock Safety, "New Study Finds that Flock Safety is Instrumental in Solving 10% of Reported Crime in U.S.," press release via GlobeNewswire (Feb 2024). Vendor claim. globenewswire.com
6. 404 Media, "Researcher Who Oversaw Flock Surveillance Study Now Has Concerns About It" (Mar 2024). TCU's Johnny Nhan distances himself from the 10% figure. 404media.co
7. NIST, "Face Recognition Vendor Test (FRVT) Part 3: Demographic Effects," NISTIR 8280 (Dec 2019). False-positive rates vary by 10 to 100+ times across demographics; threshold-reporting critique. (NIST's program is now FRTE; top algorithms have since narrowed gaps.) nvlpubs.nist.gov
8. Georgetown Law Center on Privacy & Technology, Clare Garvie, "Garbage In, Garbage Out: Face Recognition on Flawed Data" (May 2019). law.georgetown.edu
9. ACLU, "Williams v. City of Detroit" case page and June 2024 settlement; original reporting: Kashmir Hill, "Wrongfully Accused by an Algorithm," The New York Times (June 2020). aclu.org · nytimes.com
10. ACLU, "ACLU Statement on Second Wrongful Arrest Due to Face Recognition Technology" (July 2020); Detroit Free Press coverage of Michael Oliver. aclu.org
11. Kashmir Hill, "Eight Months Pregnant and Arrested After False Facial Recognition Match," The New York Times (Aug 2023); lawsuit later dismissed, per CBS News Detroit. nytimes.com · cbsnews.com
12. NOLA.com / The Times-Picayune, "Facial recognition led to wrongful arrest; Jefferson Parish settles for $200,000" (June 2025); original reporting: Kashmir Hill, NYT (Mar 2023). Randal Quran Reid, matched via Clearview AI. nola.com
13. Kashmir Hill, "Another Arrest, and Jail Time, Due to a Bad Facial Recognition Match," The New York Times (Dec 2020). Nijeer Parks. nytimes.com
14. ACLU, "More than a Dozen Wrongful Arrests Due to Police Reliance on Facial Recognition Technology" (Apr 2026), updated with the June 2026 Dillon case. At least 15 known cases across ten states. aclu.org
15. International Association of Chiefs of Police / IJIS, "Guiding Principles for Law Enforcement's Use of Facial Recognition Technology" (Oct 2019), results for lead generation only; Detroit PD policy (Sept 2019), a match "IS NOT TO BE CONSIDERED A POSITIVE IDENTIFICATION." theiacp.org
16. ACLU, Nathan Freed Wessler, "Police Say a Simple Warning Will Prevent Face Recognition Wrongful Arrests. That's Just Not True." (Apr 2024). aclu.org
