A Calculating Look at Criminal Justice

If you’re charged with a crime in the United States and can’t pay bail, a judge will lock you up until your trial to make sure you actually appear. You can go into debt to a bond company to cover the bail, or — guilty or not — risk losing your job, home, and even your children while you wait behind bars for months.

In California, that will soon change. Beginning in October of next year, a law signed by Governor Jerry Brown will replace the bail system with a computer program that parses your background and estimates the likelihood that you will flee if released. Judges will use the resulting flight-risk and public-safety-risk “score” to decide whether to keep you jailed or let you go free while you await trial.

The new law is supposed to help remove biases in the bail system, which mostly harms poor people, and it’s part of a growing trend toward using software in the day-to-day machinery of the justice system. In the United States alone, courts in different jurisdictions already have at least 60 such programs in use, assessing, for example, the risk that someone will skip their trial or commit a crime if they’re released. Some of these algorithms are relatively simple, while others use complex combinations of data beyond criminal history, including gender, age, zip code, and parents’ criminal backgrounds, as well as information from collections agencies, social media, utility bills, camera feeds, and even call logs from pizza chains.

As the criminal justice system becomes more automated and digitized, police officers, prosecutors, and judges have increasingly massive data sets at their fingertips. The problem, as many critics have repeatedly argued, is that the algorithms that parse, interpret, and even learn from all this data may themselves be biased — both in how they are built and how the courts wield them. Judges, for example, only rely on a computer program “when they like the answer” it gives, says Margaret Dooley-Sammuli of the American Civil Liberties Union (ACLU), which, despite early support, opposed the California bill.

Preliminary data bear this out: Judges don’t always follow the algorithms’ recommendations, often detaining people despite low risk scores, according to analysts at Upturn, a Washington, D.C. nonprofit. And ongoing research — including work from the University of Texas at Austin and Stanford University that focuses on the use of algorithms in the Los Angeles Police Department and criminal courts, respectively — adds to these troubling hints of bias.

“Risk assessment tools are used at every single step of the criminal justice system,” says Angèle Christin, a Stanford sociologist, and “predictive tools build on top of each other.” This suggests that in California and beyond, these layered biases could become more difficult to observe, which would in turn make it harder to police how the criminal justice system uses the tools.

An algorithm — essentially a set of commands that tells a computer what to do — is only as good as the data it pulls from. To get a close look at police data collection at the ground level, Sarah Brayne, a sociologist at UT Austin, embedded with the LAPD — a department that, along with those in Chicago and New York, leads the way in harnessing surveillance tools, big data, and computer algorithms.

As a sociology Ph.D. student at Princeton University and later a postdoctoral researcher at Microsoft Research, Brayne shadowed police officers between 2013 and 2015 and observed them both in the precinct and on ride-alongs. This fieldwork, combined with 75 interviews, helped tease out how the department uses data in daily operations. The access was unprecedented, says Andrew Ferguson, a law professor at the University of the District of Columbia and author of the book, “The Rise of Big Data Policing: Surveillance, Race, and the Future of Law Enforcement.” “I’m sure they’ll never make that mistake again,” he adds.

Police departments’ use of predictive software falls into two broad categories. The first is place-based policing, which uses past crime data to redirect police patrols to 500-square-foot “hot spots” forecast to carry a higher crime risk; for this, the LAPD uses a program from PredPol, one of the largest predictive policing companies in the U.S. The second is person-based policing, where the police generate a ranked list of “chronic offenders” or “anchor points” — with the “hottest” individuals expected to commit the most crime. For these applications, the LAPD uses Operation Laser, based in part on software developed by Palantir Technologies, which was co-founded in 2003 by the billionaire venture capitalist and entrepreneur Peter Thiel.
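
PredPol’s and Palantir’s actual models are proprietary, but the basic logic of place-based forecasting can be sketched in a few lines: divide the city into a grid and flag the cells with the most recent reported crime. The Python snippet below is a toy illustration only; the coordinates, cell size, and number of “hot spots” are made up and bear no relation to the real systems.

```python
from collections import Counter

# Hypothetical recent crime reports: (x, y) coordinates in feet.
crime_reports = [(120, 480), (150, 430), (2300, 900), (140, 470), (2310, 950)]

CELL_SIZE = 500  # toy grid cell, in feet; not PredPol's actual geometry

def cell_for(x, y):
    """Map a coordinate to the grid cell that contains it."""
    return (x // CELL_SIZE, y // CELL_SIZE)

# Count recent reports per cell and flag the busiest cells as "hot spots".
counts = Counter(cell_for(x, y) for x, y in crime_reports)
hot_spots = [cell for cell, n in counts.most_common(3)]
print(hot_spots)  # cells that would get extra patrols in this toy model
```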

Brayne expected the LAPD to embrace the new technologies and surveillance. “I came into it thinking, information is power,” she says. But it turned out that individual officers didn’t always collect all the data. Body cameras and GPS, among other tools, could be used to monitor the cops’ own activities, and that made them nervous. For example, “all cars are equipped with automatic vehicle locators, but they weren’t turned on because they’re resisted by the police officers’ union,” Brayne says. “Police officers don’t want their sergeants to see, oh, they stopped at Starbucks for 20 minutes.” (Brayne says the locators have since been turned on, at least in the LAPD’s central bureau.)

Even when the police do collect the data, bias can still sneak in. Take Operation Laser. The system originally gave people points for things like prior arrests and for every police contact, moving them up the list. This was a flaw, says Ferguson: “Who are the police going to target when they contact the people with the most points? The ones they’ve contacted. They’ve literally created a self-fulfilling prophecy.”
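
The exact point values Laser assigned aren’t detailed here, but the feedback loop Ferguson describes can be seen in a deliberately stripped-down sketch: every police contact adds points, officers are steered toward the highest-scoring person, and that contact in turn raises the score. All numbers below are hypothetical, not the real Laser formula.

```python
# Hypothetical point values; the real Laser scoring differed and has since changed.
POINTS_PER_ARREST = 5
POINTS_PER_CONTACT = 1

people = {
    "A": {"arrests": 2, "contacts": 0},
    "B": {"arrests": 2, "contacts": 0},
}

def score(record):
    return record["arrests"] * POINTS_PER_ARREST + record["contacts"] * POINTS_PER_CONTACT

# Simulate several rounds of "contact the highest-scoring person".
for _ in range(5):
    target = max(people, key=lambda name: score(people[name]))
    people[target]["contacts"] += 1  # the contact itself raises the score

print({name: score(record) for name, record in people.items()})
# A and B started identical, but whoever gets contacted first keeps
# getting contacted -- the self-fulfilling feedback loop Ferguson describes.
```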

There are some efforts to prevent these biases, however. The LAPD is tinkering with Laser “since it turned out to be subjective and there was no consistency in what counts as a ‘quality’ contact,” says LAPD Deputy Chief Dennis Kato. “Now, we’re not going to assign points for [police] contacts at all.” The LAPD also reevaluates Laser zones every six months to decide if certain locations no longer need extra police attention. “It’s never the case that a computer spits out something and a human blindly follows it,” Kato says. “We always have humans making the decisions.”

In other cases, the ground-level data collection and how it is used remain a black box. Most risk assessment algorithms used in courts, for example, remain proprietary and are unavailable to defendants or their attorneys.

Some hints come from one publicly available software package called the Public Safety Assessment, created by the Texas-based foundation of billionaires Laura and John Arnold and used in cities and states across the country, though not in L.A. But even this level of transparency doesn’t clarify exactly which factors most affect a risk score and why, nor does it reveal what data an algorithm was trained on. In some cases, the simple fact of being 19 years old appears to weigh as much as three assault and domestic violence counts. And if single-parent households or over-policed communities factor into the risk calculation, black defendants are often disproportionately labeled as high risk.
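
The PSA’s real factors and weights aren’t reproduced in this article, so the numbers below are invented solely to illustrate the kind of weighting described above: a toy score in which youth alone contributes as much as three prior violent counts.

```python
# Entirely hypothetical weights -- not the PSA's actual formula.
def toy_risk_score(age, prior_violent_counts, failures_to_appear):
    score = 0
    if age < 21:
        score += 3                       # youth alone adds as much as ...
    score += 1 * prior_violent_counts    # ... three prior violent counts
    score += 2 * failures_to_appear
    return score

# A 19-year-old with a clean record and an older defendant with three
# assault/domestic-violence counts get identical scores in this toy model.
print(toy_risk_score(age=19, prior_violent_counts=0, failures_to_appear=0))  # 3
print(toy_risk_score(age=45, prior_violent_counts=3, failures_to_appear=0))  # 3
```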

“You have this tool that holds a mirror up to the past in order to predict the future,” says Megan Stevenson, an economist and legal scholar at George Mason University’s Antonin Scalia Law School in Arlington, Virginia. “If the past contains racial bias and histories of economic and social disadvantage that are correlated with crime,” she says, “people are concerned that they’re either going to embed or exacerbate race and class disparities.”

And if a person is labeled high-risk by an algorithm, it could follow them through pretrial and, if they are convicted, sentencing or parole.

“We were concerned because any time you’re using a generalized tool to decide something, you run the risk of a cookie-cutter approach,” says San Francisco Public Defender Jeff Adachi. “Some would argue that that’s what we’re trying to work toward in criminal justice, where everyone’s going to be treated the same, but even that statement is subjective.” (The San Francisco and L.A. District Attorney’s offices both declined interview requests.)


Between 2015 and 2016, Christin, the Stanford sociologist, conducted her own fieldwork, which included interviews with 22 judges, attorneys, probation officers, clerks, and technology developers at three randomly chosen American criminal courts in California, on the East Coast, and in the southern U.S. Christin found that while some judges and prosecutors closely followed the tools’ recommendations, others ignored them. On seeing the printed pages of a software package’s results in defendants’ files, one prosecutor told her: “I didn’t put much stock in it.” The judges she spoke to also preferred to rely on their own experience and discretion. “I think that’s interesting,” Christin says, “because it says something about how the tools can be used differently from the way that people who built them were thinking.”

(Brayne and Christin are now combining their research and preparing for submission to a peer-reviewed journal.)

When it comes to pretrial risk assessment tools like the ones that Gov. Brown plans to introduce in California, the track record is also mixed. Kentucky’s mandatory pretrial algorithms, in place since 2011, were supposed to increase efficiency by keeping the people most likely to commit crimes in jail and releasing those deemed low-risk. But the risk assessment tools didn’t deliver, according to work by Stevenson. The fraction of people detained before trial dropped by only 4 percentage points and later drifted back up. Slightly more people failed to appear for their trials, and pretrial arrests remained the same. Stevenson also points out that most judges are elected, which creates an incentive to keep people in jail: if someone they release goes on to commit a crime, there may be political blowback, while detaining a person who didn’t need to be detained is unlikely to affect the judge’s reelection.

Still, Brayne and Christin both said they expect that more data from more sources will be collected and processed automatically — and behind the scenes — in coming years. Police officers may have risk scores and maps pop up on their dashboards, while judges will have risk assessments for everyone at every step and for every kind of crime, giving the impression of precision. As it stands, however, any imprecisions or biases that point police toward you or your zip code are only likely to be amplified as one new software package is built upon the next. And current laws, including California’s bail reform, don’t provide detailed regulations or review of how police and courtroom algorithms are used.

The computer programs are moving too fast for watchdogs or practitioners to figure out how to apply them fairly, Christin says. But while the technology may appear more “objective and rational” so that “discretionary power has been curbed,” she adds, “in fact, usually it’s not. It’s just that power moves through a new place that may be less visible.”

Ramin Skibba (@raminskibba) is an astrophysicist turned science writer and freelance journalist who is based in San Diego. He has written for The Atlantic, Slate, Scientific American, Nature, and Science, among other publications.
