Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice

Source: Rashida Richardson, Jason Schultz, Kate Crawford, New York University Law Review Online, Forthcoming, February 13, 2019

From the abstract:
Law enforcement agencies are increasingly using algorithmic predictive policing systems to forecast criminal activity and allocate police resources. Yet in numerous jurisdictions, these systems are built on data produced within the context of flawed, racially fraught and sometimes unlawful practices (‘dirty policing’). This can include systemic data manipulation, falsifying police reports, unlawful use of force, planted evidence, and unconstitutional searches. These policing practices shape the environment and the methodology by which data is created, which leads to inaccuracies, skews, and forms of systemic bias embedded in the data (‘dirty data’). Predictive policing systems informed by such data cannot escape the legacy of unlawful or biased policing practices that they are built on. Nor do claims by predictive policing vendors that these systems provide greater objectivity, transparency, or accountability hold up. While some systems offer the ability to see the algorithms used and even occasionally access to the data itself, there is no evidence to suggest that vendors independently or adequately assess the impact that unlawful and bias policing practices have on their systems, or otherwise assess how broader societal biases may affect their systems.

In our research, we examine the implications of using dirty data with predictive policing, and look at jurisdictions that (1) have utilized predictive policing systems and (2) have done so while under government commission investigations or federal court monitored settlements, consent decrees, or memoranda of agreement stemming from corrupt, racially biased, or otherwise illegal policing practices. In particular, we examine the link between unlawful and biased police practices and the data used to train or implement these systems across thirteen case studies. We highlight three of these: (1) Chicago, an example of where dirty data was ingested directly into the city’s predictive system; (2) New Orleans, an example where the extensive evidence of dirty policing practices suggests an extremely high risk that dirty data was or will be used in any predictive policing application, and (3) Maricopa County where despite extensive evidence of dirty policing practices, lack of transparency and public accountability surrounding predictive policing inhibits the public from assessing the risks of dirty data within such systems. The implications of these findings have widespread ramifications for predictive policing writ large. Deploying predictive policing systems in jurisdictions with extensive histories of unlawful police practices presents elevated risks that dirty data will lead to flawed, biased, and unlawful predictions which in turn risk perpetuating additional harm via feedback loops throughout the criminal justice system. Thus, for any jurisdiction where police have been found to engage in such practices, the use of predictive policing in any context must be treated with skepticism and mechanisms for the public to examine and reject such systems are imperative.