Analysis of Racial Disparities in the New York Police Department's Stop, Question, and Frisk Practices

Published Nov 9, 2007

by Greg Ridgeway


Research Questions

  1. Does the racial distribution of the pedestrian stops by New York Police Department officers suggest racial bias?
  2. Do certain officers disproportionately stop nonwhites?
  3. Are there racial differences after the stops?

In 2006, the New York City Police Department (NYPD) stopped a half-million pedestrians for suspected criminal involvement. Raw statistics for these encounters suggest large racial disparities: 89 percent of the stops involved nonwhites. Do these statistics point to racial bias in police officers’ decisions to stop particular pedestrians? Do they indicate that officers are particularly intrusive when stopping nonwhites? The NYPD asked the RAND Center on Quality Policing (CQP) to help it understand this issue and identify recommendations for addressing potential problems. CQP researchers analyzed data on all street encounters between NYPD officers and pedestrians in 2006. First, they compared the racial distribution of stops to external benchmarks, which attempt to estimate what the racial distribution of the stopped pedestrians would have been if officers’ stop decisions had been racially unbiased. Then they compared each officer’s stopping patterns with an internal benchmark constructed from stops made by other officers in similar circumstances. Finally, they examined stop outcomes, assessing whether stopped white and nonwhite suspects had different rates of frisk, search, use of force, and arrest. They found small racial differences in these rates and made communication, recordkeeping, and training recommendations to the NYPD for improving police-pedestrian interactions.

Key Findings

External Benchmark Analyses

  • Evaluating racial disparities in pedestrian stops using external benchmarks is highly sensitive to the choice of benchmark. Therefore, analyses based on any of the external benchmarks developed to date are questionable.
  • Benchmarks based on crime-suspect descriptions may provide a good measure of the rates of participation in certain types of crimes by race, but they are valid benchmarks only if suspects, regardless of race, are equally exposed to police officers.
  • We found that black pedestrians were stopped at a rate that is 20 to 30 percent lower than their representation in crime-suspect descriptions. Hispanic pedestrians were stopped disproportionately more, by 5 to 10 percent, than their representation among crime-suspect descriptions would predict.
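
The benchmark comparisons above can be sketched as a relative gap between a group's share of stops and its share of a benchmark. The Python sketch below is illustrative only: the function name `relative_disparity` and every input share are hypothetical and are not figures from the report. It also assumes the percentages are read as relative differences, which this page does not spell out. Running the same stop share against two different benchmarks shows how the sign of the result can depend on the benchmark chosen.

```python
def relative_disparity(stop_share: float, benchmark_share: float) -> float:
    """Relative gap between a group's share of stops and its benchmark share.

    Negative values mean the group is stopped less often than the benchmark
    would predict; positive values mean more often.
    """
    if not 0.0 < benchmark_share <= 1.0:
        raise ValueError("benchmark_share must be in (0, 1]")
    return (stop_share - benchmark_share) / benchmark_share


# Hypothetical inputs, chosen only to illustrate the calculation.
stop_share = 0.45
benchmarks = {
    "hypothetical benchmark A": 0.60,
    "hypothetical benchmark B": 0.30,
}

for name, share in benchmarks.items():
    print(f"{name}: {relative_disparity(stop_share, share):+.0%}")
```

With these made-up inputs, the same group appears under-stopped against one benchmark and over-stopped against the other, which is the sensitivity the first finding describes.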

Internal Benchmark Analyses

  • We compared the racial distribution of each officer's stops to a benchmark racial distribution constructed to match the officer's stops on time, place, and several other stop features.
  • This analysis identified 15 officers who stopped more blacks and Hispanics than their colleagues, while 14 officers stopped fewer. This means 0.5 percent of the 2,756 NYPD officers most active in pedestrian-stop activity were flagged as having stop patterns warranting further investigation. Those 2,756 most active officers accounted for 54 percent of the total number of 2006 stops. The remaining stops were made by another 15,855 officers, for whom an accurate internal benchmark could not be constructed, mostly because they conducted too few stops.
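
The idea of an internal benchmark can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the report's statistical procedure: it assumes stops can be matched exactly on a hypothetical (place, time) cell, whereas the report matches on time, place, and several other stop features by a method this page does not describe. The record layout, the function name `internal_benchmark`, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

# Each stop is (officer_id, place, time_bucket, is_black_or_hispanic).
# The layout and every record below are hypothetical.
Stop = tuple[str, str, str, bool]


def internal_benchmark(stops: list[Stop], officer_id: str) -> tuple[float, float]:
    """Return (officer_share, benchmark_share) over comparable stops.

    benchmark_share is the colleagues' rate, reweighted so that their stops
    have the same (place, time) mix as the officer's own stops.
    """
    # Colleagues' outcomes grouped by (place, time) cell.
    peers_by_cell: dict[tuple[str, str], list[bool]] = defaultdict(list)
    for officer, place, time_bucket, flag in stops:
        if officer != officer_id:
            peers_by_cell[(place, time_bucket)].append(flag)

    # Keep only the officer's stops that have at least one comparable peer stop.
    own = [
        (place, time_bucket, flag)
        for officer, place, time_bucket, flag in stops
        if officer == officer_id and (place, time_bucket) in peers_by_cell
    ]
    if not own:
        raise ValueError("no comparable stops; a benchmark cannot be constructed")

    officer_share = sum(flag for _, _, flag in own) / len(own)

    # Weight each cell's peer rate by how many of the officer's stops fall in it.
    cell_counts = Counter((place, time_bucket) for place, time_bucket, _ in own)
    benchmark_share = sum(
        n * sum(peers_by_cell[cell]) / len(peers_by_cell[cell])
        for cell, n in cell_counts.items()
    ) / len(own)
    return officer_share, benchmark_share


stops = [
    ("A", "P1", "night", True),
    ("A", "P1", "night", True),
    ("A", "P2", "day", False),
    ("A", "P2", "day", True),
    ("B", "P1", "night", True),
    ("C", "P1", "night", False),
    ("B", "P2", "day", False),
    ("C", "P2", "day", True),
]
officer_share, benchmark_share = internal_benchmark(stops, "A")
print(officer_share, benchmark_share)  # prints: 0.75 0.5
```

The sketch raises an error when an officer has no comparable stops, mirroring the finding that a benchmark could not be constructed for officers with too few stops. Deciding when a gap between the two shares is large enough to flag an officer is a separate statistical question that this sketch does not address.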

Post-Stop Outcomes

Differences in stop outcomes experienced by different race groups are substantially reduced when we ensure that comparisons of outcomes are made only between stops that are truly comparable in terms of, for instance, the time of day and location of the stop.

  • Officers frisked white suspects slightly less frequently than they did similarly situated nonwhites (29 percent of stops versus 33 percent of stops).
  • Black suspects are slightly more likely to have been frisked than white suspects stopped in circumstances similar to the black suspects (46 percent versus 42 percent).
  • The rates of searches were nearly equal across racial groups, at between 6 and 7 percent. However, in Staten Island, the rate of searching nonwhite suspects was significantly greater than that of searching white suspects.
  • White suspects were slightly more likely to be issued a summons than were similarly situated nonwhite suspects (5.7 percent versus 5.2 percent). On the other hand, arrest rates for white suspects were slightly lower than those for similarly situated nonwhites (4.8 percent versus 5.1 percent).
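
To put the reported gaps on a common footing, the short Python sketch below restates the rates quoted in the bullets above as percentage-point differences and as ratios. It uses only the figures given on this page; the dictionary labels and the helper name `gap` are illustrative choices, not terms from the report.

```python
# (first group %, comparison group %), taken from the findings above.
reported = {
    "frisk: white vs. similarly situated nonwhite": (29.0, 33.0),
    "frisk: black vs. similarly situated white": (46.0, 42.0),
    "summons: white vs. similarly situated nonwhite": (5.7, 5.2),
    "arrest: white vs. similarly situated nonwhite": (4.8, 5.1),
}


def gap(first: float, second: float) -> tuple[float, float]:
    """Return (percentage-point difference, ratio), rounded for display."""
    return round(first - second, 1), round(first / second, 2)


for label, (first, second) in reported.items():
    points, ratio = gap(first, second)
    print(f"{label}: {points:+.1f} points, ratio {ratio:.2f}")
```

Each pair compares a group with its own matched comparison set, so the two frisk rows are separate comparisons and their rates are not expected to line up with each other.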


Recommendations

The NYPD should review the boroughs with the largest racial disparities in stop outcomes.

The NYPD should identify, flag, and investigate officers with unusual stop patterns.

All officers should explain to pedestrians why they are being stopped.

New officers should be fully conversant with stop-question-frisk (SQF) documentation.

The UF250 form, which must be filled out after an SQF, should be revised to capture data on use of force.

The NYPD should consider modifying the audits of the UF250 form.


Does the New York City Police Department (NYPD) stop pedestrians of some races disproportionately? Questions like this one have been asked about police departments in many cities. Inevitably, the question raises a second one: disproportionately compared to what? Over the years, researchers have proposed a range of standards against which to compare the race distribution of police stops. For instance, investigators have argued that with race-neutral policing the race distribution of police stops should match that of the residential population in the city, or the race of arrestees, or that of crime suspects. We refer to the comparison of the race distribution of police stops to other standards as an approach to police performance evaluation using "external benchmarks." In Analysis of Racial Disparities in the New York Police Department's Stop, Question and Frisk Practices (RAND Corporation: Santa Monica, California, 2007) by Dr. Greg Ridgeway (the "RAND report"), we highlight problems with the use of external benchmarks, and propose a powerful alternative approach for analyzing bias in police performance that uses what we call an "internal benchmark."

Where did RAND get the data to evaluate race bias in pedestrian stops conducted by the NYPD?

In 2007, NYPD provided RAND with data on citywide crime for the years 2005 and 2006. These data were used to construct the external benchmarks against which we compared the race distribution of pedestrians stopped by NYPD officers in 2006.

Is there a problem with the data NYPD provided to RAND?

On May 3, 2013, NYPD informed RAND that the data it had provided on 2005 and 2006 violent crime suspects contained errors. NYPD then provided RAND with corrected data. Most data on the race of violent crime suspects showed only minor differences from the data originally provided to RAND, differences that NYPD attributed to routine updates and audits. However, data in the "Other" race category were significantly different from those originally provided to RAND in 2007.

Do the errors in the original dataset invalidate the external benchmark analyses RAND described in the RAND report?

No. The erroneous data were not used in the analyses reported in the RAND report. As noted in the report, the three external benchmark analyses that made use of violent crime suspect descriptions "used data only on black, Hispanic, and white suspects, since other racial groups had counts that were too small for statistical analysis" (RAND report, p. 17). Thus, when the proportions of black, Hispanic and white crime suspects were compared to the race distribution of pedestrians stopped on the street, both datasets included only those cases that had been coded by NYPD as black, white, or Hispanic. The "Other" race category was not included as part of the population on which proportions were calculated.
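
The reasoning here is arithmetic: if proportions are computed only over black, Hispanic, and white cases, a change in the "Other" count cannot move them. The Python sketch below illustrates this with made-up counts; the numbers and the helper name `shares_within` are hypothetical and do not come from the NYPD data.

```python
def shares_within(counts: dict[str, int], groups: tuple[str, ...]) -> dict[str, float]:
    """Proportion of each group, using only the listed groups as the denominator."""
    total = sum(counts[g] for g in groups)
    if total == 0:
        raise ValueError("no cases in the selected groups")
    return {g: counts[g] / total for g in groups}


groups = ("black", "Hispanic", "white")

# Hypothetical suspect counts; only the "Other" category differs between them.
counts_as_first_provided = {"black": 600, "Hispanic": 250, "white": 100, "Other": 50}
counts_after_correction = {**counts_as_first_provided, "Other": 5}

before = shares_within(counts_as_first_provided, groups)
after = shares_within(counts_after_correction, groups)
print(before == after)  # prints: True
```

The sketch isolates the effect of the "Other" category only. Small revisions to the counts for the three included groups would change the proportions slightly.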

Why did RAND exclude the "Other" and "Asian" race categories from its analyses? Was it because of the errors in the data NYPD sent RAND?

RAND excluded the "Other" and "Asian" race categories not because of the citywide crime data provided by the NYPD, but because the number of pedestrian stops for these race groups was too small to support reliable statistical analyses. "Other" race was excluded from all external benchmark analyses, including those, such as the census benchmark analyses, that made no use of crime suspect data or any other citywide crime data.

Does exclusion of some race categories from the benchmark analysis compromise the validity of the external benchmarks for whites, blacks and Hispanics?

No. Using just the three largest race groups allows for very precise estimates of differences in the relative race distributions of pedestrian stops and comparison benchmark data. Eliminating small race categories simplifies the model and improves its fit with the data while focusing on comparisons that can be made with meaningful levels of precision. The principal shortcoming of our approach is that the results identify the representation of racial groups in the stop and benchmark data for all blacks, Hispanics and whites, rather than for the broader reference group of all race groups. Since blacks, Hispanics and whites make up 97% of the stop data, and "Other" and "Asian" race groups are too small in the stop data to estimate benchmark comparisons with precision, this narrower frame of reference does not compromise our analyses.

Were there other errors in the data NYPD provided to RAND?

The only errors that NYPD reported to RAND concerned the crime suspect description data for 2005 and 2006. In addition to the significant correction made to the "Other" race category, NYPD made minor updates to the data for other race categories. These minor updates change the race distribution of suspects RAND used in its analyses only slightly (no change is larger than 0.21%). As such, they would not meaningfully alter the findings described in the RAND report.

Should people rely on the external benchmarks reported in the RAND report to determine if NYPD was engaged in race-neutral pedestrian stops in 2006?

No. External benchmarks are, as emphasized in the RAND report, "fraught with challenges" (RAND report, p. 19). Indeed, a central objective of the external benchmark comparisons offered in the report was to show how poorly external benchmarks serve as a measure of race-neutral policing. As the conclusions to that section of the RAND report state: "Importantly, this chapter has shown that the conclusions from external benchmarking are highly sensitive to the choice of benchmark. In other words, the results of any analysis using external benchmarks may vary drastically depending on which benchmark is used" (RAND report, p. 19). Instead, the report goes on to present a novel "internal" benchmark approach that compares the behavior of individual officers to others with similar responsibilities. The report argues that the internal benchmarking method avoids many of the pitfalls highlighted in the use of external benchmarks.

The research described in this report was supported by the New York City Police Foundation and was conducted under the auspices of the Center on Quality Policing (CQP), part of the Safety and Justice Program within RAND Infrastructure, Safety, and Environment (ISE).

This report is part of the RAND technical report series. RAND technical reports may include research findings on a specific topic that is limited in scope or intended for a narrow audience; present discussions of the methodology employed in research; provide literature reviews, survey instruments, modeling exercises, guidelines for practitioners and research professionals, and supporting documentation; or deliver preliminary findings. All RAND reports undergo rigorous peer review to ensure that they meet high standards for research quality and objectivity.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.