The United States is currently going through a period of renewed attention to questions of racial justice in policing. Public accusations of “driving while black” have prompted state and local governments to mandate the collection of systematic data to assess whether racial disparities in policing are as pervasive as critics have suggested and, if they are, what drives them. Using quantitative data, researchers have shown that white and black drivers see different outcomes once stopped by the police (Baumgartner, Epp, and Shoub 2018; Epp, Maynard-Moody, and Haider-Markel 2014; Gelman, Fagan, and Kiss 2007; Peffley and Hurwitz 2010; Weitzer and Tuch 2006) and are treated differently by officers (Voigt et al. 2017). Building from these studies, researchers have in turn shifted their attention to examining what may be related to and possibly causing these disparities. This research has shown that the disparate treatment depends on the intersectional characteristics of the driver (Christiani 2020; Fagan and Davies 2000; Fagan and Geller 2015), characteristics of the surrounding area (Dollar 2014; King and Wheelock 2007; Smith 1986), local descriptive representation (Baumgartner, Epp, and Shoub 2018; Eckhouse 2019), departmental policy (Baumgartner, Epp, and Shoub 2018; Mummolo 2018a, 2018b), and officer characteristics (Baumgartner et al. 2020; Theobald and Haider-Markel 2009).
The consequences of these disparities are numerous. By definition, they disproportionately expose black drivers to police contact, which is inconvenient at best and physically harmful at worst. More broadly, negative interactions with the criminal justice system politically demobilize the public and decrease the legitimacy of the system and government in the eyes of the public (Gibson and Nelson 2018; Lerman and Weaver 2014; Mondak et al. 2017; Tyler and Jackson 2014; Walker 2014; Weaver and Lerman 2010; White 2019).
This article builds on previous work examining what is linked to disparate outcomes by focusing on two facets of the use and purpose of traffic stops that have been frequently noted but have gone understudied. First, traffic stops are used both to ensure and increase road safety (a safety stop) and as a supplemental investigative, crime-fighting tool (an investigatory stop) (Baumgartner, Epp, and Shoub 2018; Epp, Maynard-Moody, and Haider-Markel 2014). Second, traffic stops afford an officer a degree of discretion in what will transpire and even more discretion in the specific case of who is searched (Glaser, Spencer, and Charbonneau 2014). In this study, we ask whether stop purpose interacts with and amplifies other relationships already documented in instances where officers have a great deal of discretion (i.e., in whom to search).
We argue that safety stops tend to be associated with less discretion on the officer's part regarding who should be searched, as these stops are often driven by a straightforward interaction where an officer observes an infraction and seeks to issue a ticket and move on. Investigatory stops, on the other hand, are associated with a higher level of discretion. In these stops, officers look to investigate potential criminality and use the stop as a reason to gain more information. The differential degree of discretion in each circumstance is important because, in low-information but high-discretion situations, individuals often rely on implicit biases and/or institutional training that inculcates criminal profiles to supplement decision making. As this relates to policing, we expect that officers might use personally held or institutionally taught “memes of suspicion” to make decisions in a given interaction (Fagan and Geller 2015). One of the most ingrained tropes guiding policing decisions is likely that of the young black male as criminal (Anderson 2010; Correll et al. 2002; Eberhardt et al. 2004; Eberhardt 2019; Sagar and Schofield 1980; Todd et al. 2016). Taken together, we expect that the role of driver race in the subsequent interaction is likely amplified when officers have more discretion.
To evaluate this expectation, we use publicly available records on individual traffic stops in CT, IL, MD, and NC, with coverage beginning in 1999 or later depending on the state, which amounts to more than 40 million individual stop records. With such a large database, we can assess whether identified disparities are amplified when the traffic stop is likely investigatory in nature and officer discretion is therefore assumed to be higher. Additionally, we account for as many additional factors that predict a search as possible, based on the information collected by each state. We find that: (1) on average, when a stop has an investigatory purpose, race plays a larger role; (2) on average, when a search follows an investigatory stop, black male drivers are less likely to be found with contraband than white male drivers; (3) on average, men are much more likely than women to be searched; and (4) the substantive relationship between race, gender, and stop purpose varies across the states.
By carefully looking at the intersection of race and the initial stop purpose, we highlight some dynamics underlying and linked to the “driving while black” phenomenon: the influence of the race of the driver on an officer's likelihood of initiating a search is amplified during investigatory—rather than safety—stops. In showing this, we add to the literature on race, policing, and policy by highlighting a specific aspect of a stop that could be altered by departmental policy. While it is impossible to eliminate all biases at work, it may be possible to limit discretion or narrow the use of traffic stops to better constrain the impact that these biases have on policing outcomes.
Using traffic stops to fight the war on crime
The politics and the practice of policing changed after the 1960s with the development of more “proactive” policing strategies (see Vitale 2018) and the politically popular “tough on crime” approach. Police agencies were encouraged to use traffic stops as a tool to fight the war on crime, and more particularly the war on drugs. Traffic stops were seen as a useful tool because they account for the majority of all police-civilian encounters (Epp, Maynard-Moody, and Haider-Markel 2014), making them a prime candidate for an expansion in the number of interactions where police officers could proactively search for crime or drugs. Because drivers routinely violate some aspect of the traffic or vehicle codes, officers often have the legal right to pull them over. During that routine interaction, the officer has an opportunity to converse with the driver, run their plates and license numbers through a computer search and, with consent or after developing probable cause, search the vehicle and/or motorist (Epp, Maynard-Moody, and Haider-Markel 2014; Remsberg 1995; Tyler, Jackson, and Mentovich 2015).
The strategy of using traffic stops to fight the war on crime implies that the value of the traffic stop is not to keep the roads safe, but to find criminals and arrest them (or let them know that the police are watching closely). But, of course, some traffic stops are just what they seem: a high reading on a radar gun or an observation of a driver running a red light or a stop sign. These “traffic safety” stops must therefore be distinguished from “investigatory” stops, those used as a pretext for a conversation and possibly more action (Epp, Maynard-Moody, and Haider-Markel 2014). Police leaders recognize that the vast majority of these pretextual traffic stops will come up fruitless: “It is a numbers game” is how one highway patrol officer explained it; “you have to kiss a lot of frogs before you find your prince” (see Webb 2007). But, the reasoning went, if patrols slightly delayed and inconvenienced a thousand not-quite-innocent drivers (after all, they had broken some law, such as having an expired registration tag or a broken tail light) in order to find a few drivers with illicit drugs or other contraband, the gain in public safety was worth the slight inconvenience. This high-contact mode of policing gained a firm legal footing in 1996 when, in Whren v. United States, the Supreme Court ruled that officers could selectively enforce traffic laws, stopping only some rule-breakers and letting others go unimpeded.
Critics of the Whren ruling have suggested that it gives the police an open pass to profile drivers based on their race or other characteristics. Given how widespread biases are and the fact that many police agencies train officers to seek out people based on a criminal profile (Epp, Maynard-Moody, and Haider-Markel 2014), it seems plausible that the police would be more likely to stop motorists fitting a stereotypical criminal profile (e.g., a black or Latino male). Indeed, study after study has documented that black and Latino drivers are substantially more likely to be searched or arrested following a traffic stop than white drivers and that they are frequently pulled over at rates that far exceed their share of the population (Baumgartner, Epp, Shoub, and Love 2017; Baumgartner et al. 2017; Burch 2013; Epp et al. 2014; Fagan and Davies 2000; Gelman, Fagan, and Kiss 2007; Harcourt 2003; Lerman and Weaver 2014; Moore 2015; Peffley and Hurwitz 2010; Petrocelli, Piquero, and Smith 2003; Pierson et al. 2017; Tillyer and Engel 2013; Tillyer, Klahm, and Engel 2012; Tomaskovic-Devey, Mason, and Zingraff 2004).
Similarly, many of these studies have also identified a gendered component to stops and searches following a stop: male drivers are much more likely to be searched than female drivers (e.g., Baumgartner, Epp, and Shoub 2018; Epp, Maynard-Moody, and Haider-Markel 2014). Female drivers are generally not seen as suspicious, but their racial group membership conditions this perception: black and Latina women are seen as more suspicious, and are thus more likely to experience a search, than their white female counterparts (Christiani 2020). As such, it is important to consider the effect of gender and its interaction with race in policing.
A key element in the use of traffic stops for investigatory purposes that many have pointed to as one reason for these disparate outcomes is the high level of discretion afforded to the police officer in determining which drivers might be worth investigating. In such ambiguous, low-information and high-discretion situations, cultural stereotypes can lead to predictable differences in behavior through implicit bias (see Anderson 2010; Correll et al. 2002; Eberhardt 2019; Eberhardt et al. 2004; Payne 2001, 2006; Sagar and Schofield 1980) and through institutionally taught and enforced beliefs about who is likely to be criminal (Epp, Maynard-Moody, and Haider-Markel 2014; Fagan and Geller 2015). Either possibility leads to a disproportionate focus on black and Latino male drivers by law enforcement. Moreover, previous research has shown that on average black drivers are less likely to be found with contraband than comparable white drivers. This combination of higher search rates with lower contraband hit rates suggests an “over-targeting” of minority drivers (see Ayres 2002; Becker 1957, 1993; Glaser et al. 2014; Goel, Rao, and Shroff 2016; Knowles, Persico, and Todd 2001).
Hypotheses
If traffic safety stops are more commonly just what they seem, and investigatory stops are more commonly used as a pretext for investigation based on a generalized suspicion, then we should see more targeting of potential “criminals” in traffic stops with an investigatory purpose. When an officer pulls over a driver for a safety reason, race will play less of a role because the officer's suspicion is not heightened at the outset: the goal of the stop is not to search for underlying suspicious behavior but to stop the dangerous driving behavior. However, when an officer pulls over a driver in order to investigate them further, the officer's suspicion has by definition already been heightened, even before the stop. Once the stop has been initiated, the officer's goal is to determine if the driver merits further investigation, such as a search. Here, biases and practice may lead to differential rates of search. Given that racial stereotypes may be motivating the decision to search black drivers during investigatory stops, we expect a higher degree of targeting of black drivers following investigatory stops, compared to safety stops. This over-searching of black drivers is the result of stereotype-driven decision making, not good policing practice. Correspondingly, we expect that the contraband hit rates will be lower when black drivers are searched following investigatory stops, as the search is less likely to be based on justifiable suspicion.
These ideas are the basis for the following hypotheses, which we test in subsequent analyses. We apply each set of hypotheses to male and female drivers, separately.
Hypothesis 1 (H1): The probability of search will be higher when the driver is black, compared to white.
Hypothesis 2 (H2): The probability of search will be higher when the stop involves an investigation, compared to a traffic safety stop.
Hypothesis 3 (H3): Racial disparities in search rates will be higher for drivers subject to an investigatory stop than those subjected to a safety stop.
Hypothesis 4 (H4): Black drivers will be less likely than white drivers to be found carrying contraband, and this relationship will be larger for investigatory stops than safety stops.
As noted previously, scholarship has repeatedly demonstrated that police scrutiny is concentrated on male drivers. This fits with prevailing criminal stereotypes, which center on men, and young men of color in particular. However, not all women are equally free from scrutiny: black and Latina women are more likely to experience a search than white women.Footnote 1 We therefore test each hypothesis by gender, looking at male and female drivers separately.
Data and methods
Many law enforcement agencies across the country make some basic traffic stop data available, but four states mandate the collection and public availability of detailed contextual information about each traffic stop from (almost) every police agency, not only the highway patrol: CT, IL, MD, and NC.Footnote 2 While these four states are from different regions of the country and differ in their socioeconomic and racial makeup, they are only four of the 50 states, so some caution regarding the generalizability of subsequent results is warranted. Across the states, the information collected after each traffic stop varies, but always includes the race and gender of the driver stopped, the reason for the stop, the outcome of the stop, whether or not a search occurred, and whether contraband was found.
It is important to note that we only include data regarding the driver. Both IL and NC occasionally collect some data on passenger searches, but only when a search is conducted. This leaves no information about passengers present but not searched. Similarly, we omit checkpoint stops from NC (the only state where these stops are included), because officers are required to record only those drivers passing through a checkpoint who were searched. Furthermore, where possible (in NC and IL) we omit nondiscretionary searches (those coded as incident to arrest), as these searches are procedural and do not fit our theory of officer discretion. Table 1 shows the number of police agencies in our dataset, the number of stops and searches, and the percent of drivers who are searched. See Appendix A for more details on data that are not used in our analyses.
Note: Table includes observations for black and white drivers only.
The primary dependent variable is whether a driver was searched after being pulled over. As is clear in Table 1, searches are relatively rare across the 1,675 agencies in our dataset. Looking at all drivers together, the total search rate in a state is <5% for men and 3% for women. However, the last two columns show that black and white drivers are subject to searches at vastly different rates. For example, black drivers are searched at a little over 3 times the rate of white drivers in IL.
Table 2 provides information on contraband hit rates. For all the searches identified in Table 1, it shows the number yielding contraband and the “hit rate,” or percent of searches leading to contraband. A major takeaway is that searches do not typically yield contraband; indeed, the “hit rate” is only about 26%. The table also shows differences by race: searches of black drivers are slightly less likely to yield contraband in every state for both male and female drivers.
Note: Table includes observations for black and white drivers only.
To test our hypotheses, we conduct logistic regressions predicting, separately, (1) if a driver was searched and (2) if contraband was found. The key independent variables are driver race, stop purpose, and their interaction. The race variable is categorical with values for white and black; all other races and ethnicities are excluded. White is the baseline racial category and, as a result, the coefficient for black drivers captures how a black driver's likelihood of being searched differs from that of a white driver (in safety stops, given the interaction with stop purpose).
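As a sketch, the search model described above can be written as follows. The notation is assumed here for illustration and is not taken from the original: $\text{Black}_i$ and $\text{Invest}_i$ are indicators for a black driver and an investigatory stop, and $\mathbf{x}_i^{\top}\boldsymbol{\gamma}$ stands in for whatever additional terms a given state's model includes.

```latex
\Pr(\text{Search}_i = 1)
  = \operatorname{logit}^{-1}\!\bigl(
      \beta_0
      + \beta_1\,\text{Black}_i
      + \beta_2\,\text{Invest}_i
      + \beta_3\,(\text{Black}_i \times \text{Invest}_i)
      + \mathbf{x}_i^{\top}\boldsymbol{\gamma}
    \bigr),
\qquad
\operatorname{logit}^{-1}(z) = \frac{1}{1 + e^{-z}}
```

With white drivers and safety stops as the baseline, $\beta_1$ is the black-white difference in the log-odds of a search within safety stops, $\beta_2$ is the investigatory-safety difference for white drivers, and $\beta_3$ is the additional shift for black drivers in investigatory stops. The contraband model would take the same form, estimated on searched drivers only.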
Next, we generate a binary stop type variable (either safety or investigatory) from the list of possible stop purposes used by each state. In each state, officers are asked to pick from a list of possible reasons for making a stop. In the datasets for each state, we take this information and then group those stop purposes as either safety-related or investigatory. This classification is informed by the distinctions drawn by Epp, Maynard-Moody, and Haider-Markel (2014) between safety and investigatory stops, which they developed through surveys and interviews with citizens of Kansas City and in-depth study of policing in that city. They found that police officers were much more likely to use regulatory infractions as a basis for investigatory stops, as opposed to stops for purposes such as speeding or running a red light, which they reasoned were more directly related to promoting traffic safety. Our classifications follow the same basic logic and are shown in Table 3. Of course, these classifications are only approximations, as we have no way of knowing what precise motivations an officer had for making any particular traffic stop. In turn, this means that whatever pattern is detected will likely underestimate the substantive relationship.
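The grouping step might be sketched as below. The purpose labels are hypothetical placeholders chosen to echo the examples in the text (speeding and red lights as safety-related, regulatory infractions as investigatory); the actual state-specific lists and their grouping are those reported in Table 3, not the ones shown here.

```python
from typing import Optional

# Hypothetical purpose labels for illustration only; each state's actual
# list of stop purposes and its grouping are given in Table 3.
SAFETY_PURPOSES = {"speeding", "red light", "stop sign"}
INVESTIGATORY_PURPOSES = {"expired registration", "equipment violation"}


def classify_stop(purpose: str) -> Optional[str]:
    """Map a recorded stop purpose to 'safety' or 'investigatory'.

    Returns None for purposes outside both groups so that unclassified
    stops can be inspected rather than silently assigned.
    """
    key = purpose.strip().lower()
    if key in SAFETY_PURPOSES:
        return "safety"
    if key in INVESTIGATORY_PURPOSES:
        return "investigatory"
    return None


def investigatory_indicator(purpose: str) -> Optional[int]:
    """Binary stop-type variable: 1 = investigatory, 0 = safety."""
    label = classify_stop(purpose)
    if label is None:
        return None
    return int(label == "investigatory")
```

Returning `None` for unlisted purposes is a design choice of this sketch; it keeps ambiguous categories visible instead of forcing them into one of the two groups.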
In addition to driver race and stop type, we include a number of control variables. Three states make available a variable (anonymously) identifying the officer who made the traffic stop. We generate a “high disparity officer” variable coded as 1 if the officer has: (a) at least 50 stops of white drivers; (b) at least 50 stops of black drivers; (c) an overall search rate higher than the average for their agency; and (d) a rate of search for black drivers at least twice that of white drivers. This allows for a conservative test of the hypothesis and common claim that disparities are due to “bad apple” officers. When this counterpoint or explanation is raised, those proposing it either implicitly or explicitly assume that “bad apples” are rare. However, a descriptive look at the data belies this point: for example, one-third of all officers in NC are identified as such. For our analysis, this means that any detected relationships exist in the face of a conservative definition of and control for “bad apples.”Footnote 3
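A minimal sketch of criteria (a) through (d) follows. Several details are assumptions of this illustration rather than statements about the original coding: the record layout (a tuple of officer, agency, race, and search flag), the reading of "average for their agency" as the pooled search rate across that agency's stops of black and white drivers, and each officer belonging to a single agency.

```python
from collections import defaultdict


def flag_high_disparity_officers(stops, min_stops=50, ratio=2.0):
    """Return {officer_id: 0 or 1} following criteria (a)-(d).

    stops: iterable of (officer_id, agency_id, race, searched) tuples,
    where race is 'white' or 'black' and searched is truthy or falsy.
    """
    # officer_id -> race -> [stops, searches]
    officer = defaultdict(lambda: {"white": [0, 0], "black": [0, 0]})
    agency = defaultdict(lambda: [0, 0])  # agency_id -> [stops, searches]
    officer_agency = {}

    for officer_id, agency_id, race, searched in stops:
        if race not in ("white", "black"):
            continue  # the analysis covers black and white drivers only
        hit = int(bool(searched))
        officer[officer_id][race][0] += 1
        officer[officer_id][race][1] += hit
        agency[agency_id][0] += 1
        agency[agency_id][1] += hit
        officer_agency[officer_id] = agency_id  # assumes one agency per officer

    flags = {}
    for officer_id, counts in officer.items():
        w_stops, w_searches = counts["white"]
        b_stops, b_searches = counts["black"]
        # (a) and (b): enough stops of each group
        if w_stops < min_stops or b_stops < min_stops:
            flags[officer_id] = 0
            continue
        a_stops, a_searches = agency[officer_agency[officer_id]]
        overall_rate = (w_searches + b_searches) / (w_stops + b_stops)
        agency_rate = a_searches / a_stops  # assumed: pooled agency rate
        white_rate = w_searches / w_stops
        black_rate = b_searches / b_stops
        # (c) above the agency average and (d) black rate at least twice white rate
        flags[officer_id] = int(
            overall_rate > agency_rate and black_rate >= ratio * white_rate
        )
    return flags
```

Because criterion (c) builds an above-average search rate into the definition, a flagged officer is more likely to search by construction; the variable is informative mainly through what happens to the race coefficients once it is included.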
Data from IL include a variable for the age of the vehicle (or rather, model year, from which we calculate the age of the vehicle based on the date of the stop). Since wealthier people may replace their cars more often, we include vehicle age as a proxy for economic status. Therefore, if a race effect persists after controlling for vehicle age, it reflects the effect of race above and beyond that of economic status. If drivers are more likely to be searched when they are driving late at night or on the weekends, these effects will be captured by the control variables for day of week and time of day.
There is significant variation in search rates by police agency: officers from some agencies search at much higher rates than others. We therefore include agency fixed effects. This requires dropping agencies with relatively low numbers of stops, as these provide too little information for the models to reliably estimate the fixed effects (i.e., the models will not converge).Footnote 4 In NC, we set this threshold at 10,000 stops, dropping 199 of 343 agencies but only 2.6% of the total observations. In IL, this threshold was similarly set to 10,000 stops, dropping 729 of 1,130 agencies and 6.5% of the total observations.Footnote 5 To ensure our results are not dependent on these stop thresholds, we conduct a robustness check in the online appendix which does not use these thresholds.
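The thresholding step can be illustrated with a short sketch. The function name and input layout are hypothetical, and treating the threshold as "at least 10,000 stops" (rather than strictly more) is an assumption of this example.

```python
from collections import Counter


def apply_stop_threshold(agency_ids, threshold=10_000):
    """Keep agencies with at least `threshold` stops.

    agency_ids: one agency identifier per stop record.
    Returns (kept_agencies, share_of_stops_dropped).
    """
    counts = Counter(agency_ids)
    kept = {agency for agency, n in counts.items() if n >= threshold}
    total = sum(counts.values())
    dropped = sum(n for agency, n in counts.items() if agency not in kept)
    share_dropped = dropped / total if total else 0.0
    return kept, share_dropped
```

Reporting the share of stops dropped alongside the set of retained agencies mirrors the way the thresholds are described above: many small agencies can be removed while losing only a small fraction of observations.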
Using these datasets, we conduct the most conservative analysis we can, based on the data made available in each state. We should note that because we have millions of observations, statistical significance is all but guaranteed. Nevertheless, because searches are rare (occurring in <5% of traffic stops), the large N gives us analytical power. For each state, we estimate a logistic regression predicting whether a given traffic stop will lead to a search, controlling for other factors. Since each state collects different contextual factors about the traffic stop, we estimate a slightly different model in each state. The independent variables included in each regression are listed in Table 4. We should note that by including hour of day in the NC model, we are forced to drop millions of observations because the time of the stop was not recorded. In robustness checks reported in the appendix, we re-estimate the models for NC excluding hour of day fixed effects; the results remain the same.
Note: X indicates the variable was included. A blank indicates the variable was not available.
Analysis of the interaction of stop purpose and race
Who gets searched?
Table 5 reports the results of logistic regressions estimating the likelihood that a driver is searched following a traffic stop. A separate regression is fit by state and gender, using the variables described above and fixed effects for the police agency that conducted the stop. Recall that our hypotheses are that black drivers (H1) and drivers subject to an investigatory stop (H2) are more likely to be searched. H3 is that the relationship between driver race and being searched is amplified when the traffic stop was motivated in the first place by a suspicion of criminal behavior.
Note: *p < 0.05. White drivers are the reference category for black drivers.
We find broad support for our hypotheses. However, due to the interaction in the model between driver race and stop type, it is difficult to interpret any coefficient in isolation. As a result, we proceed slowly. First, we hypothesized that black drivers will be more likely to be searched than white drivers (H1). For men, this hypothesis finds strong support. In every state, for safety stops, the coefficient associated with the black driver variable is positive and significant, meaning black drivers are more likely to be searched than the white reference category. These disparities only grow in investigatory stops as we will discuss later. These effects persist even with the control variables included in the models. However, results are mixed when isolating female drivers. In IL, black female drivers are more likely to be searched than their white counterparts, but the opposite is true in MD and NC. In CT, there are no statistically meaningful differences along racial lines in the likelihood of a search for female drivers. These results justify the decision to separate our analyses by gender and suggest that stereotypical criminal profiles are a major driver of both racial and gender disparities.
Second, we hypothesized that those subject to an investigatory rather than a safety stop will be more likely to be searched. In three of the four states, there is statistically significant support for this hypothesis: drivers pulled over in investigatory stops are more likely to be searched, compared to those pulled over for safety violations. In CT, IL, and NC, the investigatory stop coefficient is positive and statistically significant as hypothesized. In MD, however, the investigatory stop coefficient is negative and significant, counter to our prediction. Results are substantively the same for male and female drivers.
Finally, support for H3 would be seen if the coefficient associated with the interaction term is positive and statistically significant. In MD, NC, and IL, we find support for both male and female drivers. This means that black drivers (regardless of their gender) pulled over for an investigatory purpose face an added penalty, above and beyond the impact of just being black or just being pulled over in an investigatory stop. Results are different for CT. For male drivers, there is no statistically meaningful evidence of the interactive effect we find for the other states. For female drivers, the interaction appears to work in the opposite direction, meaning that white female drivers are more likely to be searched after investigatory stops. This further emphasizes the different gender dynamics driving police behavior.
Figure 1 helps illustrate the substantive importance of these findings, showing the predicted probabilities drawn from the estimates in Table 5. Panel A looks at male drivers and panel B at female drivers. The lines at the top of each bar show 95% confidence intervals. In every state, black drivers are more likely to be searched following either an investigatory or safety-related stop. Additionally, in every state except for MD, drivers (white or black) are more likely to be searched after an investigatory stop. For example, in IL, the predicted probability of a black driver being searched following a safety stop is approximately 2.5%, while a white driver in a similar situation sees a predicted probability of being searched of approximately 1.0%. Figure 1 also makes clear the variation across states: in some states, there is only a minor increase in the probability of being searched following an investigatory stop, while in others the increase is much larger. Furthermore, the added penalty black drivers face following an investigatory stop varies. This hints that there is important variation between the states (i.e., culture, policy, etc.) that could be explored in future studies, but this is outside of the scope of the current paper.
Panel B of Figure 1, which plots the predicted probabilities for female drivers, provides only mixed support for our hypotheses. In IL, MD, and NC, black female drivers are more likely to be searched after an investigatory stop, but the opposite is true in CT. Note too that the confidence intervals are wider, indicating less certainty about the point predictions: female drivers are less likely to be searched than males, so there are fewer observations.
Figure 2 shows the increase in the predicted probability of search for black drivers as a difference-in-difference computed from the four predicted probabilities shown in Figure 1. Black drivers are generally more likely to be searched than white drivers, but this figure shows how that disadvantage grows when the underlying stop is investigatory rather than safety-related. This demonstrates the distinct impact of the investigatory stop and race interaction, which amplifies the risk of a search for black drivers. Figure 2 demonstrates that for men we see a consistent racial penalty for black drivers in investigatory stops. In NC, this amounts to a roughly 3 percentage point increase in the likelihood of being searched, more than half the average search rate. For women, we see a much smaller effect than for men, about one-third the average effect, and this relationship is not significant in CT or NC.
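The difference-in-difference described above can be sketched from a set of logit coefficients. The coefficient values below are hypothetical and are not estimates from Table 5; the dictionary keys and function names are likewise assumptions of this illustration, and other covariates are treated as held fixed and absorbed into the intercept.

```python
import math


def inv_logit(z: float) -> float:
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_search_prob(coefs, black: int, investigatory: int) -> float:
    """Predicted probability for one race/stop-type cell, holding any other
    covariates at the values already absorbed into the intercept."""
    z = (coefs["intercept"]
         + coefs["black"] * black
         + coefs["investigatory"] * investigatory
         + coefs["interaction"] * black * investigatory)
    return inv_logit(z)


def difference_in_difference(coefs) -> float:
    """(black - white gap in investigatory stops) minus (gap in safety stops)."""
    gap_investigatory = (predicted_search_prob(coefs, 1, 1)
                         - predicted_search_prob(coefs, 0, 1))
    gap_safety = (predicted_search_prob(coefs, 1, 0)
                  - predicted_search_prob(coefs, 0, 0))
    return gap_investigatory - gap_safety


# Hypothetical coefficients, not estimates from Table 5.
example = {"intercept": -4.6, "black": 0.9, "investigatory": 0.5, "interaction": 0.3}
print(round(difference_in_difference(example), 4))
```

Working on the probability scale is what makes the quantity directly comparable to a search rate. Note that on this scale the difference-in-difference generally depends on the baseline probabilities as well as on the interaction coefficient, which is one reason to report predicted probabilities alongside the coefficients.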
In addition to facilitating a test of our hypotheses, the models demonstrate that there are important driver characteristics to consider, beyond the race of the driver. Age has a consistently negative and significant effect: searches are targeted on younger drivers. There are mixed findings for out-of-state drivers. In CT, out-of-state drivers are more likely to be searched to a significant degree, while in MD the effect is negative and not statistically significant. We see that high disparity officers (or “bad apples”) are always more likely to search drivers to a significant degree, but this is by construction as we defined these officers as having searched drivers at above the mean search rate for their agency. The importance of this variable is that, where present, it does not reduce the powerful racial effects apparent in the other coefficients; “bad apples” are far from the entire story. We also see that in IL (the only state with the information), vehicle age has a significant adverse effect on search rates, which reinforces previous findings as well. Importantly, where we can control for more variables, none of them causes the race effects to be attenuated.
Overall, results support our hypotheses, very clearly for male drivers and somewhat less so for female drivers. Black drivers are more likely to be searched than white drivers following a traffic stop. We show that in most cases those pulled over in investigatory stops are searched more than those pulled over in safety stops (and in the case where this is not true, there is still a large racial disparity). Finally, we demonstrate that the relationship between race and investigatory stops is interactive; that is, being both black and in an investigatory stop carries an added penalty beyond the sum of the two separate effects. In the next section, we turn to the rates at which these searches yield contraband.
Who is found with contraband?
The previous analysis ignores whether observed differences are due to differential criminality rates. To address this, we perform an outcome-based test to examine whether the drivers who are searched tend to be found with contraband. The logic of an outcome-based test is as follows: if black drivers are found to be carrying contraband more often than white drivers, then searching black drivers more often is justified, as the police are simply targeting their searches at those who carry contraband. Conversely, if black drivers are found with contraband less often, then the higher search rates are not justified by correspondingly high rates of contraband possession. Table 6 reports the results of the logistic regression predicting the likelihood of finding contraband given that a driver has been searched. Thus, this analysis is limited to drivers who are searched, rather than all drivers.
Note: *p < 0.05. White drivers are the reference category for black drivers. The analysis includes male drivers only.
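The logic of the outcome-based test can be illustrated with raw hit rates. The sketch below is a simplification: it compares unadjusted proportions rather than fitting the logistic regression reported in Table 6, and the function name, record layout, and toy records are all assumptions made for illustration.

```python
from collections import defaultdict


def hit_rates(searches):
    """Compute the contraband hit rate for each (race, stop_type) group.

    `searches` is an iterable of (race, stop_type, found_contraband) tuples,
    one per searched driver. Drivers who were not searched never appear,
    mirroring the restriction of the analysis to searched drivers.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [hits, searches]
    for race, stop_type, found in searches:
        counts[(race, stop_type)][0] += int(found)
        counts[(race, stop_type)][1] += 1
    return {group: hits / total for group, (hits, total) in counts.items()}


# Toy records invented for illustration only, NOT data from the article.
toy = (
    [("white", "investigatory", True)] * 3
    + [("white", "investigatory", False)] * 7
    + [("black", "investigatory", True)] * 2
    + [("black", "investigatory", False)] * 8
)
rates = hit_rates(toy)
```

In this toy example the hit rate for searched black drivers is lower than for searched white drivers, which under the outcome-test logic would indicate that a higher search rate for black drivers is not justified by contraband discovery.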
We hypothesized that black drivers would be less likely to be found carrying contraband, that less contraband would be found in investigatory stops, and that this disparity would be greater for investigatory stops than for safety stops (H4). The results shown in Table 6 generally support our hypothesis for both men and women. In three of four states, we see a negative and statistically significant coefficient associated with the black driver variable, meaning that black drivers in safety stops are less likely to be found with contraband. In NC, black drivers in safety stops are actually more likely to be found with contraband, which is contrary to our expectations. Our expectations for investigatory stops are largely confirmed for men, but we see less support for women. For white men, in two states, we see a positive and significant coefficient on the investigatory stop variable, meaning white drivers in investigatory stops are more likely to be found with contraband than white drivers in safety stops. However, for white women, we see a negative and significant coefficient in two states and a positive and significant coefficient in one state, largely contrary to our expectations. For black drivers, the relationship is more stable across gender and in line with our expectations. Except for the case of black women in CT (where we do not see statistical significance), the interaction term is negative, and in four cases it is both negative and significant. This means that black drivers searched after an investigatory stop are less likely to be found with contraband. Although support is mixed, the results generally support H4. Black drivers are more likely to be searched in investigatory stops yet less likely to be found with contraband, demonstrating that the racial disparities observed are not explained by “good policing”.
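Because the model includes an interaction, each coefficient has a conditional reading: the black driver coefficient describes safety stops, the investigatory stop coefficient describes white drivers, and the interaction term shifts the racial gap in investigatory stops. The following sketch shows this with an inverse-logit calculation. It omits all other covariates, and the parameter names and coefficient values are hypothetical, not estimates from Table 6.

```python
import math


def predicted_probability(intercept, b_black, b_invest, b_interaction,
                          black, investigatory):
    """Inverse-logit of a linear predictor with a race x stop-type interaction.

    `black` and `investigatory` are 0/1 indicators; other covariates are
    left out of this sketch.
    """
    eta = (intercept
           + b_black * black
           + b_invest * investigatory
           + b_interaction * black * investigatory)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients chosen only to show the pattern of signs.
coefs = dict(intercept=-1.0, b_black=-0.2, b_invest=0.3, b_interaction=-0.4)

# Predicted probability of finding contraband for each of the four groups,
# keyed by (black, investigatory).
p = {
    (black, inv): predicted_probability(**coefs, black=black, investigatory=inv)
    for black in (0, 1)
    for inv in (0, 1)
}
```

On the log-odds scale, the racial gap in safety stops is `b_black`, while the gap in investigatory stops is `b_black + b_interaction`, so a negative interaction term widens the gap in hit rates for investigatory stops.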
To better illustrate these findings, we once again turn to predicted probability plots. Figure 3 shows the predicted probabilities of finding contraband across race, gender, and stop type for all four states. As in the previous section, 95% confidence intervals are shown, and all other variables are held at their mean or mode, as appropriate.
Figure 3a shows that contraband hit rates following stops of a given type are lower for black male drivers than for white male drivers, with one exception: safety stops in NC. (To see this, compare bars of the same shade of grey within each state.) This demonstrates support for H4. The figure also demonstrates that the racial disparity is higher for investigatory stops than it is for safety stops. (To see this, compare the difference between the darker bars and see that it tends to be higher than the difference between the lighter bars.) In the case of NC, we see that while black drivers are more likely than white drivers to be found carrying contraband in safety stops, the reverse is true for investigatory stops. Figure 3b shows very similar results. Except for NC safety stops, black women are less likely than white women to be found with contraband following a stop. We see that the racial disparity tends to be slightly higher in investigatory stops for women, but this relationship is not as stark as it is for men.
Combined, these analyses paint a compelling, largely consistent, and bleak picture. Black drivers are more likely to be searched by the police, and these searches are not justified by contraband hit rates; this is especially true for black men. These disparities are exacerbated by institutionalized policing practices, in this case, the investigatory stop. Searches following investigatory stops show higher racial disparities than those following traffic safety stops, even though they are less likely to find the driver carrying contraband. Not only do black drivers face disparities in traffic stop treatment, but these differences are not justified by higher rates of discovery of contraband.
Discussion
Looking at more than 40 million traffic stops across four states, we asked a simple question: Are the police using the pretext of expired registration tags or broken tail lights as an excuse to conduct a criminal investigation based on a stereotype that makes young black male drivers particularly vulnerable to investigation? The answer is yes. Our findings are therefore troubling and yet they point to a simple reform that may be effective in reducing disparities: stop using the traffic code as a pretext for criminal investigations. Doing so would result in more racially equitable outcomes and would have other benefits as well.
First, the routine and high-volume use of traffic stops as a crime-fighting tool is a needle-in-the-haystack statistical proposition, and its public safety benefits must be weighed against its costs. In Whren, the Supreme Court assessed the costs to be low and implicitly made the reasonable assumption that the benefits were appreciable, given that the practice was so widespread. It is time to question that. Contraband hit rates are low, and the vast majority of contraband “hits” are very small amounts, typically not leading to arrest even when contraband is found (see Baumgartner, Epp, and Shoub 2018 for more information).
Beyond the low pay-off in public safety from identifying and arresting criminals, the routine and large-scale use of the traffic code as an excuse to investigate drivers of color has a strongly negative effect on citizen trust. When we look for reasons to explain low levels of trust and cooperation between communities of color and the forces in blue sworn to protect them, it is obvious that over-targeting young men of color is not likely to breed trust and cooperation. Rather, alienation, anger, and withdrawal are predictable results of feelings of unfair interactions with the criminal justice system (Tyler and Jackson 2014; Tyler, Jackson, and Mentovich 2015).
Finally, ending investigation-based traffic patrols would allow the police to reallocate their resources to other activities. Traffic patrols could focus on reducing accidents, which kill tens of thousands of Americans each year. Other resources could be directed toward targeted investigations of criminality, not hunch- and stereotype-based investigations that typically come up empty.
Of course, we have concentrated our analyses on black and white drivers alone. There are stereotypes associated with other racial-ethnic categories, such as Latinx, Asian, and Native American drivers, that also shape police treatment and traffic stop outcomes. However, because this analysis focuses on gender and stop type in addition to race, we did not have space to sufficiently address the way that other racial-ethnic stereotypes may affect treatment and traffic stop outcomes. Previous work has demonstrated that Latinx drivers, especially young men, are targeted for searches during police traffic stops (Baumgartner, Epp, and Shoub 2018; Christiani 2020), and that the stereotypes shaping treatment of Native Americans lead to high levels of scrutiny, while those that exist for Asians lead to lower levels of scrutiny (Christiani 2020). Future work may expand on the way that other stereotypes interact with stop purpose to produce disparate outcomes in policing.
“Driving while black” surged into the national consciousness and debate in the late 1990s. NC was the first state in the nation to mandate the collection of demographic information on routine traffic stops. It is worth remembering the premise and the supposed promise of this legislation. In an editorial praising the bill, the Raleigh News and Observer wrote:
The numbers … should settle this issue of equitable treatment once and for all…. If the patrol is, as many blacks believe, unfairly targeting them, it must be stopped immediately. If not, the patrol deserves to be exonerated (Editorial Board 1999).
Now we know the results, for NC and other states; they could hardly be clearer. But police agencies have changed from suggesting that disparities are unacceptable indicators of bias and must be eliminated to suggesting that unobserved factors explain the persistent differences uncovered in virtually every police agency where they have been investigated. Our results show that this is not true. Moreover, they point to a simple solution: focus on traffic safety.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/rep.2020.35.