My home state of California is one of the most liberal states in America — Democrats have supermajorities in both houses of the legislature — but was, ironically enough, the first to ban the use of racial preferences in state programs. That happened when voters passed Proposition 209, by a 55-45 margin, in 1996; bans like this exist today in only eight other states.1 In June 2020, the California legislature voted to put a measure, Proposition 16, on the November ballot that would repeal Prop 209 and reinstate the ability of state officials to use racial preferences.2 A few weeks before the election, I came across a nonpartisan guide to the November ballot measures; the guide had a little icon to summarize each measure. For Prop 16, the icon showed a white hand reaching down from the top of the picture and clasping a black hand reaching up from the bottom. This nicely captures the classic conception of affirmative action — a gesture of interracial fellowship to provide a “helping hand” to people at a disadvantaged, lower level.
In this article, I hope to show that although there are forms of affirmative action that may be as simple, straightforward, and fundamentally good as this helping hand, affirmative action in higher education predominantly takes the form of large racial preferences in admissions, and these are another matter indeed. Heavy racial preferences not only involve aggressive discrimination against some disfavored group (increasingly, another minority group),3 but often backfire in multiple ways, and can end up causing problems far more insidious and intractable than those they are intended to solve.
This article examines affirmative action in medical schools, but at the outset let me offer a general disclaimer. I am a law professor, not a medical school professional. I have done a good deal of original research and writing on the operation and effects of racial preferences upon law students and upon the broader patterns and consequences of affirmative action in higher education.4 My knowledge of medical school and the use of race in medical school admissions is limited and largely second-hand, but I have been interested for some time in how my findings from legal education might translate to the medical-school context, since the two have both important similarities and striking differences. I was therefore grateful for the invitation to participate in this symposium. In this essay, I consider various dimensions of affirmative action, explain some of the key findings from the existing research on law schools or undergraduate education, and then compare these patterns with what I have been able to learn about medical schools. I identify throughout some important issues that I think should be further investigated in the medical academy; but at this early stage of investigation, I view my findings as suggestive, not definitive, and I hope they will be taken in the spirit of comments from a friendly outsider.
I. The Size of Racial Preferences
Absent special intervention, Black students and Hispanic students will be underrepresented in highly competitive colleges and graduate programs not because those institutions invidiously discriminate against them, but because there are large performance gaps between racial groups. Admissions based on objective academic standards will produce underrepresentation of Blacks, Hispanics, and American Indians, which is why universities refer to those groups as “underrepresented minorities” (“URMs”). The performance gap is most widely documented at the high school level. The National Assessment of Educational Progress, which uses a variety of tests to assess learning among large samples of students in K-12 schools, finds that the median educational achievement of whites in high school is about four years ahead of Blacks, and about three years ahead of Hispanics.5 These gaps are mirrored in high school performance on the SAT, where average Black scores are about one standard deviation below white scores.6 The racial gap in high school GPA is smaller but still substantial.
Why do such large academic gaps exist, for Blacks in particular? Partly because housing segregation relegates Black students to lower-quality (though not necessarily lower-funded) elementary and secondary schools;7 partly because lower average socioeconomic status and lower-quality medical care translate into lower birthweights for Blacks (which on average harms cognitive function) and less robust diets in early childhood; partly because there are racial disparities in parenting practices — propensities to keep books around the house, to use a wide variety of words in speaking to young children, to enforce regular bedtimes and limits on television — which disfavor Blacks and undermine their cognitive development.8 In short, there are many causes, none of which reflect genetic differences in racial capacity (thus, a racial performance gap is not “intrinsic”) but which are complex and require multi-layered strategies to address.
Unfortunately, in recent years, and especially in the past year, it has become common to “explain” the racial academic achievement gap as a result of “systemic racism” or “structural racism.” This has shown up even in the literature on medical education.9 The problem with this approach is that it consigns the achievement gap to some vague, unknowable, and unsolvable void of endemic inequality, when there are in fact some highly specific problems — such as housing segregation and inadequate prenatal medical care — that can be specifically identified and addressed.10 In any case, because of the large performance gap among 12th-graders, using strictly academic indicators for college admission would lead to substantial underrepresentation of Blacks and, to a lesser extent, Hispanics. Roughly speaking, Blacks made up 13% of American 18-year-olds in 2013, but only 5% of those with grades and test scores that put them in the top third of high school seniors in academic achievement, and only 2% of those in the top tenth. Even very elite colleges, of course, admit based on other than academic factors, and some use socioeconomic preferences — all of which reduce the Black-White gap in “admissibility.” However, such measures (as currently used) only make up a fraction of the racial gap. The recent Harvard litigation, for example, revealed that with no consideration of race, African-Americans would constitute fewer than 3% of Harvard freshmen.11
At Harvard and most elite undergraduate colleges, administrators seek to create an admitted class that roughly mirrors the racial makeup of applicants, so large racial preferences are used to bridge the gap. At Harvard, African-Americans made up about 14% of applicants, and about 14% of admittees, more than four times the number that would be admitted if race were not factored in.12 In the suit against Harvard, the plaintiff’s expert, Peter Arcidiacono, analyzed the probability of an applicant’s admission if, on the various qualities considered by Harvard, the applicant ranked close to the median of admitted students, and one varied only the student’s race. He found that an Asian-American applicant with these characteristics had a 25% chance of admission, compared to a 36% chance for an otherwise-similar white applicant, and a 95% chance for an otherwise-similar Black applicant.13 Arcidiacono’s finding is consistent with my own analysis of the data, and implies that Harvard uses quite substantial racial preferences. Moreover, once Tier 1 schools (like Harvard) implement racial preferences, they absorb not only the “Tier 1” Black students who would qualify on race-neutral grounds, but all those who would qualify for Tier 2 schools as well. Tier 2 schools thus start their admissions process with little hope of recruiting either “Tier 1” or “Tier 2” Blacks and must consequently use even larger preferences than the Tier 1 schools. This “cascade effect” means that the preferences are usually more conspicuous at selective and moderately-elite schools than at the very top schools.14
These patterns exist at law schools as well, often in even more rigid and stark forms. Among law school applicants, the Black-White gap in average LSAT scores is about one standard deviation; the gap in college GPAs (undergraduate grade-point-average, or “UGPA”) is nearly that large. Undergraduate colleges often give significant weight to non-academic factors, such as athletic prowess, leadership skills, good essays, strong letters of recommendation, and yes, often legacy status. Law schools focus heavily on the academic numbers; they are able to achieve enrollments that are about as racially diverse as their applicant pools by essentially race-norming LSATs and UGPAs, and thus admitting equivalent top shares from each racial group. Because law school applicants place great weight on the US News ranking of law schools15 (e.g., students apply to many schools and tend to enroll in the most highly-ranked school that admits them), the cascade effect operates at law schools as well. Thus, Tier 2 and Tier 3 schools use even larger preferences than those in Tier 1. These factors combined mean that there is very little racial overlap at most law schools in credentials; the 90th percentile of Black students at a given school often have credentials that are lower than the 10th percentile of their white classmates.
Medical schools share some features of these patterns but are also different in important ways. The Association of American Medical Colleges (“AAMC”) gathers a variety of data on medical school admissions, and its data on the admissions cycle for 2020 matriculants allow us to measure the credential gap and admissions rate of American medical school applicants, by race (Table 1).
Table 1 Credential gaps, and the distribution of applications and matriculations AAMC medical schools, entering 2020-2021 cohort.

Source: AAMC Table A-18, available at <https://www.aamc.org/media/6066/download> (last visited March 26, 2021). Standard deviation gaps are measured by averaging the white SD and the specific race SD, and dividing this into the mean score gap between the group and whites. Percentages in the columns do not add to 100% because 21% of applicants are non-U.S. citizens, multiracial, or of other races.
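The standard-deviation gap described in the source note can be written out as a short calculation. The sketch below is only an illustration of that definition; the function name and the input numbers are placeholders chosen for this example and are not values from AAMC Table A-18.

```python
def standardized_gap(mean_white, sd_white, mean_group, sd_group):
    """Mean score gap between whites and a group, in units of the averaged SD."""
    # Average the white SD and the group-specific SD, as the note describes.
    averaged_sd = (sd_white + sd_group) / 2
    # Divide the averaged SD into the mean score gap.
    return (mean_white - mean_group) / averaged_sd


# Placeholder inputs (hypothetical, not AAMC figures): a 9-point mean gap
# with SDs of 10 and 8 corresponds to a gap of one averaged SD.
example_gap = standardized_gap(100.0, 10.0, 91.0, 8.0)
```

Averaging the two SDs, rather than using the white SD alone, keeps the measure from depending on which group happens to have the wider spread of scores.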
Table 1 illustrates two things. First, there are very large credential gaps among applicants to medical school of different races, similar to those we observe among high school seniors applying to college and college seniors applying to law school. But second, each racial group is represented among medical school matriculants in numbers that closely approximate its representation in the applicant pool. Data from individual schools tend to show these patterns, too, so the large credential gaps we see in the overall applicant pool are probably replicated at individual schools.
This necessarily means that medical schools are using quite large racial preferences — largest for Blacks, a little smaller for American Indians, and more moderate, but still substantial, for at least some Hispanic subgroups. It would therefore not be surprising to see large performance gaps across racial lines at medical schools, too.
However, medical school admissions are different from law school admissions in two key respects. First, medical schools give substantial weight to factors other than test scores and grades. They invest significantly more time and effort (including faculty time and effort) in the admissions process, often (or, as I am told, always) interviewing a large share of applicants and a very large share of those actually admitted. Second, medical schools that are part of state university systems tend to give substantial weight to whether an applicant is an in-state resident — a preference that in some cases is comparable to the school’s racial preferences. Both of these factors imply that academic credentials will not be as starkly aligned with race in many (perhaps most) medical schools as they are in almost all law schools.
I illustrate this tangibly with data from two state graduate programs: the University of Michigan Law School and the University of Wisconsin Medical School, both of which are highly-ranked public-school programs.16 The academic indices used in these two comparisons are of my own construction, but they are based on analogous combinations of test scores and grades as weighted by the respective schools. (The academic index scales from 0 to 1000, with test scores [LSAT or MCAT] given up to 600 points, and college GPAs given up to 400 points.) Table 2 and Table 3 are thus “calibrated” in a way that allows general comparisons of the admissions rate by academic index.
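To make the 0-to-1000 scale concrete, here is a minimal sketch of one way such an index could be computed. The text gives only the point caps (600 for the test score, 400 for college GPA); the linear rescaling, the 4.0 GPA ceiling, and the function name below are assumptions made for illustration and are not the author's actual construction.

```python
def academic_index(test_score, test_min, test_max, gpa, gpa_max=4.0):
    """Combine a test score and a GPA into a 0-1000 index (illustrative only)."""
    # Assumption: the test score is rescaled linearly onto 0-600 points.
    test_points = 600 * (test_score - test_min) / (test_max - test_min)
    # Assumption: the GPA is rescaled linearly onto 0-400 points.
    gpa_points = 400 * gpa / gpa_max
    return test_points + gpa_points


# Example on the LSAT's 120-180 scale: a mid-scale score with a 2.0 GPA.
mid_index = academic_index(150, 120, 180, 2.0)
top_index = academic_index(180, 120, 180, 4.0)
```

Because the test-score range is passed in as a parameter, the same function can be applied to either an LSAT or an MCAT scale, which is the sense in which the two tables are comparable.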
Table 2 Comparative Admissions Rates at the University of Michigan Law School, 1999 By Academic “Index” and Race

Source: Sander, “Why Strict Scrutiny Requires Transparency,” in Kevin McGuire, ed., New Directions in Judicial Politics (2012), p. 288, Table 15.2.
Table 3 Comparative Admissions Rates at the University of Wisconsin Medical School, 2013 By Academic “Index” and Race

Source: Author’s analysis of database provided by UW, 2013
There are two striking similarities between the law school and medical school patterns: Blacks have much higher admission probabilities than whites within any particular index range; and admission probabilities rise steadily with higher academic indices. In other words, academic credentials matter a lot, and racial preferences are large. However, there are two very striking differences as well. First, the racial discrepancies across particular levels of academic index are much less extreme at the medical school. Second, modest rises in academic index do not have, at the medical school, the dramatic effect upon admissions that they have at the law school. At the law school, for example, fairly high index levels essentially guarantee admission; at the medical school, even extremely high scores provide no such guarantee. And at the medical school, the heavy in-state preference means that some whites with quite modest academic credentials will be admitted.
The upshot of these differences is that there is significantly more racial overlap in objective credentials at medical schools than at law schools. We should therefore expect to see less extreme performance differences at medical schools as well.
A third factor that appears to distinguish medical schools from law schools and undergraduate colleges is the lesser emphasis placed upon school ranking (eliteness). In law schools, credentials that would put one near the top of the class at the University of North Carolina (ranked around #30 by US News) would put one near the bottom of the class at the University of Virginia (ranked around #9). Correspondingly, credentials that would put one near the bottom of the class at UNC would put one near the top of the class at the University of New Mexico (ranked around #79).17 As I noted earlier, students place great (arguably excessive) emphasis on attending a more elite law school, probably because more elite schools have, to some degree, better access to the “big firm” jobs that pay much higher salaries than other entry-level legal jobs. Medical schools appear to be significantly less hierarchical; the academic qualifications of students attending mid-ranked schools overlap more with those at top-ranked schools and indeed, there are medical schools ranked below the 30th position that have median student credentials higher than some schools in the top ten.18 This means, among other things, that the “cascade effect” I’ve described is less likely, in the medical school context, to aggravate the problem of racial credential disparities. Here, as elsewhere, analysis of data collected by national organizations like AAMC could tell us definitively whether my inferences from somewhat fragmentary data hold up.
II. The “Mismatch Effect” and the Pipeline to Medical School
Blacks make up about 14.5% of US undergraduates, but only 10.5% of college graduates and 8% of medical school applicants.19 Why does the Black share decline as we move through the higher education pipeline? One likely factor is affirmative action itself — in particular, the type of affirmative action that takes the form of very large admissions preferences. What I refer to as the “mismatch hypothesis” has generated a large literature among economists (who refer to it as a “peer effects” hypothesis) as well as psychologists and sociologists (who sometimes refer to it as the “frog pond” hypothesis).20 These hypotheses suppose that when a student is placed in an environment where the bulk of her peers (the other students) have stronger academic preparation, that student is likely to learn less, or compete less effectively, than in an environment without preferences, where the student’s peers have comparable levels of academic preparation.
There is still genuine debate about the degree to which racial preferences in undergraduate admissions reduce Black graduation rates.21 That is because preferences clearly have both a positive and a negative effect upon college graduation. The positive effect occurs because preferences lift students into colleges where graduation is the norm; once admitted, it is hard *not* to graduate from Harvard, but easy not to graduate from Oregon State.22 The negative effect occurs because, at a school where graduation is far from assured, large preferences increase the chance of failure. My own institution (UCLA) used very large racial preferences in the early-and-mid 1990s, before Prop 209 made them illegal, and produced terrible outcomes for Black students; only 13.5% of its Black matriculants graduated with a B.A. “on time” (i.e., in four years), and fewer than half ever graduated. Within ten years of the implementation of Prop 209, those rates had risen to 53% and 84%, respectively.23 The evidence on whether the positive or negative effects of preferences predominate, when eventual graduation is the outcome of interest, is mixed.
There is much less doubt about the phenomenon of “science mismatch.” A series of careful studies by psychologists and economists published in top journals over the past twenty-five years have consistently found that students receiving large preferences are less likely to persist in the sciences than they would have at a school where they did not receive a preference.24 A student aspiring to become a chemist (for example) who receives a large preference into college will almost certainly find herself surrounded by students with stronger credentials and more extensive science preparation. Her grades and learning will suffer, and she is likely to either switch to a less competitive major or drop out of college altogether. Importantly, most of the research in this field finds that race per se has little effect; white students who receive preferences (say, through legacy considerations) experience the effects of science mismatch to the same degree as Blacks and Hispanics.25
One symptom of the “science mismatch” phenomenon is that although Blacks are as likely as whites to express interest in a science career when they are high school seniors, they are much less likely to graduate with science degrees and still less likely to complete a science doctorate. In 2017-18, Blacks accounted for 10.5% of college graduates, but only 7.3% of bachelor’s degrees in STEM fields, and only 4.6% of doctoral degrees in STEM fields.26
This contributes to a real problem for diversity in medical education. In the first instance, the pool of minority students with enough science preparation to seriously consider medical school is eroded because mismatch increases attrition rates, as students struggling in science courses transfer to less-demanding majors. In the second instance, those minority undergraduates who do apply to medical school may well have weakened science backgrounds because mismatch has lowered their rates of learning in science courses. Among medical school applicants, the black-white gap in undergraduate science grades is twice as large as the GPA gap in non-science courses. Part of the “preference” likely given by many medical schools involves overlooking so-so science backgrounds from underrepresented minority (“URM”) applicants. It is likely that many URM students consequently struggle in the basic science courses that fill much of the first year of medical school, which in turn leads to the question of whether mismatch becomes a serious problem in medical school itself.
III. Does “Mismatch” Operate within Medical Education?
In legal education, there is a well-known racial gap in bar passage rates. A national study in 1997 found that Black graduates of ABA-approved law schools had a 38% chance of failing their first bar exam, compared to an 8% rate for whites.27 Contemporaneous state data — which are rarely public — showed similar disparities. The obvious explanation for the lower Black bar passage rate was that Blacks, on average, entered law school with weaker credentials than whites. But when one controlled for LSAT scores and college grades, about half of the Black-White gap in bar passage remained. Why was this?
One theory was that bar exams did not fairly assess some people’s knowledge — including Blacks to a disproportionate degree. But then Stephen Klein, a psychometrician at RAND and for many years the nation’s leading expert on bar exams, showed that when one controlled for law school grades (as well as LSAT and college grades), the unexplained Black performance deficit on the bar exam disappeared.28 Why, then, were Black law school grades so low? In 2004, I advanced the argument that Blacks received low law school grades simply because of large preferences, which systematically placed them in schools where they were at a competitive disadvantage. The extra penalty Blacks experienced in bar performance could be entirely explained by the effect of admissions preferences.29
Given the size of racial preferences used by many (and probably most) medical schools, it would not be surprising if the mismatch effect also hurt the learning process and educational outcomes of URM medical students. This would be particularly plausible during the first half of medical school, where students traditionally spend most of their days in large classes covering an immense quantity of challenging, technical material. In such an environment (still the norm in the first half of law school) professors tend to aim their lectures, materials and assignments at the middle of the class; in medical school, a student with weaker credentials or less science preparation than her classmates would be at a disadvantage, sometimes stimulated by and rising to the challenge, but often falling behind and learning less than she would have at a less elite medical school.
In recent decades, medical schools have introduced more varied teaching methods into those first two years. Some professors use online material and textbooks to convey the core material, and then meet with students in smaller groups to discuss it. Laboratory instruction is likely to be more individualized. Such factors might make medical education less subject to mismatch, because students are more able to set their own pace and ask their own questions.30 The University of Texas Medical Branch undertook a particularly ambitious curricular reform in the late 1990s and early 2000s, which introduced more experiential learning to early medical courses, emphasizing applied problem-solving and the early development of clinical skills. Importantly, the school found that its reforms became more effective — i.e., producing better learning outcomes, even as measured by national board exams — when the examination process was reformed as well to better match what students were now being taught.31
Halfway through medical school, students take their first national board exams (the “Step 1 Boards”), which focus on the scientific subjects students study during the first and second year of school. In general, passage rates are high — well over 90% in recent years. The administering body, the National Board of Medical Examiners (“NBME”), does not report outcomes by race, but periodic studies based on large samples of students have consistently found a large racial gap in pass rates. One of the earliest studies, which analyzed data from ten thousand students taking the Step 1 Boards between 1986 and 1988, found large racial and gender disparities in outcomes. The scores of women were, on average, about 1/3 of a standard deviation lower than those of men; the scores of Hispanics were about ½ of an SD lower than those of whites, and the scores of Blacks were a full SD lower than those of whites. Pass rates varied accordingly: 88% of white takers passed, compared to 66% for Hispanics and 49% for Blacks. The gender gap in pass rates was much smaller, about five points separating white men and women.32 Later studies of the “Step 1” exam — including the most recent, published in 2019 — have shown gradually rising pass rates, but very similar racial and gender disparities.33
Medical students must pass two further “Step” exams to obtain a license for general practice. The “Step 2” exam, which students generally take at the end of the fourth year of medical school, examines clinical skills that are taught in the hospital rotations students complete in their third and fourth years of school. The “Step 3” exam assesses comprehensive medical knowledge and its application to specific patient-care situations. Pass rates on these exams tend to be higher than on the Step 1 exam, and notably, women tend to slightly outperform men on both Step 2 and Step 3. Black and Hispanic test scores and pass rates, however, are much lower than white rates on both Step 2 and Step 3, and the magnitudes of the differences are similar and stable over time.34
As with the bar exam, we would expect that much of the racial gap in the medical board exams is explained by the racial disparities in distribution of academic achievement — UGPAs and MCAT scores — among entering medical students. And indeed, nearly all of the research shows that these two factors do highly correlate with “Step” scores and that these explain much of the racial gap. They also consistently find, however, that even after controlling for those factors, some racial gap remains.35 We would expect this sort of residual gap if mismatch were occurring. For example, if large preferences put many Black students in schools where they are at a learning disadvantage relative to their peers, we would expect them to underperform on the Step exams relative to their academic potential as estimated by UGPA and MCAT scores. If this were so, it would be easy to test: an analysis predicting performance on a Step exam should add a control for cumulative medical school grades or, even better, a control for each student’s “relative position” within their medical school class. If mismatch is occurring, these controls should make the residual racial gap disappear or at least greatly diminish in a regression predicting Step scores.
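The test proposed above can be sketched as a comparison of two ordinary least-squares regressions: one controlling only for entering credentials, and one that also controls for relative position in the medical school class. The code below is a minimal illustration under stated assumptions. The function name is hypothetical, and the eight data points are hand-made toy numbers, constructed so that exam score depends only on entering credential and class position; they are not drawn from any study or dataset, and a real analysis would need actual student records and standard errors.

```python
import numpy as np


def group_coefficient(outcome, group, controls):
    """OLS coefficient on a 0/1 group indicator, holding the controls fixed."""
    outcome = np.asarray(outcome, dtype=float)
    # Design matrix: intercept, group indicator, then each control variable.
    columns = [np.ones_like(outcome), np.asarray(group, dtype=float)]
    columns += [np.asarray(c, dtype=float) for c in controls]
    design = np.column_stack(columns)
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs[1]  # position 1 is the group indicator


# Toy data (invented for illustration only).
entry_credential = [1, 2, 3, 4, 1, 2, 3, 4]
group = [0, 0, 0, 0, 1, 1, 1, 1]
class_position = [0.5, 0.6, 0.7, 0.8, 0.1, 0.3, 0.2, 0.4]
# By construction the score depends only on credential and class position.
exam_score = [10 * c + 20 * p for c, p in zip(entry_credential, class_position)]

# Residual group gap when only entering credentials are controlled.
gap_entry_only = group_coefficient(exam_score, group, [entry_credential])
# Residual group gap once relative class position is added as a control.
gap_with_position = group_coefficient(
    exam_score, group, [entry_credential, class_position]
)
```

In this constructed example the first regression shows a residual group gap and the second shows essentially none, which is the pattern the mismatch hypothesis would predict. If real data showed a gap that persisted after adding the class-position control, that would count against mismatch as the explanation.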
But although the “residual racial gap” in the literature on Step scores certainly implies that mismatch may be a problem in medical schools, the gap is different in two important ways from what we observe in analyses of bar scores. First, it is smaller in the Step analyses: it generally seems to account for one-fourth to one-third of the Black-White gap in Step scores, whereas it accounts for 40%, 50%, or even more of the Black-White gap in bar scores. Second, Asian-American medical students — and in some analyses, women — also show an unexplained residual gap, even though neither of those groups is receiving large admissions preferences.
An important feature of medical education is that the first board examinations are taken relatively early — after the second year of medical school. If a student fails the Step 1 Boards, she is still in school and can presumably take remedial courses that improve the chances of passing on a second attempt. Indeed, many medical schools offer programs directly aimed at providing academic support both to help students prepare for the Step 1 Boards, and (especially) to assist those who fail them, even stretching out the time that students take to graduate in order to give them a strong grounding in the science fundamentals. This is very different from the situation in law schools, where the bar exam comes after graduation, and graduates who fail the bar neither receive nor expect any assistance from their alma mater in turning their performance around. A law graduate who fails a bar exam is often studying for a subsequent attempt while also dealing with unemployment or a law job in jeopardy from the initial failure. In other words, there is at least an institutional design in medical education that makes the school “own” bad student outcomes in a way that law schools (or undergraduate colleges) do not.
I have not found any systematic evaluation of these third-year academic support programs, though apparently many deans invest significant resources in the programs and believe they are helpful. One related study by Winston and others of academic support at Ross University is valuable and revealing. Ross, presumably like many medical schools, requires students who fail any courses to re-take the course and obtain a satisfactory grade. The authors found that if students simply re-enroll in a large course (say, re-enroll in biochemistry as a second-year after failing the course as a first-year student), the outcomes are not very good — students still have significant difficulty. But if the students enroll in a mandatory course targeted at those having academic difficulty, outcomes improve dramatically. One explanation of these findings is that the mandatory course, by creating a peer group of students having similar academic difficulties, and by teaching learning skills as well as substantive material, is addressing and largely solving the mismatch problem.Reference Winston, Van der Vlueten and Scherpbier36
It would be quite valuable to know more about what medical schools do to help students who fail Step 1 exams. Clearly, simply stretching out the time permitted to graduate is not enough; URMs who have difficulty on Step 1 exams often have difficulty on Step 2 exams,Reference Ripkey, Swanson and Case37 and there are substantial racial gaps in medical school graduation rates. Eventual graduation rates for non-Blacks over the 2007 to 2014 graduation cycles were approximately 95%; for Blacks, the rate was approximately 85%, which means the rate of non-graduation (roughly 15% versus 5%) was three times as high for Blacks as for non-Blacks.38
Moreover, some types of mismatch effect may become evident only after students complete medical school. A 1987 RAND study of early medical school programs in affirmative action provides strong, though indirect, evidence of a mismatch effect in board certification. The three authors gathered data on all “minorities” who completed medical school in 1975, along with a sample of “nonminorities.”Reference Keith, Bell and Williams39 The authors were generally sympathetic to the goals of affirmative action, and positive about its effects — documenting, for example, the high incidence of same-race relationships between minority physicians and their patients. However, they noted the high rate at which minority physicians did not become board-certified: 49% of minority physicians were board-certified in a specialty, compared to 80% of non-minority physicians.40 Of particular relevance to the “mismatch” question is the following table:41
The “performance index” here is a measure combining information on student MCAT scores and undergraduate grades. Keith et al. calculated the performance index using a methodology developed by NBME staff to predict scores on what was, in the 1970s, the “Part II” board examination. In other words, they weighed MCAT scores and college science GPAs into a combined index that optimized prediction of board scores.Reference Rolph, Williams and Laniear42
It seems clear from Table 4 that a student’s performance index was a strong predictor of whether she achieved board certification; the certification rate for both minorities and non-minorities rises sharply and monotonically with performance index. However, it is also clear that after controlling for performance index, a large racial gap remains, which is conceptually the same thing as the “racial residual gap” I discussed earlier. The data in Table 4 imply that less than half of the overall difference in certification rates between minorities and whites is due to the lower average performance index of minorities entering medical school.43 The rest could well be due to mismatch.
Table 4 Specialty Board Certification Rate by Undergraduate Performance Index

Source: S. Keith, R. Bell, and A. Williams, Assessing the Outcome of Affirmative Action in Medical Schools (1984).
Compare, for example, this table from my 2005 analysis of law school data:
Tables 4 and 5 show remarkably similar patterns, and in the case of Table 5 we have a great deal of external evidence suggesting that the horizontal gaps between Black and white bar passage rates are due to mismatch. Law school mismatch, it seems, has roughly the same effect upon one’s chances of bar passage as subtracting 120 points from one’s academic index. In Table 4, minorities appear to have about the same chance of board certification as non-minorities with performance indices about 80 points lower. If this effect, too, is driven by mismatch, then large admissions preferences are greatly compounding an initial problem of preparation disparities by putting minority students in schools where their learning is compromised.
Table 5 First-time Bar Passage Rates by Undergraduate Academic Index

Data computed from R. Sander, ‘A Systemic Analysis of Affirmative Action in American Law Schools,’ Stanford Law Review 57, No. 2 (2004), pp. 367-484, at Table 6.2, p. 446.
So far as I can tell, the “mismatch” issue has never been specifically studied in a medical school context, and there is not even readily available data to duplicate the RAND study for recent physician cohorts. It seems likely that the problem is less severe today, because the size of racial preferences in the 1970s was almost certainly larger than the preferences used today. Moreover, Table 4 is not, by itself, definitive proof that mismatch existed even in 1975. It is conceivable, for example, that minorities had lower board certification rates because, race aside, they came from lower SES backgrounds (see next section) or were generally different in some way that correlated with race. By far the best way to test for mismatch (when limited to observational data, rather than an actual experiment) is, in any case, not to use a racial surrogate, but to measure how far each student differs from the mean preparation level of her classmates and use that “mismatch” variable as one of several alternate predictors of outcomes, to see whether it has independent power.44
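The construction of such a mismatch variable can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any published analysis: the student records, school labels, and index values are hypothetical, the function name `mismatch_scores` is illustrative, and the choice to exclude the student herself from the classmate mean is one reasonable convention among several.

```python
from collections import defaultdict


def mismatch_scores(students):
    """Return {student_id: own index minus mean index of classmates}.

    `students` is a list of (student_id, school, index) tuples.
    The classmate mean is leave-one-out (an assumption of this sketch):
    it excludes the student whose score is being computed.
    """
    by_school = defaultdict(list)
    for _sid, school, index in students:
        by_school[school].append(index)

    scores = {}
    for sid, school, index in students:
        cohort = by_school[school]
        if len(cohort) < 2:
            continue  # no classmates to compare against
        peer_mean = (sum(cohort) - index) / (len(cohort) - 1)
        scores[sid] = index - peer_mean
    return scores


# Hypothetical records: (student id, school, preparation index).
students = [
    ("a", "School X", 700),
    ("b", "School X", 800),
    ("c", "School X", 900),
    ("d", "School Y", 700),
    ("e", "School Y", 650),
    ("f", "School Y", 750),
]

scores = mismatch_scores(students)
for sid in sorted(scores):
    print(sid, scores[sid])
```

In this made-up data, students "a" and "d" have the same preparation index but different mismatch values, because they sit at different points relative to their classmates. A test for independent power would then enter this variable into an outcome model alongside the student's own index and other predictors.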
Nonetheless, even the modest evidence I have reviewed here (on NBME Step I pass disparities, on graduation disparities, and on specialty board certification rates) suggests that mismatch is quite plausibly a major issue undermining both individual careers and the profession’s half-century-old effort to diversify. So far as I can tell, academic medicine has not given any consideration to mismatch as a plausible explanation of why Blacks have, for many years, made up 8% of matriculating medical students, but only 5% of young doctors.
A key question about medical school mismatch is whether, to the extent it exists, it can be rectified by simply lowering the size of preferences used by at least some medical schools, without shrinking the number of minority matriculants in medical schools as a whole. This goes back to a question I raised in Part II, about the less hierarchical pattern of medical school admissions compared to law school admissions. Within the legal academy, if the top forty law schools greatly scaled back their use of racial preferences, the Blacks and Hispanics denied seats they would have received under the preference regime would still be highly competitive at many lower-ranked schools, so there is no intrinsic reason why the number of Blacks entering law school would decline. Similarly, within the University of California, the end of racial preferences in 1998 produced mainly a reshuffling of Blacks and Hispanics across UC undergraduate campuses rather than a drop in overall URM enrollment (and even the small drop that occurred, as we shall see, was quickly offset by improved outreach). Whether the diversity of admissions standards across medical schools is great enough to similarly adjust to lower preferences without the loss of promising students is a question that could be answered with the sort of data collected by AAMC.
One of the challenges in addressing the mismatch problem — or even seriously investigating and discussing its possible existence — is the tendency of established interests to “shoot the messenger.” In the 1990s, the racial disparity in bar passage rates was considered an urgent and important problem; once the plausible idea was introduced that preferential admissions policies were in large part causing and seriously exacerbating the problem, discussion of the racial gap faltered, and releases of relevant data largely ground to a halt. Instead, pressure arose to ease grading curves and bar passage requirements, which were characterized as arbitrary barriers to diversity efforts.45 My sense, as an outside observer, is that similar pressures are operating — and indeed increasing — in the medical academy. A dramatic recent example is Norman Wang, a cardiologist at the University of Pittsburgh who published an article in the Journal of the American Heart Association in early 2020, analyzing affirmative action in medical schools (and in cardiology in particular). Although his article was apparently peer-reviewed and carefully researched, intense pressure arose for JAHA to retract the article – apparently solely for ideological reasons – which it did in August 2020. The University of Pittsburgh went a step further, removing Wang from an administrative position.46
Over the past five years, there has been a perceptible shift in the literature from articles documenting racial gaps on “step” exams and other measures of proficiency, to articles instead questioning the legitimacy of the “step” exams and other metrics of merit themselves, and questioning their utilization in such matters as selecting residents.Reference Teherani, Hauer, Fernandez, King and Lucey47 The recent move to change the Step I Boards to pass/fail grading, apparently motivated by arguments that scoring the boards undermined diversity efforts, exemplifies this shift in thinking.Reference Jones, Nichols, McNicholas and Stanford48
Of course, reducing underrepresentation by throwing away information is not a good solution. In the short term, it makes it more difficult to target academic support to those who performed badly on Step exams.Reference Makhoul, Pontell, Kumar and Drolet49 In the longer term, it undermines efforts to identify and remedy the sources of test score disparities, and to build better pipelines to medical school for underrepresented students. I agree with the critics of Step exams that the medical academy should do a better job of studying what factors or types of knowledge best predict high quality doctors, but that is a call for developing more and better information, not for censoring the information we currently have.
IV. The Conflict Between Race-Based Preferences and Socioeconomic Preferences
Most of the common rationales behind affirmative action concern its ability to “level the playing field,” to better represent the underrepresented, or to take into account individual hardship and obstacles overcome in assembling an incoming class. All of these rationales would seem better met by preferences based on socioeconomic status (“SES”) rather than race.50 As Barack Obama aptly put it in 2007, there was no good reason why his daughters should receive special preferences in college admissions.51 Using individual-level assessments of the circumstances applicants have actually faced in their lives makes more sense than using an intrinsic trait, like race, that embraces people who are advantaged as well as disadvantaged.
Several years ago, I analyzed data on parental education and occupation to assign SES scores to a nationally representative sample of young lawyers.Reference Sander52 I found that two-thirds of the lawyers came from households in the top quartile of SES, including 39% who came from households in the top tenth. Only 5% came from households in the bottom SES quartile. The share of top-quartile versus bottom-quartile lawyers was highest among whites (69% versus 4%), but it was high among all racial groups (for example, 53% versus 7% for Black lawyers). I also found that while the vast majority of law schools used racial preferences, almost none of the schools used SES preferences or even gathered systematic SES data from applicants; if anything, low SES appeared to be a disadvantage in the admissions process.53
What of medical schools? I have not found an attempt to create an “SES metric” for medical students or doctors, comparable to those used in my research and in many sociological studies, but the AAMC does collect systematic data on student backgrounds and has issued occasional reports.54 The data suggest that medical students come from even more privileged backgrounds than law students. Among medical students matriculating in 2008, 80% of the fathers had at least a bachelor’s degree, as did 76% of the mothers.Reference Guric, Garrison and Jolly55 By comparison, among lawyers who graduated from law school in 2000, 62% of the fathers and 50% of the mothers had at least a bachelor’s degree.56 As with Black lawyers, Black medical students were, on average, from less privileged backgrounds than white medical students, but nonetheless still mostly had upper-middle-class origins: 63% of their fathers, and 66% of their mothers, had at least a college degree.Reference Le57
In terms of income, 55% of medical students matriculating in 2006 had parents whose incomes placed them in the top quintile of American families. Only 4% came from families in the bottom quintile.Reference Youngclaus and Roskovensky58
From the standpoint of diversity, people from low-SES backgrounds are clearly less represented among the ranks of medical students (and hence, physicians) than are people who are racial minorities. While Blacks and Hispanics are underrepresented among the ranks of new physicians, relative to their numbers in the general population, by a factor of 2 or 2.5, people from low-SES backgrounds are underrepresented by a factor of 5 or more.59 Yet while nearly all medical schools collect data on the SES background of applicants, most do not appear to confer any significant SES preference. The only exceptions I have found are schools — like those in the UC system — that are legally enjoined from considering race.Reference Grbric60 As with law schools, racial preferences are not only much larger and more pervasive than class preferences; they also seem to deter schools from seriously considering class.
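The arithmetic behind an underrepresentation factor can be made concrete with a short sketch. The function name is illustrative, and the inputs are limited to the income figures given above (a quintile is, by definition, 20% of families). This shows one way to arrive at a factor of about 5 for low-income students; it is not necessarily the calculation behind the figures in the text.

```python
def underrepresentation_factor(population_pct, group_pct):
    """Ratio of a group's population share to its share of matriculants.

    Values above 1 mean the group is underrepresented by that factor;
    values below 1 mean it is overrepresented.
    """
    if group_pct <= 0:
        raise ValueError("group_pct must be positive")
    return population_pct / group_pct


# Bottom income quintile: 20% of families, 4% of 2006 matriculants.
bottom_quintile = underrepresentation_factor(20, 4)

# Top income quintile: 20% of families, 55% of 2006 matriculants.
top_quintile = underrepresentation_factor(20, 55)

print(bottom_quintile, round(top_quintile, 2))
```

The same function applies to any group for which both a population share and a matriculant share are available.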
Given my discussion of the possible mismatch problem in medical education, the reader may wonder why I am implicitly suggesting here that medical schools consider affirmative action based on socioeconomic status. Two distinctions are important here. First, a good deal of research suggests that relatively modest socioeconomic preferences can substantially increase SES diversity; and (compared to very large preferences) modest preferences produce much less mismatch, or none at all.Reference Bowen, Kurzwell and Tobin61 Second, efforts to “build the pipeline,” such as I describe in the final section of this article, can expand diversity without the use of preferences at all — one focuses on expanding the pool of qualified students, rather than applying differential admissions standards to an existing pool.
V. Race is an Increasingly Misleading Phenotype
As has often been noted, race itself is more a social construct than a genetic characteristic. Yet in 1970, around when racial preferences were adopted by many institutions of American higher education, it was at least a meaningful construct in the sense that “non-whites” in America had widely experienced powerful disadvantages related to their assigned race. The Black students who received preferences had, predominantly, two Black parents and four Black grandparents, and were highly likely to have ancestors who had been slaves in 19th-century America.
That is far less true today. Since 1980, the number of foreign-born Blacks in the United States has quadrupled, and their share of the Black population has risen from 3.1% to 8.7%.Reference Anderson62 The share of Blacks who are the children of immigrants has, of course, risen correspondingly. The number of children born of parents of different races has also increased sharply; “multi-racial” persons constitute the fastest growing racial group in the United States.63 And because both non-native Blacks and multiracial Blacks have, on average, higher test scores and stronger educational credentials than other Blacks, they make up a disproportionate share of the Blacks receiving preferential treatment. An analysis of Harvard Law School students in 2003 found that only about 30% of Black admittees were the children of two African-American parents.Reference Rimer and Arenson64
The issue has been compounded, of course, by the transition of the United States from a “two-race” society to a truly “multiracial” one. “Hispanics” — if they are considered to be a single racial group — now significantly outnumber Blacks by any measure. Yet the case for using preferences to achieve diversity varies widely across different ethnicities within the Hispanic umbrella, and indeed varies within specific ethnicities depending on lineage.
The point is that race has always been a problematic concept from a genotypic perspective; today in America it is becoming problematic from even a phenotypic perspective. The correlation between descriptive racial categories and whatever we think we are trying to achieve through preferences is highly, and increasingly, artificial.
VI. The Paradox Implicit in Pursuing “Educational Diversity” Benefits Through Large Preferences
My focus in this essay has been on the reasons why racial preferences are so difficult to implement in a fair and benign way. Implicitly, I have left on the table the presumed validity of the usual rationales given for affirmative action. But it is worth pointing out, briefly, that many of these are quite contestable.
For example, scholars at Duke University studied patterns of friendship at Duke over the four years of undergraduate education.Reference Arcidiacono, Aucejo, Hussey and Spenner65 At the beginning of freshman year, they found, students struck up friendships with a wide variety of people, mirroring to a significant degree the racial diversity at Duke. But over time, students’ friendship networks became more and more strongly associated with their academic performance at Duke; the stronger students made more durable friendships with other high-performing students, and weaker students made durable friendships with other weak-performing students. This was not in itself problematic; but since Duke used large racial preferences, the academic sorting produced racial sorting as well, so that by the senior year, Black students were socially segregated.
Though the authors did not examine the consequences of this academic-to-racial segregation, some plausible problems could easily follow. Blacks who in the first instance experience academic difficulty (not realizing how much the odds are stacked against them by large preferences), and in the second instance find that many of their Black friends are also struggling, and that friendships across racial lines become less common, could readily conclude that their college is systemically racist in some way. Whites, for their part, may well notice that Blacks they know, or observe in class, are disproportionately struggling, which could lead them to draw generalizations indistinguishable from racial stereotyping. In short, the use of large preferences can have precisely the opposite social and attitudinal effects that inspired affirmative action in the first place. It seems obvious that such counterproductive effects are much less likely to occur in the absence of large racial preferences.
VII. Conclusion: The Lessons of Prop 209 and Prop 16
As I stated at the outset, my goal in this article is not to judge affirmative action in medical school, but to identify some of the dynamics and plausible concerns that medical educators should understand, consider, and in a number of cases, investigate. What I know from the legal academy is that while it is easy to slip into a regime of racial preferences that soon develops significant downsides, it is hard to confront these downsides and even harder to reform them.
I think my findings do at least suggest the desirability of one of the national bodies of medical education (e.g., AAMC or NBME), or an informal association of several medical school deans, making an effort to generate a significant longitudinal database of medical school outcomes. This database would trace applicants through medical school, residency, and board examinations. Ideally, this consortium would also create a modest fund to encourage research with these data. This would have a salutary effect upon transparency within the field and would make it at least possible to have concrete conversations about programs and trade-offs. Such steps would be consistent with the scientific spirit that is a touchstone of medical education.
I close, as I began, with the example of California’s Prop 209 and Prop 16, which illustrate both the challenges and potential for reform. In 1996, when voters passed Prop 209, many parts of the University of California system had been using very large racial admissions preferences. Year after year, Berkeley and UCLA (in particular) admitted very similar numbers of Black and Hispanic freshmen. Numerical goals for the entering class were achieved, but the pool of competitive candidates was stagnant rather than dynamic, and the outcomes for admitted URMs were generally dismal.
Prop 209 forced UC administrators to reconceive what they were doing. By 1998, the university had undertaken a massive and more traditional affirmative action effort. It invested many tens of millions of dollars in outreach to students in poor-performing schools, making students more familiar with UC’s admissions requirements (i.e., the specific high school courses required for admission), tutoring promising students, and creating partnerships between individual campuses and high schools. UC also undertook a broader use of socioeconomic preferences in admissions, though these remained far more modest than the earlier racial preferences.
The cumulative effect of these reforms was rapid and dramatic. Applications from Black high school students in California, which had been flat from 1989 to 1997, tripled from 1997 to 2007. Hispanic applications rose even faster. Both groups were much better represented at UC, in both absolute and relative terms, by 2008 than at any time in the era of racial preferences. With URM students attending campuses where their credentials more closely matched their peers, mismatch effects declined sharply. Four-year graduation rates for URMs had doubled within a few years, and the number of URMs completing science degrees tripled. On nearly every metric, Black and Hispanic students at UC were better off in the race-neutral era. And UC achieved, in the process, an extraordinary level of socioeconomic diversity, with over a third of its students at many campuses receiving Pell Grants (which are roughly available to students whose parents are in the bottom half of the income distribution); no other top-ranked school has a Pell Grant rate higher than 22%.66
In this rosy picture, the UC system’s major and ironic failing was its unwillingness — perhaps even inability — to recognize and embrace the success of its own race-neutral affirmative action. Under pressure from Hispanic and Black state legislators and its own activist students to achieve fully proportional racial representation, UC administrators portrayed Prop 209 as a harmful impediment to their diversity efforts, and connived at changes to admissions processes that, often not very subtly, reintroduced racial preferences. When state legislators proposed Prop 16 (to repeal Prop 209), the UC Regents unanimously endorsed it.67
As someone who had done a lot of research on affirmative action in general, and on the UC experience in particular, I was regularly asked during the debate over Prop 16 to discuss the measure, debate university officials, and participate in town halls. I was struck by an extraordinary disconnect. Senior UC officials were disturbingly uninformed about how URM enrollment, graduation, grades, and STEM completion had risen after Prop 209; they seemed shocked by the statistics on UC’s own website. But regular voters got it; many of them had actually experienced the university’s outreach programs or had noticed the rise in graduation rates. Although Prop 16 was almost universally endorsed by establishment institutions, including the state’s major newspapers, and although the “yes” campaign had sixteen times the funding of the “no” campaign, it was emphatically rejected at the polls, losing by more than two million votes (42.8% to 57.2%). Strikingly, the vote was not particularly polarized across racial lines; opinion surveys indicated that a majority of Hispanics, and more than a third of Blacks, voted against Prop 16.
The Prop 209/Prop 16 saga offers three lessons. First, we should be mindful that racial preferences are not a substitute for other forms of affirmative action, and that we often ignore those other forms when the seemingly easy route of racial preferences is open to us. Second, a focus on building pipelines, reducing science mismatch, and improving outcomes can be — as the UC experience unequivocally demonstrates — far more effective and healthier than a reliance on admissions preferences. But third, a great many educational leaders have a strong attachment to racial preferences, and a remarkable tendency to ignore (and discourage examination of) underlying data, so much so that they often represent the biggest single obstacle in the path of reform and racial progress.
Note
The author has no conflicts of interest to disclose.
Acknowledgements
I would like to thank Matthew Butterick, a 2005 UCLA Law School graduate who, in the fall of 2004, did valuable work comparing the effects of affirmative action in different types of professional school; and Dr. Robert Sade, who provided valuable feedback on a draft. Any errors in the piece are my own.