We believe that Firestone & Scholl's (F&S's) target article represents a commendable standard for evaluating top-down effects, and we agree with many of their recommendations. We also agree that it is possible to overstate the power of fleeting abstractions to impact our immediate impression of the world, and we are skeptical of the view that perception is so saturated with belief that we can see whatever we wish. However, we also know from many decades of research that perception integrates sensory input with reliable world-knowledge. To deny such evidence would be to deny that humans are flexible learners. This is where we probably diverge from F&S: We do not think that evidence for top-down processes represents a surprising or dramatic departure from established theory, nor that the top-down hypothesis requires a higher standard of proof than any other hypothesis about the functioning of the mind in physical and social space.
Here, we focus on the impact of race on the perception of the lightness of faces. Levin and Banaji (2006) demonstrated that participants seem to perceive Black faces to be darker than White faces. This effect was present both when participants adjusted samples to match unambiguous Black and White faces, and when one group of participants judged an ambiguous face that they were told was Black while another group saw the same face, this time believing that it was White. In the target article, F&S review previous experiments (Firestone & Scholl 2015a) that focus on the former effect with unambiguous faces, arguing that it was a qualitatively more effective demonstration of a top-down effect. However, they also argued that this finding suffered from a potential stimulus confound. F&S tested this confound using blurred faces (the left half of Fig. 1) to measure whether participants who were nominally unable to identify the faces by race still showed the lightness illusion (e.g., participants who indicated that the faces were of the same race still judged the White face to be lighter).
Figure 1. Illustration of accuracy on forced-choice identification of race for blurred stimuli employed by Firestone and Scholl (2015a). Firestone and Scholl argued that participants who could not identify the race of the stimuli on the left nonetheless showed a brightness effect. However, we demonstrated that participants were able to accurately judge the race of the faces both for the original blurs and in luminance-inverted versions of the faces (Baker and Levin 2016).
In our response (Baker and Levin 2016), we noted that F&S used a forced-choice question to obtain judgments about which face was darker, but used a nonforced-choice question to assess detection of race (participants selected from a menu of possible races independently for each face). What if the forced-choice lightness question was more sensitive to lightness than the race-detection question was to race? F&S may have underestimated participants' ability to detect race in the blurred faces, and may therefore have falsely classified some participants as unable to detect race. This seems particularly plausible given that the classification was based on one or two judgments about subtle, near-threshold information. Indeed, when we included a forced-choice question that directly asked participants to choose which face was White and which was Black, we repeatedly observed that 75%–80% of participants correctly assigned race. It is important to note that participants were just as successful in detecting the race of the faces when the faces were contrast-inverted (Fig. 1), so it seems unlikely that they detected the race of the faces by noting the brightness confound and guessing that the lighter face was White.
In addition to evidence that the blurring left some race-specifying information in the images, Baker and Levin found that participants who correctly distinguished the race of the noninverted faces were also more likely to judge that the Black face was darker. This result supports the hypothesis that there is a relationship between participants' ability to perceive the race of the faces and how light each face appears to them.
Space constraints prevent us from reviewing all of F&S's critique and all of the logic underlying our response, but the complexity of the issue leads us to a key point: The original Levin and Banaji report foresaw the difficulty in fully eliminating confounds inherent to two different stimuli, and so it included the abovementioned ambiguous face experiment, along with an experiment in which RT on same–different judgments was slowed when a relatively lightened Black face was compared with a White face (thus equalizing their apparent lightnesses). These additional experiments cast serious doubt on F&S's conclusion “that the initial demonstration of Levin and Banaji (2006) provides no evidence for a top-down effect on perception” (sect. 4.4.1, para. 5; emphasis added). The casual reader might be forgiven for assuming that Levin and Banaji's entire study can be dismissed, unless that reader realizes that the word “initial” means that only one of several experiments is at issue and reads the footnote describing one of these other experiments. We think that this quote reveals a fundamental problem with F&S's approach. The categorical conclusion implies that experiments must either provide unambiguous proof of top-down effects by avoiding all of the pitfalls they describe, or the work falls to zero weight in tipping the scale to the top-down side of a debate that is complex enough to have been raging for a long time.
We prefer a more nuanced approach to advancing research on this topic for several reasons. First, there are many different kinds of top-down effects, some in which momentary thoughts influence how things look, and some more subtle effects where a more-sophisticated perceptual process influences a less-sophisticated one, perhaps as the result of long-term experience. This is especially evident in the social domain, where category-informed reactions to skin color can clearly be consequential. Of course, researchers' specific interests might lead them to isolate the truly perceptual sources of judgments about experience, but at some point it becomes an exercise in purity that provides license to focus exclusively on relatively artificial stimuli and tasks designed a priori to reveal phenomena that will confirm evidence of bottom-up processing. In all cases rigor is crucial, and F&S provide some good recommendations for achieving it. But rigor should not be an excuse to ignore the study of important phenomena. We believe that discovery is best served by exploring the full richness of human perceptual capacities that may or may not reveal cognitive penetration, rather than dwelling exclusively on simpler perceptual processes from a penchant for tidiness.