We agree with Yarkoni's thesis that there is a “generalizability crisis” and that the mapping between verbal theoretical constructs and measures and models is the source of many difficulties. In particular, the limited variation in procedures, stimuli, contexts, and measures represents a significant challenge to generalizability. Yarkoni summarizes these concerns by suggesting that “a huge proportion of the quantitative inferences drawn in the published psychology literature are so weak as to be at best questionable and at worst utterly nonsensical.”
Although Yarkoni's arguments are compelling, we don't fully agree with the somewhat gloomy picture he paints. The generalizability crisis creates something of a paradox: If generalization claims are on such shaky grounds, why is it that many phenomena are so robust that they make for reliable classroom demonstrations and/or have been shown to have substantial practical significance?
With respect to the former, examples include a number of judgment and decision biases identified and analyzed by Kahneman, Tversky, Fischhoff, Slovic, Loewenstein, Weber, and others (e.g., availability heuristic, loss aversion, framing effects, quantity insensitivity). With respect to the latter, Cialdini (2009a, 2009b) has demonstrated simple but effective manipulations that increase environmentally friendly behaviors (e.g., hotel guests reusing towels). Similarly, implementing changes to default assumptions (Thaler & Sunstein, 2008) has been shown to facilitate policy goals such as increasing organ donation.
Field versus lab
We suggest that attention to the field is a critical factor supporting both relevance and generalizability. Those involved in lab research usually aim to demonstrate the presence of a particular effect, and tend to be motivated to create a specific environment or context to observe it. Lab researchers have an unlimited number of levers to establish conditions which will maximize the chances for observing desired effects. Rigorous control procedures can be implemented that are not feasible outside the lab. But this precise control may be exactly what limits generalizability.
Field researchers face the opposite problem. They typically work in environments which can be changed very little, and with populations they rarely can preselect. Field/applied researchers are routinely motivated to search for effects and manipulations which are robust enough to work in their specific context. Field research may operate as a “generalizability filter” separating tenuous effects from interventions with a higher chance for success.
Judgment and decision-making research may have benefited from the fact that much of it has been done in business schools. Business school faculty rarely have access to a “subject pool” and they tend to rely on both studies in classrooms and in the field. The participants in business school studies often are students who have experience in the business world and are seeking MBAs (or PhDs). This is just one factor that serves to increase the likelihood that research by business school faculty will make connections with corporate contexts.
Consider, for example, “sunk cost” effects. Sunk cost effects refer to situations where commitment of resources is continued and escalated beyond any rational considerations because one doesn't want to “waste” the prior investment. This is sometimes referred to as “throwing good money after bad.” The interest in sunk cost effects originated with real-world examples. But a careful analysis of generalizability suggests that there are other situations where the opposite of sunk cost effects can be shown (prematurely withdrawing an investment just before it starts to pay off; e.g., Drummond, 2014; Heath, 1995). Instead of undermining the sunk cost construct, such findings invite attention to what factors are associated with each type of outcome. For instance, sunk cost effects for money may be different from sunk cost effects for time (Cunha & Caldieraro, 2009; Soman, 2001).
Field research may also serve as a direct test of the generalizability of lab findings. For example, Hofmann, Wisneski, Brant, and Skitka (2014) used text messaging at varied times to assess everyday moral and immoral acts and experiences. They found moral experiences to be common, and they observed both moral licensing and moral contagion, effects that previously had been shown in lab studies.
This interplay between lab and field is useful to both. Although generalizability is important, it could be argued that variability is even more fundamental. At the heart of social science is the search for patterned variation, variation that our theories seek to understand. Attention to the field may serve to increase attention to potential interactions and undermine a main effect focus.
Field as a source of complementary evidence
As Yarkoni notes, conceptual replications (as opposed to exact replications) put assumptions of generalizability to the test and represent an effective research strategy. They also are a key tool in establishing construct validity (e.g., Grahek, Schaller, & Tackett, 2021), linking theory and measures.
Field observation offers a complementary form of converging measure that can be an important research tool. For example, lab studies suggesting that participants see nature as incompatible with human presence (nature is pristine and humans can enjoy it but are not part of it) can be complemented by analyses using Google images. In one such analysis, a search of images for “ecosystems” found that humans were present only two percent of the time, and in about half of those cases humans were outside the system looking in (Medin & Bang, 2014). Similarly, experimental observations suggesting cultural differences in subjective proximity to nature (Bang, Medin, & Atran, 2007) may be complemented by corresponding differences in illustrations in children's books (Bang et al., in press).
An additional benefit of complementary field observations is that they facilitate analyzing changes over time (Iliev & Ojalehto, 2015). For example, claims about increasing cultural individualism may be paralleled by corresponding changes in cultural artifacts (Greenfield, 2013, 2017). In short, field observations invite complementary and coordinated observations both as a stimulus for new studies and as a guide to robustness of findings.
Financial support
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Conflict of interest
None.