The Age of Big Data holds the promise of great discoveries, but Bentley et al. make a strong case that we'll need a good map if we want to avoid aimless wandering, and they outline an impressive candidate: their map of collective behavior (henceforth “BOB”). Recently, in Psychological Review, I offered a similar “map” of social influence based on a family of logistic threshold models called BOP (“Balance of Pressures” or “Burden of Proof”; MacCoun 2012; also see Kerr & MacCoun 2012; MacCoun et al. 2008). In this brief commentary, I compare and contrast the BOB and BOP maps, highlighting important points of convergence and possible divergence.
Until the early 20th century, alternative world maps conflicted, often profoundly. But the comparison of the BOB and BOP maps may be more analogous to the creative tension between alternative “projections” for transforming the 3D surface onto a 2D plane. Snyder (1993, p. 1) notes that “literally an infinite number of map projections are possible … The designer of a map projection tries to minimize or eliminate some of the distortion, at the expense of more distortion of another type, preferably in a region of or off the map where distortion is less important.”
The BOP map
BOP (MacCoun 2012) is a family of logistic threshold models sharing a common set of parameters. For example, the bBOP model is:
$$p_\Delta = \frac{1}{1 + \exp[c(b - S/N)]}$$
where $p_\Delta$ is the probability of changing one's position or practice, S/N is the proportion of the population that holds the opposite position (the “sources”), b is a threshold parameter, and c is a “norm clarity” parameter, inversely related to the standard deviation of the threshold. In some contexts, the numerator can be replaced by m, a ceiling parameter, with some loss of parsimony. Using a dozen different datasets, in MacCoun (2012) I compare the fit of the BOP models to various competitor models in the social psychology and social diffusion literatures, and show that the models provide a unifying framework for integrating research in the conformity, deliberation, helping, and social imitation/diffusion paradigms (see Fig. 1).
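As a rough illustration of how these parameters interact, the following Python sketch evaluates a logistic threshold curve. It assumes the functional form p = m / (1 + exp[c(b − S/N)]), which is inferred from the parameter descriptions above; the function name `bbop` and the values b = 0.5 and c = 10 are hypothetical choices for illustration, not estimates from any dataset.

```python
import math


def bbop(s_over_n: float, b: float, c: float, m: float = 1.0) -> float:
    """Probability of changing position under an assumed bBOP-style curve.

    s_over_n: proportion of the population holding the opposite position (S/N)
    b: threshold; c: norm clarity; m: optional ceiling (1.0 means no ceiling)
    """
    z = c * (b - s_over_n)
    # Evaluate m / (1 + exp(z)) in a way that cannot overflow for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return m * e / (1.0 + e)
    return m / (1.0 + math.exp(z))


if __name__ == "__main__":
    # Illustrative values only: b = 0.5 and c = 10 are assumptions.
    for share in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"S/N = {share:.1f} -> p = {bbop(share, b=0.5, c=10.0):.3f}")
```

Under this form, the probability of change equals m/2 when S/N equals b, and it rises more steeply around b as c increases.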
Figure 1. Locating parameters from 12 social influence data sets in the BOP parameter space. □ = conformity paradigm; ○ = helping paradigm; ▵ = deliberation paradigm; ⋄ = social imitation paradigm; × = theoretical social decision scheme landmarks. Source: MacCoun (2012, Fig. 11).
Points of convergence
Though they are parameterized somewhat differently, BOB and BOP are each variations on standard discrete choice models in the Luce/McFadden tradition, providing foundations in psychophysics, psychometrics, and econometrics. BOB's intensity parameter ($b_t$, the “transparent–opaque” dimension) is similar to BOP's “clarity” dimension (c). Both are inversely related to the standard deviation of the choice arguments. And BOB's “strength of influence” parameter ($J_t$) appears to be very similar to BOP's “ceiling” parameter (m).
Reasoning from the BOB map, Bentley et al. suggest that “(t)here are numerous indications that online behavior may be getting more herdlike” (target article, sect. 1, para. 5). In MacCoun (2012) I use agent-based BOP modeling to offer a similar hypothesis, showing how herdlike behavior emerges as a result of an additional “vision” parameter representing the proportion of the total population whose views the agent is able to monitor. I speculate that web technologies combined with relentless polling are leading to dramatic expansions in vision.
Points of divergence
BOP decomposes the random utility model differently than BOB, creating a threshold parameter. This is the heart of the BOP model, because the threshold parameter allows one to assess asymmetric influence – the extent to which one side of an issue “holds the burden of social proof.” The threshold parameter allows BOP to capture the essence of both the Schelling (1969) tipping point model (when b = .50 and clarity is high), and the Granovetter (1978) distributed thresholds model (any b, when clarity is low). And it allows the BOP map to provide explicit theoretical landmarks – Proportionality, Simple Majority, 2/3 Majority, Truth Wins, Truth-Supported Wins – as a point of comparison for empirical estimates (see Fig. 1).
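The two limiting cases can be sketched numerically. The Python snippet below again assumes the logistic form p = 1 / (1 + exp[c(b − S/N)]) and compares a high-clarity curve with a low-clarity one at b = .50. The clarity values (c = 1000 and c = 5) are arbitrary illustrations, and `threshold_sd` is a hypothetical helper that relies on one reading of the model: treating the curve as the cumulative distribution of individual thresholds that follow a logistic distribution with mean b and scale 1/c, whose standard deviation is π/(c√3).

```python
import math


def bbop(s_over_n, b, c):
    """Assumed bBOP-style curve: 1 / (1 + exp[c(b - S/N)]), overflow-safe."""
    z = c * (b - s_over_n)
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def threshold_sd(c):
    """SD of a logistic threshold distribution with scale 1/c (an assumption)."""
    return math.pi / (math.sqrt(3.0) * c)


shares = [0.40, 0.49, 0.51, 0.60]

# High clarity (illustrative c = 1000): nearly everyone shares the threshold
# b = 0.5, so the curve behaves like a tipping point.
tipping = [bbop(s, b=0.5, c=1000.0) for s in shares]

# Low clarity (illustrative c = 5): thresholds are widely spread, so the
# response to the opposing share is gradual.
distributed = [bbop(s, b=0.5, c=5.0) for s in shares]

print("tipping-like:", [round(p, 3) for p in tipping])
print("distributed :", [round(p, 3) for p in distributed])
print("threshold SD at c = 1000:", threshold_sd(1000.0))
print("threshold SD at c = 5   :", threshold_sd(5.0))
```

Moving b away from .50 shifts either curve left or right, which is how the threshold parameter expresses an asymmetric burden of social proof in this sketch.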
Figure 2, from ongoing meta-analytic work, provides an illustration. (See supplementary appendix for data sources.) Plotted using BOP's coordinates, studies of intellective tasks (for which there is a demonstrably correct answer relative to some conceptual scheme) and studies of criminal jury deliberations appear to form two distinct “continents.” Both are a bit “north” and well “east” of the “proportionality” landmark. Both continents have asymmetrical thresholds, in which one faction bears a larger “burden of social proof.” The jury data has more asymmetry than one would expect by simple majority influence, apparently due to the reasonable doubt standard (see Kerr & MacCoun 2012). The intellective task data has greater asymmetry yet falls short of the “truth wins” landmark where a group will solve a problem if at least one member proposes the solution.
Figure 2. Plotted BOP parameters for nine criminal jury studies (▲) and seven intellective-task studies (△). Theoretical landmarks: PR = Proportionality decision scheme, MW = Majority Wins decision scheme, TW = Truth Wins decision scheme.
The BOP parameter space could also include the ceiling parameter (akin to BOB's “strength of influence” parameter). But only three of the several dozen datasets I've fit so far required such a parameter. Why? Partly because of sample selection, but it may be that even trivial individual decisions are susceptible to some social influence.
Attempts to fit BOP's clarity parameter imply that the scaling of BOB's $b_t$ from 0 to positive infinity may be misleading; in practice, the parameter has little detectable qualitative effect above about $\log_{10}(c) = 2.5$. This seems reassuring; no one wants to embark on a journey carrying an infinitely large map. Still, Bentley et al. offer intriguing arguments that qualitatively different models may be required in distinct regions of parameter space.
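One way to see why clarity might saturate is to measure how far a logistic threshold curve sits from its step-function limit as c grows. The Python sketch below is only an illustration under stated assumptions, not the fitting procedure behind the estimate above: it assumes the form p = 1 / (1 + exp[c(b − S/N)]), a threshold of b = 0.5, and observed proportions on a grid with steps of 0.05. The helper name `max_gap_from_step` is hypothetical, and the point at which the gap becomes negligible depends on the grid resolution chosen.

```python
import math


def bbop(s_over_n, b, c):
    """Assumed bBOP-style curve: 1 / (1 + exp[c(b - S/N)]), overflow-safe."""
    z = c * (b - s_over_n)
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def max_gap_from_step(log10_c, b=0.5, steps=20):
    """Largest distance between the curve and a step function at b.

    Proportions are evaluated on a grid 0, 1/steps, ..., 1 (an assumption
    standing in for the resolution of real data).
    """
    c = 10.0 ** log10_c
    worst = 0.0
    for i in range(steps + 1):
        s = i / steps
        if abs(s - b) < 1e-12:
            continue  # at S/N == b the curve equals 0.5 for every c
        p = bbop(s, b, c)
        target = 1.0 if s > b else 0.0
        worst = max(worst, abs(p - target))
    return worst


for log10_c in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"log10(c) = {log10_c:.1f}: max gap = {max_gap_from_step(log10_c):.2e}")
```

On this grid the gap shrinks steadily with c and is already tiny by log10(c) = 2.5, so further increases in clarity change the predicted probabilities very little.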
Conclusions
According to Monmonier (1996, p. 1), “not only is it easy to lie with maps, it's essential. To portray meaningful relationships … a map must distort reality.” In time, we will learn more about any distortions created by the BOB and BOP maps. Most but not all of the plotted data points in Figures 1 and 2 come from controlled experiments. But as we journey out into the deeper waters of big data, our parameter estimates will be increasingly susceptible to bias due to spurious correlations and causal endogeneity (see MacCoun et al. 2008). So our explorations promise new discoveries, but for now we might annotate our maps with the ancient warning: “Here be dragons.”
SUPPLEMENTARY MATERIALS
To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S0140525X13001787.