Experiments are seen as a gold standard because randomized assignment offers unparalleled internal validity. An important and less studied concern is ecological validity, or the “realism” of the experiment. Ecological validity is vital because studies that are too artificial may not speak to any real-world political phenomenon. In particular, the recognition that in the real world individuals self-select their own “treatments” raises a concern that randomized exposure may not constitute a realistic exploration of political communication (Arceneaux and Johnson Reference Arceneaux and Johnson2012; Druckman et al. Reference Druckman, Fein and Leeper2012; Gaines and Kuklinski Reference Gaines and Kuklinski2011a, 2011b; Lau and Redlawsk Reference Lau and Redlawsk2006). Given this self-selection, what can we learn from a randomized experiment about the effect of a message on those individuals who would—if given the choice—opt into or out of receiving it? To answer this, I use a population-based survey experiment to test how individuals respond to policy arguments that are either randomly assigned or self-selected by participants. The findings show that treatment randomization masks effect heterogeneity across individuals inclined to select alternative messages, with the results hinging on how issue importance drives different subsets of respondents to self-select each message.
EFFECTS OF SELF-SELECTED POLITICAL COMMUNICATIONS
Arguments conveyed by media and political elites are widely seen as an important source of information that citizens can use when forming preferences (Chong and Druckman Reference Chong and Druckman2007; Disch Reference Disch2011). Evidence of this influence comes largely from experiments that expose participants to different messages and measure effects on argument evaluations, opinions, issue importance, information-seeking, and related outcomes (see Ansolabehere et al. Reference Ansolabehere, Iyengar, Simon and Valentino1994; Arceneaux and Johnson Reference Arceneaux and Johnson2012; Berinsky and Kinder Reference Berinsky and Kinder2006; Brewer and Gross Reference Brewer and Gross2005; Iyengar and Kinder Reference Iyengar and Kinder1987; Miller and Krosnick Reference Miller and Krosnick2000; Nelson et al. Reference Nelson, Clawson and Oxley1997; Petty and Cacioppo Reference Petty and Cacioppo1986). Questions have been raised, however, about these kinds of studies given that, in Hovland’s words: “In an experiment the audience on whom the effects are being evaluated is one which is fully exposed to the communication. On the other hand, in naturalistic situations with which surveys are typically concerned, the outstanding phenomenon is the limitation of the audience to those who expose themselves to the communication” (Bennett and Iyengar Reference Bennett and Iyengar2008, 724; Hovland Reference Hovland1959, 9). A randomized experiment cannot identify the effect of a message for those who chose to view it or the effect for those who chose not to view it. Instead, it can identify only the sample average treatment effect (SATE), which averages the effects for two subgroups: those who would choose to be treated and those who would choose to be untreated.
Hovland encouraged researchers to focus on these separate effects for “those who expose themselves to the communication” and those who do not. Why would we care about the treatment effects for these groups? Consider the classic experiment by Nelson et al. (Reference Nelson, Clawson and Oxley1997) in which participants were assigned to watch a story framing a rally in terms of either free speech or public safety. While we may care about the SATE in this case, our interest in understanding citizens’ interactions with media suggests that we also want to know how that message affects different segments of the public. How are those who opt in to free-speech-framed news affected by it? How are those who would rather opt in to public-order news affected by free-speech news? The answers to these questions matter because much exposure is selective in this way. Indeed, it is widely known that citizens selectively expose themselves to information (Bennett and Iyengar Reference Bennett and Iyengar2008; Bolsen and Leeper Reference Bolsen and Leeper2013; Feldman et al. Reference Feldman, Maibach, Roser-Renouf and Leiserowitz2011; Garrett Reference Garrett2009a, Reference Garrettb; Garrett et al. Reference Garrett, Carnahan and Lynch2013; Iyengar and Hahn Reference Iyengar and Hahn2009; Iyengar et al. Reference Iyengar, Hahn, Krosnick and Walker2008; Kim Reference Kim2007; Smith et al. Reference Smith, Fabrigar and Norris2008; Stroud Reference Stroud2011), and meta-analysis suggests that political selective exposure is especially potent (Hart et al. Reference Hart, Albarracín, Eagly, Brechan, Lindberg and Merrill2009). While individuals select messages based on prior attitudes, they also appear to engage in selective exposure according to ideology, habit, and topical interests (Baum Reference Baum2002; Bennett and Iyengar Reference Bennett and Iyengar2008; Iyengar and Hahn Reference Iyengar and Hahn2009; Prior Reference Prior2007).
Given this selectivity, there is value in knowing how individuals are affected by treatments they actually encounter in the real world.
In what ways might the effect of a communication differ across individuals inclined and disinclined to select it? There are two possibilities. One is homogeneous effects: the effects of exposure to a communication are the same for those who would choose the message as for those who would not. In the case of such homogeneity, the SATE averages the effects for both groups and there is no limitation to what can be learned from a randomized experiment. Such homogeneity occurs, for example, if messages are uniformly influential or if a message has no effect on anyone.
The other possibility is heterogeneous effects: the effects of exposure to a message are different for those who would choose and not choose the message. Such effect heterogeneity might occur because of motivated reasoning (Kunda Reference Kunda1990; Taber and Lodge Reference Taber and Lodge2006), wherein individuals tend to select attitude-reinforcing messages and avoid attitude-incongruent arguments, as well as see attitude-congruent arguments as stronger and more effective than incongruent arguments (Ditto et al. Reference Ditto, Scepansky, Munro, Apanovitch and Lockhart1998). As such, the effects of communications for message selectors and non-selectors on numerous outcomes—including argument evaluations, opinions, and willingness to acquire more issue-relevant information — are likely to point in opposite directions.
Given the prevalence of motivated thinking, it is reasonable to expect effect heterogeneity rather than homogeneity. To foreshadow and thereby situate that general expectation in the specific design used here, exposure to a message supportive of a policy should be expected to increase support for that policy. Yet, that increase in support—and attendant positive message evaluations and interest in the issue—should apply only to those inclined to receive the message and not to those disinclined to choose the message. Further, because issue importance is one mechanism thought to affect the degree of selective exposure (with high importance linked to greater attitude-congruent message exposure; Holbrook et al. Reference Holbrook, Berent, Krosnick, Visser and Boninger2005; Leeper Reference Leeper2014; Taber and Lodge Reference Taber and Lodge2006), higher importance should exacerbate this heterogeneity and lower importance should mitigate it (given the greater similarity between the audiences for different messages).
EXPERIMENTAL DESIGN
I use a design combining randomized exposure and message self-selection to test for this expected pattern of effect heterogeneity. One-half of participants are assigned to a randomized experiment and the other half participate in an observational study involving treatment self-selection (see Gaines and Kuklinski Reference Gaines and Kuklinski2011a). Expanding on past work, I also employ three additional features: (1) a pre-treatment manipulation of issue importance to modify the degree of attitude-congruent selective exposure, (2) a two-wave panel design, and (3) a population-based sample of participants. The issue under investigation is so-called “renewable energy portfolio” standards, which require electrical utilities to produce energy from renewable resources.Footnote 1 The first wave of the study was conducted in Summer 2010 (hereafter, t1) to measure demographics and baseline attitudes. The experimental wave was collected in Spring 2011 (hereafter, t2). The two-wave design is advantageous because it provides a clean measure of t1 opinion, avoids introducing accessibility or consistency biases into respondents’ behavior during the t2 experiment, and enables estimation of opinion effects using within-subjects, pre-/post-treatment changes.
Respondents
Data were collected by Bovitz Research Group of Encino, CA, which provides an online panel of approximately one million respondents recruited through random digit dialing and empanelment of those with internet access. As with most internet survey samples, respondents participate in multiple surveys over time and receive compensation for their participation. A total of 885 respondents completed the first wave and 879 completed both waves; analysis is restricted to these 879 respondents, and the minimal attrition suggests little cause for concern. The sample was drawn to represent the U.S. adult population and data are analyzed without weighting. Respondents had a median age of 49; 49.0% were female and 75.0% were white; 98.9% had at least a high-school degree and 80.9% had university degrees. The partisan composition was 39.2% Democrats and 30.7% Republicans.Footnote 2
Manipulations
The t2 experiment involved three manipulations: issue importance, the direction of the policy argument, and whether that argument was self-selected or randomly assigned. Figure 1 provides a visual summary of the design, with treatment group sizes and notation used in defining causal effects.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20171026065818-89148-mediumThumb-S205226301700001X_fig1g.jpg?pub-status=live)
Figure 1 Experimental Design and Treatment Group Sample Sizes. Differences in Sample Sizes Within the Captive Conditions Reflect Random Assignment. Differences in Sample Sizes Between “Chose Pro” and “Chose Con” Conditions Reflect Treatment Self-Selection. Choice Conditions Were Intentionally Oversampled. The $\bar{Y}$ Values at Right Clarify Which Groups Are Used for Estimating Treatment Effects.
The first manipulation modified the personal impact of the energy proposal and serves as an instrument for respondents’ information choices. The logic of the manipulation was that individuals who believe their self-interest is at stake would attach higher importance to the issue and be more likely to choose an opinion-congruent message. Importance was manipulated to be high by telling respondents:
A new law is currently moving through Congress that would require your electricity provider to purchase energy from renewable sources (e.g. wind and solar). This is relevant to you since it will influence your energy bills and the environment. The law would go into effect immediately.
Those in the low-importance condition read:
Some have proposed a bill that would require electricity providers to purchase energy from renewable sources (e.g. wind and solar). This is probably not directly relevant to you because Congress does not appear to be ready to act on the bill and even if they did it is unlikely to personally affect you.
A manipulation check asked respondents “How important to you personally is your opinion about this renewable energy restriction?” and the results confirm that importance was manipulated. On a 0–1 scale, those in high-importance conditions averaged 0.77 and those in low-importance conditions averaged 0.69, a statistically significant difference (p = 0.00).Footnote 3
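This manipulation check is a simple two-group comparison of means. As an illustrative sketch only (the data below are simulated to approximate the reported group means; the sample sizes and spread are my assumptions, not the study's data), the comparison can be computed as a Welch t-statistic:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated 0-1 importance ratings whose group means approximate the
# reported 0.77 (high-importance) and 0.69 (low-importance) averages.
# Group sizes and standard deviation are assumed for illustration.
high = np.clip(rng.normal(0.77, 0.20, 440), 0, 1)
low = np.clip(rng.normal(0.69, 0.20, 439), 0, 1)

# Welch's t-statistic computed directly from the group summaries.
se = np.sqrt(high.var(ddof=1) / len(high) + low.var(ddof=1) / len(low))
t_stat = (high.mean() - low.mean()) / se
print(f"difference = {high.mean() - low.mean():.3f}, t = {t_stat:.2f}")
```

With group sizes in the hundreds, a difference of this magnitude comfortably exceeds conventional significance thresholds.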
The second manipulation presented participants with either a message supportive (Pro) or opposed (Con) to the policy. The Pro message was entitled “Renewable Energy Rules Beneficial” and the Con message was entitled “Renewable Energy Rules Ineffective.”Footnote 4 An effort was made to ensure that the informational content of the Pro and Con messages was near-identical. The difference between the two treatments is in the language chosen to describe the same basic facts.Footnote 5
The final manipulation involved how the informational treatments were assigned. One-third of the respondents were randomly assigned to read either the Pro or Con argument. The other two-thirds were presented with the headlines for each passage and told to choose one to read, which they were then given.Footnote 6
Measures
Outcome measures included evaluations of the information received, attitude toward the policy, subjective intentions to obtain further information, and a behavioral measure directly tapping willingness to receive additional information in the form of an email message. All variables are coded to range from 0 to 1, with higher scores indicating more positive evaluations, higher policy support, or greater intention to seek information. The opinion question read, “Thinking about energy related restrictions, to what extent do you oppose or support requiring electricity providers to purchase energy generated from renewable sources (e.g., wind, solar)?” and solicited responses on a seven-point scale from “strongly opposed” to “strongly support.” As already mentioned, this item was also measured on the t1 survey, enabling estimation of treatment effects for opinion using both post-treatment and pre-/post-changes. The argument evaluation question read, “How effective would you say the information you read was in making an argument about this energy-related restriction?” The subjective information-seeking question asked “How likely are you to seek more information about renewable energy requirements?” The behavioral measure asked “Can we send you an email with more information about renewable energy requirements?” Responses to the latter measure were coded as 1 if the respondent entered their email address and 0 otherwise. The two information-seeking measures correlate to some extent (r = 0.44).Footnote 7
Estimation Strategy
Following from Gaines and Kuklinski (Reference Gaines and Kuklinski2011a), I estimate three different effects for each outcome variable. The first is the familiar SATE:
$$\text{SATE} = \bar{Y}_{\text{Pro}} - \bar{Y}_{\text{Con}}, \tag{1}$$

where $\bar{Y}_{\text{Pro}}$ is the mean outcome value among those captively assigned to the Pro message and $\bar{Y}_{\text{Con}}$ is the mean outcome value among those captively assigned to the Con message.
The SATE is a weighted average of effects of the Pro (versus Con) message for different observable subsamples: one for those who would choose the Pro message and one for those who would choose the Con message if given the choice.Footnote 8 These effects are identified by the present design if we are willing to assume (1) that, given random assignment, the choice behavior of those in the choice conditions is on average identical to the unobserved choice behavior of those in the randomized conditions, and (2) the equivalence of potential outcomes for a randomly assigned message versus the same self-selected message (i.e., that there is no effect of the assignment mechanism; an exclusion restriction). Under these assumptions, we can identify two further effects. One is the effect of the treatment on those who would choose it (the Pro-Selector Effect, or PSE):
$$\text{PSE} = \frac{\bar{Y}_{\text{Choice}} - \bar{Y}_{\text{Con}}}{\hat{\alpha}}, \tag{2}$$

where $\bar{Y}_{\text{Choice}}$ is the mean outcome value among all respondents assigned to the “choice” condition (see Figure 1) and $\hat{\alpha}$ is the proportion of these individuals choosing the Pro message.
The final effect captures the difference between receiving the Pro and Con messages among those who would not choose the Pro message (i.e., the Con-Selector Effect, or CSE):
$$\text{CSE} = \frac{\bar{Y}_{\text{Pro}} - \bar{Y}_{\text{Choice}}}{1 - \hat{\alpha}}. \tag{3}$$
In other words, the PSE represents the effect of the Pro message treatment versus the Con message for those who would opt for the Pro message if given the opportunity and the CSE represents the effect of the Pro message for those who would opt for the Con message if given the opportunity. Each represents the average treatment effect for distinct subsets of the population. Effect homogeneity occurs when the PSE and CSE are identical, and therefore match the SATE. Heterogeneity occurs when these quantities diverge, making the SATE reflective of only one or neither of the underlying subgroup effects.
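To make these estimators concrete, the following is a minimal Python sketch of the Gaines and Kuklinski-style calculation from condition-level means. The simulated data, effect sizes, and function names are my own illustrative assumptions, not the study's data; the simulation builds in heterogeneity (the Pro message moves Pro-selectors but not Con-selectors) so the three quantities diverge by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_effects(y_pro, y_con, y_choice, alpha_hat):
    """SATE, PSE, and CSE from condition-level outcomes.

    y_pro / y_con: outcomes under captive (random) assignment to Pro / Con.
    y_choice: outcomes for everyone in the self-selection ("choice") condition.
    alpha_hat: share of the choice condition selecting the Pro message.
    """
    sate = y_pro.mean() - y_con.mean()
    pse = (y_choice.mean() - y_con.mean()) / alpha_hat
    cse = (y_pro.mean() - y_choice.mean()) / (1 - alpha_hat)
    return sate, pse, cse

# Simulate heterogeneous effects: the Pro message raises support by 0.2
# among Pro-selectors and by 0.0 among Con-selectors (assumed numbers).
n = 30000
pro_type = rng.random(n) < 0.7        # would choose Pro if given the choice
arm = rng.integers(3, size=n)         # 0: captive Pro, 1: captive Con, 2: choice
base = rng.normal(0.5, 0.05, n)
gets_pro = (arm == 0) | ((arm == 2) & pro_type)
y = base + np.where(gets_pro & pro_type, 0.2, 0.0)

alpha_hat = pro_type[arm == 2].mean()
sate, pse, cse = estimate_effects(y[arm == 0], y[arm == 1], y[arm == 2], alpha_hat)
```

By construction, the recovered SATE (about 0.14 here) equals the alpha-weighted average of the PSE (about 0.2) and the CSE (about 0), illustrating how a single average can mask fully one-sided effects.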
RESULTS
I begin by examining the information choices made by respondents in the choice conditions. Consistent with expectations, issue importance increased the degree of opinion-congruent selective exposure: 88% of high-importance respondents chose information congruent with their t1 opinion, while only 63% of low-importance respondents chose in this way (a statistically significant difference).Footnote 9 This means that respondents are not fixed types who would always select in the same way (an assumption in past work). As such, it is reasonable to expect that if there are heterogeneous effects of exposure to the Pro message, the pattern of that heterogeneity is likely to be most clear in the high-importance groups where Pro selectors and Con selectors differ from each other most dramatically.
Table 1 presents the main results separately for the full sample in panel (a), the low-importance condition in panel (b), and the high-importance condition in panel (c). Looking at panel (a), we see SATEs estimated from the captive conditions for each of the five outcome measures (argument evaluation, opinion level, opinion change, planned information-seeking, and requests for an email with issue-relevant information), alongside the corresponding PSE and CSE estimates. In substantive terms, these five SATEs indicate that (1) the Pro argument is seen as more effective than the Con argument, which makes sense given the supportive leanings of the sample, (2) exposure to the Pro message increases t2 policy support, (3) this increase in support holds when measured by t2 − t1 opinion changes, (4) the Pro message insignificantly reduces plans to seek out information, and (5) the Pro message reduces the likelihood of requesting an informational email.
Table 1 Treatment Effect Estimates, by Importance Condition
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20171026065818-95573-mediumThumb-S205226301700001X_tab1.jpg?pub-status=live)
Note: Cell entries are estimated treatment effects, with bootstrapped standard errors in parentheses. The three estimated values of $\hat{\alpha}$ used in estimating the PSEs and CSEs are 0.69 (full sample), 0.63 (low importance), and 0.75 (high importance).
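The table note reports bootstrapped standard errors. One common implementation, resampling respondents within each experimental condition and re-estimating $\hat{\alpha}$ in every draw so that its sampling variability propagates into the PSE and CSE, is sketched below. The resampling scheme and the illustrative data are my assumptions, not necessarily the paper's exact procedure.

```python
import numpy as np

def bootstrap_ses(y_pro, y_con, y_choice, chose_pro, n_boot=2000, seed=1):
    """Bootstrap SEs for (SATE, PSE, CSE), resampling within each condition.

    chose_pro: 0/1 indicator of choosing the Pro message, aligned with
    y_choice, so that alpha-hat is re-estimated in every bootstrap draw.
    """
    rng = np.random.default_rng(seed)
    draws = np.empty((n_boot, 3))
    for b in range(n_boot):
        bp = rng.choice(y_pro, len(y_pro), replace=True)       # captive Pro
        bc = rng.choice(y_con, len(y_con), replace=True)       # captive Con
        idx = rng.integers(len(y_choice), size=len(y_choice))  # choice condition
        ych, a = y_choice[idx], chose_pro[idx].mean()
        draws[b] = [bp.mean() - bc.mean(),                     # SATE
                    (ych.mean() - bc.mean()) / a,              # PSE
                    (bp.mean() - ych.mean()) / (1 - a)]        # CSE
    return draws.std(axis=0, ddof=1)

# Illustrative (simulated) data, not the study's:
rng = np.random.default_rng(2)
y_pro = rng.normal(0.6, 0.1, 150)
y_con = rng.normal(0.5, 0.1, 150)
chose_pro = rng.random(300) < 0.7
y_choice = rng.normal(0.5, 0.1, 300) + 0.1 * chose_pro
se_sate, se_pse, se_cse = bootstrap_ses(y_pro, y_con, y_choice, chose_pro)
```

Because the PSE and CSE divide by $\hat{\alpha}$ and $1 - \hat{\alpha}$ respectively, their standard errors are typically larger than the SATE's, which is consistent with the wider intervals for subgroup effects in Table 1.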
The PSEs paint a largely similar story, with differences only in effect magnitude. The story is different for CSEs: while the Pro message still increases their policy support over time, there are no other substantively sizable effects and none of the CSEs are statistically distinguishable from zero. In other words, there appears to be effect heterogeneity: those inclined to select the Pro message are affected in various ways, while others are unaffected.
Panel (b) displays results for the low-importance condition. Recall that low importance diminished attitude-congruent message choice, such that the groups represented by the PSE and CSE are more similar to one another here than in the high-importance condition. The SATEs in this condition are very similar to those for the sample as a whole: the Pro message is seen as more effective, it increases policy support (measured as t2 − t1), and it decreases requests for the email, while there is no effect on information seeking. The PSEs (column 2) are consistent with the SATEs in direction but only the effect on email requests is statistically distinguishable from zero. The CSEs (column 3) are also consistent with the SATEs in direction, with the exception of the flipped sign on information seeking, but only the effect on opinion change is statistically significant. These results point to a pattern of effect homogeneity, wherein the SATEs provide inferences that apply equally well to those preferring and not preferring the treatment.
Finally, panel (c) shows results for the high-importance condition, where respondents were much more likely to self-select a message congruent with their t1 opinion. The consequence of this for inferences about the effects of the treatment messages should be immediately clear: the SATEs in this condition mirror those for the low-importance condition (except for email requests where there is clearly no effect) and for the sample as a whole, yet the PSEs and CSEs differ considerably.
There is a very large positive PSE on argument evaluation (meaning the Pro message was seen much more favorably than the Con message), and opinions among Pro-selectors were moved to be nearly 20% more supportive. The PSEs for the two information-seeking measures were negative and not distinguishable from zero. The CSEs (column 3) show something quite different. The effects on argument evaluation and opinion, while not statistically distinguishable from zero given the large standard errors, would imply substantively negative effects. Similarly, the effect on email requests is difficult to distinguish from statistical noise, but points in a substantively positive direction.
These results are striking. As expected, the effects vary widely between the high- and low-importance conditions and there is a clear pattern of effect heterogeneity between PSEs and CSEs in the high-importance conditions. If one were to use these data to make inferences about the effects of exposure to a political argument, those inferences could differ substantially depending on the specific effect estimate chosen for each subset of the sample. We could infer that the effect on opinions of receiving the Pro rather than the Con message was any of the following:
- Increased policy support (full sample, high-importance, and low-importance SATEs).
- Increased support only among those inclined to receive the message (full sample or high-importance PSEs).
- No effect on those inclined to receive it (low-importance PSE).
- No effect on those disinclined to receive it (full sample and high-importance CSEs).
- Possible backfire effect on those disinclined to receive it (high-importance CSE).
If we were interested in the hypothetical, universal application of the Pro rather than Con message, the SATE would tell us that such an intervention would increase policy support. If we were instead interested in potentially distinct PSEs and CSEs, our inference would depend on whether the issue is personally important. Clearly, there is value in knowing all of these effects.
DISCUSSION
Do researchers want to know only what would happen if everyone were exposed to a political message? Or are they also interested in the effect a message has on the individuals who actually expose themselves to it? Arguably both, but we know surprisingly little about the latter given the prevalence of randomized experiments in studies of processes defined by self-selection. The present research shows that the SATE can mask substantial effect heterogeneity among those inclined and disinclined to select a given message.

While the SATE might lead us to believe that a message increases policy support—thereby implying the desirability of universal message provision—that effect can mask a reality in which those inclined to receive the message are affected while those disinclined are not (as was the case here). Or the treatment might have no effect on those who prefer it, while affecting only those who would never choose it of their own accord. The SATE can thus mislead us about who is affected and by how much. If an experiment shows a treatment effect that actually occurs only among those who would never choose the treatment, what have we learned? Taking the leap from SATE to practical implications without acknowledging this heterogeneity is problematic.

Yet uncovering such heterogeneity is complicated by individuals’ choice behavior. While a self-selection experimental design offers a degree of experimental realism, it is an imperfect remedy for ecological validity concerns: apparent treatment effects differed here depending on the importance participants attached to the issue at hand. The findings suggest that incorporating self-selection into a randomized experiment can be fruitful, but more research is needed on how best to study self-selection processes.
SUPPLEMENTARY MATERIALS
The appendix is available online as supplementary material at https://doi.org/10.1017/XPS.2017.1