In the last two decades, randomized experiments have become more common in political science (Druckman et al. 2006). This trend magnifies the importance of clearly communicating the statistical results of experiments, and their interpretation as treatment effects, to readers. However, no consensus has emerged among experimentalists about the best format for presenting estimated treatment effects. This diversity in presentational formats is not surprising given that graphs (e.g., bar graphs or dot plots) and tables are widely perceived to have different strengths as vehicles for displaying findings: graphs better convey general patterns, whereas tables are superior for looking up detailed results (Gelman, Pasarica, and Dodhia 2002; Kastellec and Leoni 2007; Lane and Sándor 2009). This conventional wisdom suggests that the best way to communicate experimental findings is to present both a graph and a table, thereby avoiding the need for readers to sacrifice either an ability to quickly discern general patterns or access to specific numerical results.
However, manuscript-length limitations imposed by journals create a disincentive for authors to present evidence about a treatment effect in both a graph and a table; indeed, it is rare for published work to present both. Thus, it is valuable to consider: Is it possible to design a “grable” that supplements a graph with the numerical results typically displayed in a table but does not take significantly more space than would be required for the graph alone? We believe not only that the answer is “yes,” but also that a well-designed grable can convey experimental findings better than a combination of separate graph and table. We are convinced by Sweller et al. (1990; see also Chandler and Sweller 1992) that the overriding consideration when presenting information is to minimize a reader’s “cognitive load,” that is, the amount of mental processing required to understand the information. Moreover, there is both strong theory (Gillan et al. 1998; Lane and Sándor 2009; Wainer 1997, ch. 17) and experimental evidence (Chandler and Sweller 1992, 178; see also Sweller et al. 1990) that requiring readers to “split their attention between multiple sources of information” (e.g., a graph and a table) imposes a higher cognitive load than consolidating all information in a single display.
Accordingly, we contend that for many experiments, the best strategy for presenting results is to construct a single figure combining graphics, numbers, and text. This grable would (1) rely on graphics to convey general patterns about treatment effects, (2) display specific values for the key quantities of interest that provide detail about the strength and importance of these treatment effects, and (3) use text labels to help readers interpret these quantities. The key to the success of this strategy is in developing a format for a grable that integrates all of this information without overwhelming readers. We believe such integration is feasible because two features of most experimental research combine to keep the number of quantities of interest relatively low: (1) the number of independent variables observed tends to be small, and (2) each independent variable typically has only a small number of discrete values.
Among experimentalists relying on graphs to convey results, a bar graph that shows the mean value of the dependent variable in each experimental condition is a frequently used format. A bar graph is a good choice of graphical format because research shows that readers can successfully judge the length of multiple objects (e.g., bars) plotted alongside an axis depicting a linear scale (Gillan et al. 1998; Jacoby and Schneider 2010; Kosslyn 2006). Yet, the success of a conventional bar graph in facilitating general-pattern recognition is limited by the fact that its bars focus a reader’s attention on the mean value of the dependent variable in each experimental condition rather than on the actual quantities of interest in an experiment. These quantities nearly always include first differences in means across experimental conditions (which capture the strength of treatment effects). When the theory being tested posits interaction between independent variables, the quantities also typically include “differences between differences” (or second differences) that reflect variation in treatment effects across contexts.
We propose supplementing a conventional bar graph with additional text and graphics to direct a reader’s attention to these quantities of interest. We claim that for a typical experiment, the resulting enhanced bar graph (1) conveys general patterns in the results better than a standard bar graph; (2) displays all key quantities of interest relevant to testing hypotheses (both point estimates and confidence intervals) in locations that are easy for a reader to find; and (3) clarifies how these quantities can be interpreted as estimated treatment effects. The third feature is especially valuable to readers with limited training in quantitative methods. We recognize that some scholars argue that dot plots are superior to bar graphs for communicating statistical results (e.g., Cleveland 1984; Jacoby 2006). Accordingly, we illustrate how a dot plot can be “enhanced” to convey the same information as an enhanced bar graph.
KEY QUANTITIES OF INTEREST FROM AN EXPERIMENT
For most political science experiments, a well-designed table can display all relevant quantities of interest. To illustrate, consider the results of a fictitious 2×2 factorial experiment to test a hypothesis that two variables, Treatment 1 (absent or present) and Treatment 2 (absent or present), interact in influencing a dependent variable, Y. The hypothesis is that each treatment increases Y regardless of whether the other treatment is present, and each treatment is more effective when the other is present than when the other is absent. Table 1 reports (in the nonshaded cells) the mean value of Y, denoted $\bar Y$, in each of the four experimental conditions (along with the sample size for the condition). The table also displays point estimates of all quantities of interest relevant to testing the hypothesis: (1) in the four lightly shaded cells, first differences in $\bar Y$ reflecting the average effect of each treatment when the other treatment is present as well as when the other treatment is absent; and (2) in the darkly shaded cell, the second difference (or “difference between differences”) in $\bar Y$, capturing the strength of interaction between Treatment 1 and Treatment 2. Finally, table 1 presents a 95% confidence interval for each estimated quantity.
Note: Each of the four nonshaded cells of the table shows the estimated mean of the dependent variable, Y, among subjects in an experimental condition. Each row or column marginal (i.e., lightly shaded cell) reports a difference in means reflecting the effect of one factor at a value of the other factor. The darkly shaded cell in the lower-right corner contains the difference between two differences in means and reflects the strength of interaction between the two factors in their effect on Y. Each estimated quantity is reported along with the boundaries for a 95% confidence interval in parentheses.
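The five quantities of interest described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code the authors distribute: the cell means, standard deviations, and sample sizes are made up (they are not the values in table 1), the names `Cell` and `estimate` are hypothetical, and the confidence intervals assume independent samples across conditions and a normal approximation.

```python
from dataclasses import dataclass
from math import sqrt

Z95 = 1.96  # normal-approximation critical value for a 95% interval (assumption)


@dataclass
class Cell:
    """Summary statistics for one experimental condition (hypothetical values)."""
    mean: float
    sd: float
    n: int

    @property
    def var_of_mean(self) -> float:
        # Squared standard error of the cell mean
        return self.sd ** 2 / self.n


def estimate(terms):
    """Point estimate and 95% CI for a weighted sum of independent cell means.

    `terms` is a list of (weight, Cell) pairs, e.g. [(1, a), (-1, b)] for a - b.
    """
    est = sum(w * c.mean for w, c in terms)
    se = sqrt(sum(w ** 2 * c.var_of_mean for w, c in terms))
    return est, est - Z95 * se, est + Z95 * se


# Made-up illustrative data, NOT the numbers reported in table 1.
# Keys are (Treatment 1 present?, Treatment 2 present?).
cells = {
    (0, 0): Cell(mean=10.0, sd=4.0, n=100),
    (1, 0): Cell(mean=12.0, sd=4.0, n=100),
    (0, 1): Cell(mean=13.0, sd=4.0, n=100),
    (1, 1): Cell(mean=19.0, sd=4.0, n=100),
}

results = {
    # Four first differences: the effect of one treatment at each value of the other
    "T1 effect, T2 absent": estimate([(1, cells[1, 0]), (-1, cells[0, 0])]),
    "T1 effect, T2 present": estimate([(1, cells[1, 1]), (-1, cells[0, 1])]),
    "T2 effect, T1 absent": estimate([(1, cells[0, 1]), (-1, cells[0, 0])]),
    "T2 effect, T1 present": estimate([(1, cells[1, 1]), (-1, cells[1, 0])]),
    # One second difference: (T1 effect, T2 present) minus (T1 effect, T2 absent)
    "Second difference": estimate(
        [(1, cells[1, 1]), (-1, cells[0, 1]), (-1, cells[1, 0]), (1, cells[0, 0])]
    ),
}

for name, (est, low, high) in results.items():
    print(f"{name}: {est:.1f} [{low:.1f}, {high:.1f}]")
```

Treating every quantity as a weighted sum of cell means keeps the first and second differences on the same footing, so one function supplies both the point estimate and the interval for each.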
Table 1 provides “look-up” capability because readers can find within the table each of the five quantities of interest (i.e., four first differences and one second difference). However, for a reader not already familiar with the table’s format or not well trained in experimental design, discerning the strength of treatment effects and the extent of interaction from table 1 requires careful inspection. This leads us to consider: Can we design a figure that would allow a reader—even one rarely exposed to experimental research—to easily discern the numeric value of each quantity of interest in table 1, yet also display a graph that makes immediately evident the experiment’s general conclusions about the hypothesized treatment effects?
ENHANCING A CONVENTIONAL BAR GRAPH
Figure 1 portrays the results from table 1 in a conventional bar graph; that is, the graph plots the estimated mean of Y—along with a 95% confidence interval—in each of the four experimental conditions. Figure 1 clearly outperforms table 1 in facilitating pattern recognition. Ignoring the strength of treatment effects and considering only their direction, we can easily compare (1) the lengths of the first and second bars to discern that Treatment 1’s effect is positive in the absence of Treatment 2; and (2) the lengths of the third and fourth bars to recognize that Treatment 1’s effect is positive in the presence of Treatment 2. We also can observe that the effect of Treatment 2 is positive regardless of whether Treatment 1 is present. However, this recognition requires a more demanding task: comparing the lengths of two nonadjacent bars (i.e., the first to the third, and the second to the fourth). Finally, we can see that Treatment 1’s effect is stronger when Treatment 2 is present by recognizing that the difference in the lengths of the right-most two bars (reflecting the effect of Treatment 1 in the presence of Treatment 2) is greater than the difference in the lengths of the left-most two bars (reflecting the effect of Treatment 1 in the absence of Treatment 2). This is clearly the most challenging of the pattern recognitions because it requires simultaneous consideration of the lengths of all four bars. Readers specializing in experimental research are likely to recognize quickly that the pattern of the four bars is indicative of interaction between Treatment 1 and Treatment 2; however, those less familiar with experiments may need to more closely examine the graph to see that the interaction is present.
Thus, there is room to improve a conventional bar graph’s ability to facilitate general-pattern recognition. The bars in a conventional bar graph steer a reader’s attention to the value of $\bar Y$ in each experimental condition. With additional graphics, attention can be directed instead to the quantities of interest relevant to testing a researcher’s hypotheses: the first and second differences in $\bar Y$. Furthermore, the conventional bar graph in figure 1 depicts confidence intervals only for $\bar Y$ values and provides no information about the uncertainty of the estimated quantities of interest, which is far more important.
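A brief sketch of why intervals around the individual means do not directly convey the uncertainty of the quantities of interest. It assumes independent samples across experimental conditions and a normal approximation; the subscripts $a$, $b$, $c$, and $d$ are hypothetical labels for the four conditions, not notation used elsewhere in the article.

$$\widehat{\Delta} = \bar Y_a - \bar Y_b, \qquad \mathrm{SE}(\widehat{\Delta}) = \sqrt{\mathrm{SE}(\bar Y_a)^2 + \mathrm{SE}(\bar Y_b)^2}$$

$$\widehat{\Delta\Delta} = (\bar Y_a - \bar Y_b) - (\bar Y_c - \bar Y_d), \qquad \mathrm{SE}(\widehat{\Delta\Delta}) = \sqrt{\mathrm{SE}(\bar Y_a)^2 + \mathrm{SE}(\bar Y_b)^2 + \mathrm{SE}(\bar Y_c)^2 + \mathrm{SE}(\bar Y_d)^2}$$

Under these assumptions, an approximate 95% confidence interval for either quantity is the point estimate $\pm\, 1.96$ times its standard error. Because the standard error of a difference is smaller than the sum of the component standard errors, two means whose individual intervals overlap can still differ significantly, so a reader cannot reliably infer the interval for a difference by eyeballing the intervals on the bars.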
We overcome the deficiencies of a conventional bar graph with additional text and graphics to produce an enhanced bar graph. The online appendix contains a detailed description of the features of an enhanced bar graph. This article illustrates these features by supplementing figure 1 with additional information to produce the enhanced bar graph in figure 2. Figure 2 conveys all relevant information about each of the five quantities of interest necessary to evaluate the underlying hypothesis without requiring readers to alternate between a graph and a table of numerical results. The figure relies on the following conventions:
• U-shaped arrows are included to focus a reader’s attention on the key quantities of interest: first and second differences in means (i.e., in the lengths of bars). Each first difference in means is depicted with a single arrow connecting the two experimental conditions being compared. The second difference is portrayed with a double arrow connecting the two first differences being compared. An arrow is made solid to indicate that a difference is statistically significant and deemed large enough in magnitude to be substantively important. An arrow is dashed to convey that a difference is statistically insignificant or too small to be of practical consequence.
• Text labels are used to display each numerical quantity of interest, with the symbol Δ denoting a first difference and the symbol ΔΔ indicating a second difference. We use large, boldface text to indicate quantities that are statistically significant and substantively important; smaller, lightface text is used for quantities that are statistically or substantively insignificant. We also choose a level of precision for each quantity that avoids displaying substantively trivial digits that serve only to distract.
• The text label for each quantity of interest provides details relevant to determining both its statistical and practical significance: a point estimate, followed by a 95% confidence interval. However, to avoid unnecessary clutter, we deviate from the conventional bar graph in figure 1 by not displaying a confidence interval for the mean value of the dependent variable in each of the four experimental conditions; these means are not relevant to testing the hypothesis underlying the experiment.
• The point estimate of each quantity of interest is preceded by an interpretation of its meaning (e.g., in the upper-right corner of figure 2, “Effect of Treatment 1 in presence of Treatment 2”). Of course, in a well-written paper, such interpretations are thoroughly discussed in the text. However, we believe that incorporating brief descriptions of interpretations of relevant first and second differences can help readers recognize how the interpretations emerge from the statistical results. This feature of an enhanced bar graph is especially valuable for those without strong training in quantitative methods.
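The conventions listed above can be expressed as a small decision rule that a plotting script might apply to each quantity of interest. The following Python sketch is an assumption-laden illustration rather than the authors' Stata or R code: the names `Quantity`, `is_emphasized`, and `annotation` are hypothetical, the exact label layout is a guess, and the threshold for substantive importance (`min_important`) is a choice the researcher must supply because the article does not prescribe one.

```python
from dataclasses import dataclass


@dataclass
class Quantity:
    """One quantity of interest to annotate on the graph (hypothetical structure)."""
    description: str   # interpretation shown before the point estimate
    estimate: float
    ci_low: float      # lower bound of the 95% confidence interval
    ci_high: float     # upper bound of the 95% confidence interval
    order: int = 1     # 1 = first difference, 2 = second difference


def is_emphasized(q: Quantity, min_important: float) -> bool:
    """True when the 95% CI excludes zero and the estimate is large enough to matter.

    `min_important` is a researcher-chosen threshold (an assumption of this sketch).
    """
    significant = q.ci_low > 0 or q.ci_high < 0
    return significant and abs(q.estimate) >= min_important


def annotation(q: Quantity, min_important: float, decimals: int = 1) -> dict:
    """Label text plus arrow and font styling for one quantity of interest."""
    symbol = "Δ" * q.order  # Δ for a first difference, ΔΔ for a second difference
    text = (
        f"{q.description}: {symbol} = {q.estimate:.{decimals}f} "
        f"[{q.ci_low:.{decimals}f}, {q.ci_high:.{decimals}f}]"
    )
    strong = is_emphasized(q, min_important)
    return {
        "text": text,
        "arrow_kind": "double" if q.order == 2 else "single",
        "arrow_linestyle": "solid" if strong else "dashed",
        "font_weight": "bold" if strong else "normal",
        "font_size": "large" if strong else "small",
    }


# Usage with made-up numbers (not taken from figure 2)
example = Quantity("Effect of Treatment 1 in presence of Treatment 2", 6.0, 4.9, 7.1)
print(annotation(example, min_important=1.0))
```

Returning the styling as plain data keeps the rule independent of any plotting library; the `decimals` argument reflects the advice to avoid displaying substantively trivial digits.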
In our view, incorporating the arrows and text labels into the bar graph in figure 2 does not detract from a reader’s ability to quickly absorb the general patterns evident by scanning the relative lengths of bars. Moreover, by strategically positioning the text in the enhanced bar graph, we can guide readers to relevant quantities of interest, thereby enhancing their ability to connect these quantities to the general patterns in a way that facilitates understanding of the experimental results. Many experiments in political science have four or fewer observed experimental conditions; for these studies, we believe that an easily readable enhanced bar graph can almost always be constructed.
Figure 3 depicts a version of a dot plot supplemented with numerical values and text to convey the same information presented in the enhanced bar graph shown in figure 2. At the top of the enhanced dot plot, there are four dots indicating the mean of Y in each experimental condition. Below these dots are five arrows. Each single arrow represents a first difference in means; the double arrow denotes the second difference. We believe that the principal advantage of the dot plot over the bar graph is that the former displays each quantity of interest using an object (an arrow) with a length equal to the quantity. However, this advantage is lessened by the fact that the arrows are not aligned to start at the same origin, which complicates visual comparison of the magnitude of treatment effects. As a consequence, we believe it is easier for readers to see how each first and second difference is computed from component $\bar Y$ values in the bar graph than in the dot plot. Balancing all considerations, we think an enhanced bar graph is a slightly more effective grable than an enhanced dot plot for conveying experimental findings.
Note: An online appendix and code in both Stata and R illustrating how an enhanced bar graph can be constructed are available at coss.fsu.edu/enhancedbargraph. We encourage researchers who construct enhanced bar graphs to present their experimental results to share their computer code with other scholars. To facilitate this sharing, if researchers e-mail their code along with a pdf image of the graph created to one of us, we will post the files at the website, explicitly recognizing the generosity of the contributor.
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/S1049096517000683
ACKNOWLEDGMENTS
We thank Kevin Dyrland for assistance in creating the graphs portrayed in the article; and Kevin Arceneaux, Jason Barabas, Jamie Druckman, Jens Grosser, Jennifer Jerit, Thomas Leeper, Kevin Mullinix, Megan Shannon, and Mark Souva for helpful comments on previous versions of it. We are also grateful to Vera Mironova and Sam Whitt for sharing replication data.