There is a large and growing body of research focused on the potential for group-delivered interventions for aggressive/disruptive children to result in a peer contagion process, which includes deviance training, and thus can result in iatrogenic program effects (e.g., Dishion & Tipsord, 2011; Dishion & Dodge, 2005, 2006; Dodge, Dishion, & Lansford, 2006). Although much of the work to date has focused on group interventions and group dynamics, this peer contagion work has direct applicability to universal behavioral programming in classrooms. For example, delivering universal program content is likely to be especially challenging for teachers in classrooms with large numbers of students lacking inhibitory controls. In high-risk school settings, there may be a higher proportion of students with challenging behaviors, whereby the group context could similarly allow for the collective modeling and reinforcement of deviant student behaviors. Also applicable in classrooms is the coercion theory of antisocial behavior, wherein parent inability to effectively manage child noncompliance can trigger a coercive cycle of escalating aversive behavior on the part of both the parent and the child (Dishion, Nelson, & Kavanagh, 2003; Patterson, Reid, & Dishion, 1992). A similar coercive cycle may develop in classrooms as well, even when teachers attempt to implement evidence-based classroom behavior management programs.
In the current study, we extended Dishion and colleagues’ expansive line of work on peer deviance training to the elementary school classroom context by examining the association between classroom-level aggressive/disruptive behavior and teachers’ implementation of the PAX Good Behavior Game (GBG; Embry, Staatemeier, Richardson, Lauger, & Mitich, 2003; Ialongo et al., 2019), which is a classroom-wide intervention designed to reduce aggressive/disruptive behavior. We also examined whether positive and prosocial student behavior was associated with implementation. Although there has been considerable research documenting the potential impact of classroom-based preventive interventions (e.g., Chiapa, Parra Morris, Véronneau, & Dishion, 2016; Smith et al., 2018), there remains a gap in the consideration of classroom behavioral dynamics in relation to implementation, which in turn might contribute to the overall impact of these programs. This study aims to fill this gap, with the overarching goal of determining the context and conditions under which a commonly used classroom-based prevention intervention (i.e., PAX GBG) is most effective at addressing aggressive/disruptive behavior problems.
Association Between Context and Implementation Fidelity
The field of implementation science continues to broaden, encompassing a plethora of conceptual theories, frameworks, and models (Nilsen, 2015) defining a myriad of implementation-relevant factors (e.g., Durlak & DuPre, 2008; Han & Weiss, 2005) at multiple contextual levels as well as the corresponding supports needed (e.g., Domitrovich et al., 2008). Also recognized is the integral importance of practitioners and implementers to achieve implementation fidelity (e.g., Fixsen, Blase, Naoom, & Wallace, 2009; Lochman, Dishion, Boxmeyer, Powell, & Qu, 2017). Implementation studies have addressed the classroom and school contexts, aligning to the above-noted models and frameworks. However, the emphasis has been on measuring the observed or teacher-reported perceptions of school and classroom climate and teacher demographics, training and education, and other teacher factors such as efficacy and burnout. The research findings regarding the potential influence of these variables are mixed, with some studies reporting null associations and others reporting positive or negative ones.
For example, prior studies of preventive interventions have suggested that factors such as administrator support (Kam, Greenberg, & Walls, 2003; Rohrbach, Graham, & Hansen, 1993) and efficacy (e.g., Ringwalt et al., 2003; Rohrbach et al., 1993) are important predictors of implementation. Similarly, in one study, teacher burnout was linked with lower implementation dosage (Domitrovich et al., 2015); however, other studies have suggested that burnout may moderate the association between coach–teacher alliance and adherence to an intervention protocol (Wehby, Maggin, Moore Partin, & Robertson, 2012). The teachers’ ratings of the social validity of the intervention (as measured by the perception of program fit with the teacher's style; Domitrovich et al., 2015; and as fit, effectiveness, and reduced burden; Wehby et al., 2012) are also associated with higher implementation dosage and adherence.
The findings regarding teacher training, education, and demographics are somewhat inconclusive, whereby some studies report null findings for education level and experience (e.g., Domitrovich, Gest, Gill, Jones, & DeRouise, 2009; Ringwalt et al., 2003; Wanless, Rimm-Kaufman, Abry, Larsen, & Patton, 2015), and others report a positive association (e.g., Domitrovich et al., 2009; Sutherland, Conroy, McLeod, Algina, & Kunemund, 2018).
Another possible contextual influence on implementation is the classroom climate. Much of the prior research has shown a positive association between ratings of positive school climate and implementation (Bradshaw, Koth, Thornton, & Leaf, 2009; Malloy et al., 2015; Pas, Waasdorp, & Bradshaw, 2015); however, some studies suggest a negative relationship over time (Pas, Bradshaw, et al., 2015; Sutherland, Conroy, McLeod, Algina, & Kunemund, 2018). For example, a study testing a targeted intervention indicated that observed emotional supports within the classroom were positively associated with implementation competence and adherence over time, while observed classroom organization was negatively associated with growth in adherence over time (Sutherland, Conroy, McLeod, Algina, & Kunemund, 2018). The classroom organization measure in the Sutherland et al. study focused on the teachers’ behavioral and instructional management strategies. They posited that in the context of such foundational supports, there may have been less room or need for growth in adherence to this intervention. Taken together, extant mixed findings suggest a need for additional exploration of both teacher and classroom contextual influences on implementation (Domitrovich et al., 2009).
Student Behavior as a Contextual Factor
Few studies have considered the extent to which students’ behavior in the classroom prior to program implementation may relate to teacher implementation fidelity. Disorder or dysfunction measured more broadly within other, nonclassroom contexts has been shown to be associated with fidelity. A national survey of school-based interventions found that large urban schools with large proportions of poor students reported implementing more programs (Payne, Gottfredson, & Gottfredson, 2006). Further, in studies of family/home interventions, analogous variables such as familial stress levels and current mental health problems have been found to be related to higher family engagement and thus dosage received in one widely used home visiting program but not another (e.g., Latimore et al., 2017). These findings suggest that implementers serving environments in greater need of prevention programs may be more motivated to implement programs and, in some cases, with higher fidelity. In contrast, research regarding the implementation of positive behavior supports has demonstrated that schools with a higher suspension rate and poorer organizational health have lower implementation (Bradshaw, Mitchell, O'Brennan, & Leaf, 2010; Pas, Bradshaw, et al., 2015). Classrooms with greater problem behaviors may similarly suffer from poorer implementation.
Another Sutherland et al. study (Sutherland, Conroy, McLeod, Algina, & Wu, 2018) recently examined ratings of student “problem” and “externalizing” behavior in relation to adherence and quality for the implementation of BEST in CLASS; their analyses suggested a positive and significant association only between problem behaviors and implementation quality. It is important to explore, however, whether these findings generalize to universal prevention programs, as compared to the targeted intervention tested by both Sutherland et al. studies. Exploration of a broader set of both positive and negative student behaviors at baseline may also better inform our understanding of the extent to which student behavior predicts implementation dosage of universal programs. Positive student behavior, as compared to problematic student behavior, has been largely overlooked in the literature on possible classroom-level predictors of implementation.
Overview of the Current Study
In the present study, we simultaneously explored two research questions regarding the direction of the potential effects of collective classroom-level student behaviors and teachers’ beliefs and perceptions on teacher implementation of the PAX GBG. The first research question centered on the effects of collective classroom-level behaviors, both positive and negative, on implementation, while the second was focused on the role of teacher beliefs and perceptions. With regard to aggressive/disruptive behavior, consistent with Patterson et al.’s (1992) coercion theory of antisocial behavior, it was expected that higher levels of classroom aggressive/disruptive behavior would be associated with lower levels of teacher implementation of the PAX GBG. A second and competing hypothesis was that higher classroom levels of aggressive/disruptive behavior may motivate teachers to implement the PAX GBG more often in order to gain better control of student behavior. Further, we reasoned that teachers in classrooms with a high baseline level of student prosocial behaviors may not feel the urgency to implement the PAX GBG at a high dosage, particularly if they feel their students already exhibit the competencies targeted by the program. Alternatively, teachers in classrooms with fewer behavioral issues may have more flexible class time to dedicate to implementing the PAX GBG and, thus, may do so more frequently.
We operationalized implementation dosage in the present study as the number of PAX GBG “games” played across the school year and accounted for student gender as a correlate of individual student behavior. We also included teacher demographics and teacher perceptions and beliefs as correlates of teacher implementation at the classroom level, as prior research has suggested they may be predictive of implementation of PAX GBG (i.e., Domitrovich et al., 2015).
In order to address some of the methodological challenges associated with modeling student behaviors nested within classrooms, we used multilevel structural equation modeling (ML-SEM). This flexible modeling approach allows classroom-level variability in student behaviors to impact implementation at the teacher level. Previous research utilizing ML-SEM has demonstrated the ability to model and test contextual effects using individual-level data (Dunn, Masyn, Johnston, & Subramanian, 2015) and the ability to test differences in the factor structure of a construct at multiple levels (Huang & Cornell, 2016). This approach enabled us to examine between-classroom variation in collective student behaviors, while also adjusting for teacher-level characteristics that had been previously linked with implementation (e.g., demographics, burnout, and organizational health). Specifically, the classroom-level random effects of the student-level behaviors act here as latent predictors of the classroom-level outcome of program implementation by the teacher. Multilevel latent variable models, for example, multilevel factor analysis, multilevel path analysis, and ML-SEM, are being used more frequently in multilevel studies as reflected in a recent review of the literature that described 27 such applications across multiple disciplines (Kim, Dedrick, Cao, & Ferron, 2016). However, none of these studies utilized ML-SEM to explore the association between student-level behaviors and classroom- and teacher-level measures, such as implementation. This is a novel modeling approach and uniquely suited for examining our research questions, as it allows for appropriate disentanglement of within- and between-classroom variability in student behavior.
Method
Design overview
The data for this study were collected as part of a randomized controlled trial (Ialongo et al., 2019) of the integration of the PAX GBG with the Promoting Alternative THinking Strategies (i.e., PATHS; Greenberg, Kusché, & Conduct Problems Prevention Research Group, 2011; Kusché, Greenberg, & Conduct Problems Prevention Research Group, 2011) social-emotional learning curriculum. Participating schools were randomized to one of three conditions: the integrated model (9 schools), the PAX GBG only (9 schools), and a control condition (9 schools). Random assignment to one of the three intervention arms (control, PAX GBG only, or integrated PATHS to PAX GBG) occurred at the school level, whereby all classrooms within the school were assigned the same condition; therefore, the subsample of the 204 classrooms within the 18 intervention schools was included in the current study. Given our focus on implementation, the 9 control schools were excluded from the current analyses. Data regarding PAX GBG implementation were collected in each of the 18 intervention schools.
Participants
The current study sample included the 3,115 students who were in the 204 intervention K–5 classrooms across the 18 intervention schools; of the 204 teachers, and thus classrooms, about 60% were in schools randomly assigned to deliver PAX GBG, whereas about 40% were in schools randomized to deliver the integrated PAX GBG and PATHS curriculum. Students were predominantly African American (87.5%) and 49.1% were male. This is representative of the broader district population, which was a large, urban, East Coast public school district with a largely African American student population (88%, on average). The vast majority of sample teachers were female (i.e., 91.0%) and slightly less than half were 30 or younger and taught students in Grades 3 through 5.
Intervention
Teachers in both intervention conditions were trained to implement the PAX GBG (i.e., either as a stand-alone intervention or as integrated with PATHS). The GBG was originally developed by Barrish, Saunders, and Wolf (1969) and applies social learning principles to a team-based game that classroom teachers implement to promote “good behavior” and reduce aggressive, disruptive, and off-task behavior. In a more recent update and augmentation, Embry et al. (2003) added verbal and visual cues to the “PAX GBG” version to promote generalization of increased attention and prosocial behaviors throughout the school day, and not just when playing the GBG. Various versions of the GBG have been tested over the past two decades, and most have demonstrated positive effects on academic, behavioral, and substance use outcomes (see Ialongo et al., 1999, 2019).
Procedure
Recruitment for the intervention study occurred at the school level; all principals agreed to participate in the year-long project and allow their teachers to receive training and coaching in the interventions. Teacher participation was voluntary; teachers were recruited to participate by project staff and provided written consent to report on their beliefs and perceptions about teaching and workplace stress level. Parents of student participants provided active written consent for participation in the data collection (Ialongo et al., 2019). The current study utilizes secondary data analysis among intervention classrooms only; the original randomized controlled trial, along with the current study, was approved by the study's university institutional review board.
Training for the interventions included a multiday workshop at the start of the school year (i.e., 1.5 days for the PAX GBG only condition and 3.5 days of training for the integrated condition, of which the same 1.5 days was focused on PAX GBG; Domitrovich et al., 2010). Across the entire school year, teachers also received 31 weeks of weekly face-to-face coaching, which was a manualized coaching approach that included tailoring to teacher needs (see Becker, Bradshaw, et al., 2013; Becker, Darney, et al., 2013, for additional details on the coaching model). Prior research demonstrated that teachers in the integrated condition had more contacts with the coach over the course of the school year (Pas, Bradshaw, et al., 2015), but that teachers in both conditions implemented a statistically equivalent number of games (Domitrovich et al., 2015).
Measures
Student measures
At baseline (fall), teachers provided ratings of each student's classroom behaviors over the last 3 weeks using the Teacher Observation of Classroom Adaptation–Revised (TOCA-R; Werthamer-Larsson, Kellam, & Wheeler, 1991). The TOCA-R includes items on a range of subscales, using a 6-point Likert scale related to behavior frequency (1 = almost never to 6 = almost always) and has been used in several prevention trials (e.g., Ialongo et al., 1999; Petras, Chilcoat, Leaf, Ialongo, & Kellam, 2004; Werthamer-Larsson et al., 1991). Included in this study were the subscales regarding students’ aggressive behavior (15 items; e.g., lied, started physical fights, stubborn, broke rules, hurt others physically, and yelled at others; α = .96), inattention/hyperactivity (6 items; e.g., paid attention, stays on task, and concentrates on class work; α = .87), academic engagement (3 items; i.e., completed assignments, learned up to ability, and eager to learn; α = .89), and positive peer relations (3 items; i.e., liked by classmates, other children sought him/her out to play, and disliked by classmates; α = .83). Two other teacher rating subscales came from the Fast Track Social Health Profile (Conduct Problems Prevention Research Group, 1999): social competence (8 items; e.g., resolves peer problems on his/her own, expresses feeling appropriately, and showed empathy and compassion for others’ feelings; α = .94) and emotion regulation (4 items; e.g., controlled temper when there was a disagreement, could calm down when excited or all wound up, and coped well with disappointment or frustration; α = .88). Items for each subscale were averaged together to create subscale scores for each student, ranging from 1 to 6.
Higher scores indicated higher frequencies, on average, for the behaviors captured by each subscale. These subscale scores have been used extensively and there are considerable data documenting their reliability and validity (e.g., Bradshaw & Kush, in press; Ialongo et al., 2019; Koth, Bradshaw, & Leaf, 2009; Werthamer-Larsson et al., 1991).
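The scoring step described above (averaging item ratings into a 1-to-6 subscale score and summarizing internal consistency with α) can be illustrated with a minimal sketch. The ratings below are made up, the function names are hypothetical, and the study's own scoring procedure (for example, how any negatively worded items were handled) is not described in this section, so this is only an illustration of the general technique.

```python
from statistics import mean, variance

def subscale_score(item_ratings):
    """Average one student's item ratings (each on the 1-6 scale)."""
    return mean(item_ratings)

def cronbach_alpha(rows):
    """Cronbach's alpha from a list of per-student item-rating rows."""
    k = len(rows[0])  # number of items in the subscale
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings: five students on a three-item subscale.
ratings = [
    [1, 2, 1],
    [3, 3, 4],
    [5, 5, 6],
    [2, 2, 2],
    [4, 5, 4],
]

scores = [subscale_score(row) for row in ratings]
alpha = cronbach_alpha(ratings)
```

Because each score is a mean of 1-to-6 ratings, it stays on the original 1-to-6 metric, which is what lets a 1-point difference be interpreted directly in the later regression results.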
Teacher beliefs and perceptions
Teachers completed a self-report survey regarding their beliefs and perceptions at four time points across the school year. Only the baseline data for three of the scales were included in these analyses. Specifically, we assessed teacher efficacy, burnout, and perceptions of the school organizational climate. The Behavior Management Self-Efficacy Scale (Main & Hammond, 2008) is a 14-item scale that included items about classroom behavior management such as “I am able to use a variety of behavior management techniques” (α = .93). Item response options were based on a 5-point Likert scale, ranging from not at all to a very great extent, and were averaged, with higher scores indicating greater efficacy. We also included the emotional exhaustion scale of the Maslach Burnout Inventory (Maslach, Jackson, & Leiter, 1996). This included 9 items, such as “I feel used up at the end of the workday” (α = .91). Responses were rated on a 7-point scale from never to every day and averaged to create scale scores such that higher scores indicated greater emotional exhaustion. Finally, teachers completed the 37-item Organizational Health Inventory for Elementary Schools (OHI; Hoy & Feldman, 1987). Factor analysis has confirmed the following five-factor structure of the OHI (Hoy & Tarter, 1997): teacher affiliation, academic emphasis, collegial leadership, resource influence, and institutional integrity. The average of all 37 items was calculated to create the OHI scale score (α = .93). Item responses were rated on a 4-point Likert scale ranging from strongly disagree to strongly agree. A higher score indicated greater organizational health.
These scales were included given their prior use in studies and findings that they are associated with GBG implementation (Domitrovich et al., 2015).
Teacher demographics
Teachers additionally completed a self-report informational form at baseline, providing their own demographic data (i.e., gender and age) and teaching experience data (e.g., grade level taught). Both teacher age and grade taught were included as control variables. Gender was not included because of the high proportion of women in the sample.
PAX GBG implementation dosage
Teachers completed and submitted a weekly log of the number of PAX GBG games played and the duration of each game to the coaches, to document their PAX GBG dosage. These data were summed across the 31-week implementation period. Although not perfectly correlated, there was a high degree of overlap between the two totals (number of games and number of minutes), and we elected to use the total number of games as our outcome measure. The total count of games ranged in the sample from 1 to 433 with a diffuse distribution across the range, so we considered it quite reasonable to treat the outcome as continuous. We transformed the total counts into z scores (ZGP) to ease interpretation and to avoid an ill-conditioned Level 2 covariance matrix, as the range and variance of the totals on the original scale were several orders of magnitude larger (10⁴) than those of the Level 2 predictors.
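The rescaling step can be sketched as follows. This is a minimal illustration with invented game totals; the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which denominator was used.

```python
from statistics import mean, stdev

def z_scores(counts):
    """Standardize raw totals to mean 0 and (sample) SD 1."""
    m = mean(counts)
    s = stdev(counts)  # assumption: sample SD (n - 1 denominator)
    return [(c - m) / s for c in counts]

# Hypothetical per-teacher game totals spanning the reported range.
games = [1, 60, 153, 240, 433]
zgp = z_scores(games)
```

After this transformation the outcome has unit variance, so its scale is comparable to that of predictors measured on short Likert-type ranges.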
Overview of the analyses
To examine whether teacher ratings of student behavior were significantly associated with teacher implementation levels of the GBG, we conducted a series of multilevel analyses on the TOCA-R student behavior data in Mplus, Version 8.2 (Muthén & Muthén, 1997–2019). At Level 1, each student-level behavior score for student i in classroom j was specified as the sum of a classroom-level (latent) mean (i.e., random intercept), μj, plus the student's (latent) deviation from his or her classroom mean, εij; that is,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200701152702456-0364:S095457941900097X:S095457941900097X_eqnU1.gif?pub-status=live)
The random errors at Level 1 for the six behavior subscales (academic engagement, social competence, positive peer relations, emotion regulation, hyperactivity, and aggression) were allowed to freely covary with each other. At Level 2, the outcome was specified as a linear combination of the (latent) classroom-level means for each behavior plus a random disturbance term; that is,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200701152702456-0364:S095457941900097X:S095457941900097X_eqnU2.gif?pub-status=live)
The random means at Level 2 for the six behaviors were allowed to freely covary with each other. Figure 1 provides a path diagram representation of the unadjusted, Level 2 model.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200701152702456-0364:S095457941900097X:S095457941900097X_fig1g.gif?pub-status=live)
Figure 1. Path diagram of the unadjusted multilevel structural equation model predicting the teacher-reported number of games played by the latent classroom-level means of six student behavior subscales.
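For readers without access to the equation images, the two levels described in the text can be written out as a sketch. The superscript k indexing the six behavior subscales, the coefficients β, and the disturbance ζ are notational assumptions introduced here; only μj, εij, and ZGP appear in the text itself, so this should be read as one plausible rendering of the verbal description rather than the published equations.

```latex
% Level 1 (within classroom): score of student i in classroom j on subscale k
y^{(k)}_{ij} = \mu^{(k)}_{j} + \varepsilon^{(k)}_{ij}, \qquad k = 1, \dots, 6

% Level 2 (between classroom): standardized number of games played
\mathrm{ZGP}_{j} = \beta_{0} + \sum_{k=1}^{6} \beta_{k}\, \mu^{(k)}_{j} + \zeta_{j}
```

In this rendering, the Level 1 errors are allowed to covary across the six subscales, as are the six latent classroom means at Level 2, matching the description in the text.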
After estimating the unadjusted model, we added student gender (grand-mean centered) as a covariate for all six behavior subscales in the Level 1 model, changing the classroom-level random intercepts, μj, to gender-adjusted latent means. We then added teacher-related covariates at Level 2 as predictors of the number of GBGs played and correlates of all six latent classroom-level adjusted behavior subscale means. We included the behavior management self-efficacy scale, the emotional exhaustion subscale of the burnout inventory, the organizational health scale, teacher age (dichotomized to indicate teachers under the age of 30), and teacher grade level (dichotomized to indicate 3rd- to 5th-grade teachers versus kindergarten- to 2nd-grade teachers), some of which have been shown to relate to implementation in prior research (e.g., Domitrovich et al., 2015). Intervention condition (i.e., PAX GBG vs. PATHS integrated with PAX GBG) was not included, as prior research has indicated that condition was not significantly related to dosage (Domitrovich et al., 2016).
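The covariate preparation described above can be sketched in a few lines. The data, the function names, the 0/1 coding, and the coding of kindergarten as grade 0 are all hypothetical; the age cutoff follows the "under the age of 30" wording in this paragraph.

```python
def grand_mean_center(values):
    """Subtract the overall (grand) mean from every value."""
    grand_mean = sum(values) / len(values)
    return [v - grand_mean for v in values]

def is_under_30(age):
    """1 if the teacher is under 30, else 0 (cutoff taken from the text)."""
    return 1 if age < 30 else 0

def is_upper_grade(grade):
    """1 for Grades 3-5, 0 for kindergarten (coded 0 here) through Grade 2."""
    return 1 if grade >= 3 else 0

# Hypothetical student-level indicator (1 = male) pooled across classrooms.
male = [1, 0, 0, 1, 1, 0, 0, 0]
male_centered = grand_mean_center(male)
```

Centering gender at the grand mean means each classroom's random intercept is interpreted as the expected behavior score at the sample-average gender composition, which is the sense in which the latent means become gender adjusted.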
Missing data
Missing student behavior (i.e., via the TOCA-R) data were addressed through use of full information maximum likelihood estimation. Chi-square analyses revealed no significant difference in the presence of student data by gender (χ² = 1.246, p = .266).
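A test of this kind can be sketched as a Pearson chi-square on a 2 × 2 table of gender by data presence. The counts below are invented, the function names are hypothetical, and treating the test as having 1 degree of freedom without a continuity correction is an assumption; the text does not report the table or the exact procedure.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

# Hypothetical counts: rows = male/female, columns = data present/missing.
observed = [[1400, 130], [1470, 115]]
stat = chi_square_2x2(observed)
p = p_value_df1(stat)
```

A nonsignificant result in such a test indicates that missingness does not differ detectably by gender, which is consistent with (though it does not prove) the missing-at-random assumption underlying full information maximum likelihood.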
Results
Descriptive statistics
Descriptive statistics for the study variables are reported in Tables 1 and 2. There was substantial variation in the number of games played with a mean of 153 games and SD = 100. Overall, the subscales of student behavior were highly correlated with one another in the expected directions at both the within- and between-classroom levels. Similarly, the teacher-reported measures of beliefs and perceptions about teaching were correlated with one another in the expected directions.
Table 1. Descriptive statistics for student-level (Level 1) variables (n = 3,115)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200701152702456-0364:S095457941900097X:S095457941900097X_tab1.gif?pub-status=live)
Note: Columns 1–6 are correlations; lower diagonal values are the maximum-likelihood estimated Level 1 (within-classroom) correlations and upper diagonal values are the Level 2 (between-classroom) correlations.
Table 2. Descriptive statistics for teacher/classroom-level (Level 2) variables (N = 204)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200701152702456-0364:S095457941900097X:S095457941900097X_tab2.gif?pub-status=live)
Note: Columns 1–5 are correlations.
Implementation prediction
The results of the final two-level SEM model are presented in Table 3. These are based on the Level 2 regression of the z score of the total number of GBGs played on the classroom-level latent means of the student behavior subscales, adjusted for gender on the student level (Level 1) and for teacher-related covariates at Level 2. Of the six behavior subscales, the only one statistically significantly related to the outcome was classroom-level aggression. A positive 1-point difference in mean classroom aggression (1–6 scale) corresponded, on average, to nearly a full SD fewer games played (Est. = –0.89 [–1.65, –0.12], SE = 0.39, p = .02, Std. Est. = –0.33), controlling for all other classroom-level mean behaviors and teacher-related covariates. Of the teacher-related covariates, only the emotional exhaustion subscale was significantly related to the outcome. A positive 1-point difference in emotional exhaustion corresponded, on average, to a difference of –0.13 SD of games played (Est. = –0.13 [–0.24, –0.03], SE = 0.05, p = .02, Std. Est. = –0.18), controlling for all other teacher-related covariates and classroom-level mean behaviors. The standardized effect estimate of classroom-level aggression was almost double the standardized effect of emotional exhaustion. There were no variables in the Level 2 model with a significant positive association with the outcome.
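To make the size of these estimates concrete, the arithmetic can be sketched as follows. The Wald-type interval (estimate ± 1.96 × SE) is an assumption, as the text does not state how the intervals were computed, and the function names are hypothetical; the SD of 100 games comes from the descriptive statistics reported above.

```python
def wald_interval(estimate, se, z=1.96):
    """Approximate 95% interval as estimate +/- z * SE (assumed method)."""
    return (estimate - z * se, estimate + z * se)

def sd_units_to_games(estimate_in_sd, sd_games):
    """Convert an effect on the z-scored outcome back to raw game counts."""
    return estimate_in_sd * sd_games

SD_GAMES = 100  # SD of games played, from the descriptive statistics

aggression_ci = wald_interval(-0.89, 0.39)
aggression_games = sd_units_to_games(-0.89, SD_GAMES)
exhaustion_games = sd_units_to_games(-0.13, SD_GAMES)
```

Under these assumptions, the interval for aggression falls close to the reported bounds (small differences are expected because the inputs are rounded), and the point estimates correspond to roughly 89 fewer games per 1-point difference in classroom aggression and roughly 13 fewer games per 1-point difference in emotional exhaustion.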
Table 3. Final ML-SEM Level 2 regression model results for the outcome Z score of number of games played
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200701152702456-0364:S095457941900097X:S095457941900097X_tab3.gif?pub-status=live)
Teacher-rated student behaviors and teacher self-reported beliefs and perceptions
To understand the role of teacher self-reported beliefs and perceptions in their ratings of student behavior, teacher-related covariates at Level 2 were allowed to freely covary with the latent classroom-level means of the student behaviors subscale scores. The model-estimated correlations are presented in Table 4. Scores for the teacher-reported behavior management self-efficacy scale had a statistically significant positive association with classroom-level means of teacher-rated student academic engagement, social competence, peer relations, and emotion regulation and had a statistically significant negative association with hyperactivity and aggression. A similar pattern was observed for the organizational health scale. Emotional exhaustion scores were statistically significantly and negatively associated with academic engagement and positively associated with aggression.
Table 4. Model-estimated correlations between teacher-level covariates and classroom-level latent behavior subscale means
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200701152702456-0364:S095457941900097X:S095457941900097X_tab4.gif?pub-status=live)
Note: Bold represents p < .05.
Discussion
In this study, we sought to build on and extend Dishion and colleagues’ seminal work on peer contagion and deviance training in group-delivered interventions into the elementary school classroom context (see Dishion & Dodge, Reference Dishion and Dodge2005, Reference Dishion, Dodge, Dodge, Dishion and Lansford2006; Dodge et al., Reference Dodge, Dishion and Lansford2006; Lochman et al., Reference Lochman, Dishion, Powell, Boxmeyer, Qu and Salle2015, Reference Lochman, Dishion, Boxmeyer, Powell and Qu2017). More specifically, we examined whether classroom-level student behaviors, both positive and negative, were associated with teachers’ implementation of an evidence-based, classroom-wide intervention designed to reduce aggressive/disruptive behavior, the PAX Good Behavior Game (PAX GBG; Embry et al., Reference Embry, Staatemeier, Richardson, Lauger and Mitich2003; Ialongo et al., Reference Ialongo, Domitrovich, Embry, Greenberg, Lawson, Becker and Bradshaw2019). We were particularly interested in whether student behavior served as a possible barrier to or as a motivator for high-dosage implementation of the program. In the current study, we leveraged a novel and sophisticated analytic approach, whereby student behavioral ratings by teachers were not simply averaged but latently aggregated from the student level to the classroom level.
Our findings suggested that aggressive/disruptive behavior may have served as the primary barrier to high-dosage implementation, as teachers reporting a 1-point higher level of classroom aggression at baseline subsequently played nearly a full standard deviation fewer games across the school year. Although we cannot draw a causal inference based on this association, it is possible that aggressive/disruptive behavior makes it more challenging for teachers to manage the added burden of implementing the program at a high dosage. This finding is consistent with research on peer contagion in group-delivered interventions (Lochman et al., Reference Lochman, Dishion, Powell, Boxmeyer, Qu and Salle2015, Reference Lochman, Dishion, Boxmeyer, Powell and Qu2017) and prior work on positive behavioral interventions and supports (Bradshaw, Mitchell, & Leaf, Reference Bradshaw, Koth, Thornton and Leaf2009; Pas, Waasdorp, et al., Reference Pas, Waasdorp and Bradshaw2015). Specifically, the program may be perceived as placing an additional demand on the teachers, who are presumably already harried by the high level of problem behavior occurring in the classroom. As such, teachers may struggle to find sufficient time for high implementation dosage. In contrast, other baseline behaviors (i.e., a range of positive student behaviors and hyperactivity) were not associated with implementation dosage. Although we cannot draw causal conclusions based on this null finding, it does suggest that teachers felt neither motivated nor burdened by the students’ baseline positive behaviors when it came to program implementation dosage.
This is among the first studies to examine how positive student behavior relates to implementation of a classroom-based, evidence-based intervention. For example, while student behavior was assessed in a recent study by Sutherland, Conroy, McLeod, Algina, and Wu (Reference Sutherland, Conroy, McLeod, Algina and Wu2018), positive behaviors were excluded. Although one may view negative and positive student behaviors as being two sides of the same coin, the findings from this study suggest that they are differentially associated with dosage. Further, it is possible that teachers find hyperactivity less challenging than aggressive and disruptive behaviors, indicating that low-level negative behaviors may not be a barrier to implementation. This could indicate that there is a “tipping point” at which negative behavior presents a challenge to implementation, a possibility that should be examined in more depth. Additional research is needed to determine whether there are specific conditions under which implementation dosage can be promoted by student behavior. Further, the extent to which the association between baseline student problem behaviors and reduced dosage serves as a pathway to reduced student outcomes is also an area for future research.
The current study also builds upon earlier research by informing our understanding of the extent to which the relationship between teacher burnout/emotional exhaustion and dosage (Domitrovich et al., Reference Domitrovich, Pas, Bradshaw, Becker, Keperling, Embry and Ialongo2015) would still hold, while accounting for student behavior. Such inclusion of student behavior has been absent in the prior GBG implementation literature (i.e., Domitrovich et al., Reference Domitrovich, Pas, Bradshaw, Becker, Keperling, Embry and Ialongo2015; Wehby et al., Reference Wehby, Maggin, Moore Partin and Robertson2012). Although not a central focus of this study, the fact that the associations between burnout and dosage remained significant with this additional level of adjustment in the models provided further evidence of the robustness of these associations, which have largely been overlooked in prior research.
Another area of innovation in this study was the use of ML-SEM to model the multilevel design of the study. This analytic methodology could be useful for prevention and intervention researchers interested in understanding how individual factors impact implementation of programs at a higher level (e.g., Level 2, classroom). It allows the researcher to maintain the nuanced and complex nature of a Level 1 (e.g., student) data point, rather than eliminating variability through aggregation. Without a multilevel model, the associations between classroom behavior and intervention implementation in our study may not have been found.
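To make the idea of latent aggregation concrete, one common way to write a two-level model of the kind described here is sketched below. The notation is ours and is offered only as an illustrative assumption, not as the exact specification fit in this study: $y_{ijk}$ is the teacher rating of student $i$ in classroom $j$ on behavior subscale $k$, $\mu_{jk}$ is the latent classroom mean for that subscale, $z_j$ is the z score of the number of games played, and $x_{jm}$ are the teacher-related covariates.

```latex
\begin{align}
  y_{ijk} &= \mu_{jk} + \beta_k \,\mathrm{gender}_{ij} + e_{ijk}
    && \text{(Level 1: students within classrooms)} \\
  z_j &= \gamma_0 + \sum_{k=1}^{6} \gamma_k \,\mu_{jk}
         + \sum_{m} \delta_m \, x_{jm} + u_j
    && \text{(Level 2: classrooms)}
\end{align}
```

Under this sketch, the classroom means $\mu_{jk}$ are treated as unobserved quantities estimated with error from the student-level ratings, rather than being replaced by observed classroom averages before the Level 2 regression is fit.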
Limitations
Despite several strengths of this study, it is important to consider a number of limitations. For example, this particular study relied on one method of data collection: teacher self-report. Teachers provided ratings of the students on the TOCA-R, and prior research suggests that teacher beliefs and perceptions (e.g., organizational factors and interpersonal factors) may actually influence teachers’ ratings on the TOCA-R (e.g., see Pas & Bradshaw, Reference Pas and Bradshaw2014). This study similarly shows correlations between such teacher beliefs and the student ratings. Teachers also self-reported the number of games played on a weekly basis, as it was not feasible to track the implementation dosage data through another method. Furthermore, prior studies from this trial suggest relatively limited variability in implementation quality of the PAX GBG games, and thus we focused exclusively on dosage, which demonstrated greater variability across the sample (Domitrovich et al., Reference Domitrovich, Pas, Bradshaw, Becker, Keperling, Embry and Ialongo2015, Reference Domitrovich, Bradshaw, Berg, Pas, Becker, Musci and Ialongo2016). Additional variables, such as teacher and student buy-in, may also be of value to explore.
Another limitation is the noncausal nature of this study. For example, teachers are not randomly assigned students with varying levels of behavior problems. Thus, we cannot draw causal conclusions regarding the directionality of associations between student aggressive and disruptive behavior and teacher implementation dosage or the extent to which it is mediated by factors such as perceived burden, stress, motivation, or limited time to implement the program. Finally, there are limitations that stem from the two-intervention design of the study. Random assignment to condition occurred at the school level, and there were two conditions (PAX GBG only vs. integrated PATHS to PAX). Although sensitivity analyses suggested no differences in the implementation dosage outcome across these two conditions (p = .81), the lack of power to contrast these findings at a school level represents an additional potential limitation. Owing to the design of the study, data on PATHS implementation were only available for the integrated condition, and as such, there were too few classrooms to estimate classroom-level effects on PATHS implementation dosage.
Conclusions and implications
Taken together, the findings provide compelling evidence that baseline student aggressive/disruptive behaviors may serve as a potential barrier, rather than a motivator, for implementation of the PAX GBG. Although students in classrooms with higher average aggression may have the most potential to benefit from preventive interventions like the PAX GBG, it appears that high dosage was not achieved. It is possible that additional coaching support, beyond what was already provided, was needed for teachers working in these settings (e.g., more regular, daily check-ins). As such, these classroom-level contextual factors may serve as an important tailoring variable to consider in future implementations of this and other behavior management programs.
Financial Support
The research reported here was supported by the Institute of Education Sciences, US Department of Education, through Grant R305A080326 (PI: N. Ialongo) to the Johns Hopkins University and R305A130701 (PI: C. Bradshaw) to the University of Virginia. The opinions expressed are those of the authors and do not represent views of the Institute or the US Department of Education.
Conflicts of Interest
There are no conflicts of interest to declare.
Ethical Standards
The institutional review boards at the researchers’ institutions approved this study, and the procedures are consistent with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all adult participants in the study.
Acknowledgment
The authors would like to thank Celene Domitrovich for her contributions to the project.