Teachers and school personnel struggle to meet the behavioral, social, and emotional needs of young children and students (hereafter referred to as students) who exhibit problem behaviors characteristic of or leading to a later diagnosis of emotional/behavioral disorders (EBDs) in schools (Gilmour et al., 2022). While evidence-based programs exist that teachers and other school personnel can deliver to support students’ behavioral, social, and emotional development, a “voltage drop” (Chambers et al., 2013) in implementation fidelity is expected as interventions move through the translational pipeline (Bradshaw et al., 2012). That is, as interventions move from tightly controlled efficacy studies into sustained use by teachers, this “voltage drop” may appear both as a reduction in fidelity to the treatment protocol and as teacher adaptations intended to improve the contextual fit of the intervention to a particular classroom or school (McLeod et al., 2022). While expected, reductions in treatment fidelity are also associated with poorer student outcomes (Durlak & DuPre, 2008), highlighting the need for researchers to better understand other factors that might explain how interventions affect student outcomes. One understudied dimension of treatment fidelity is student responsiveness to the intervention as it is delivered. In this paper, we first define the role of student responsiveness within intervention effectiveness via transactional processes and discuss why student responsiveness is essential to consider and measure in studies that include students who exhibit problem behaviors characteristic of EBDs. We then situate student responsiveness within treatment fidelity models to set the stage for the current study.

Transactional Processes, Student Responsiveness, and Students With and At Risk for EBD

Sameroff and Mackenzie (2003) highlight the importance of transactional processes for children’s development and learning, noting that an individual’s developmental and learning processes are shaped by that individual’s interactions with their learning context or environment. Thus, these researchers point out that focusing only on the characteristics of (in our case) the student may not provide a full understanding of their learning and development, noting that “the development of the child is a product of the continuous dynamic interactions of the child and the experience provided by his or her family and social context” (p. 614). The notion that student and teacher behavior reciprocally influence each other across time is essential to applying the transactional model to classroom-based intervention research (Sutherland & Oswald, 2005), and these reciprocal interactions ultimately shape student learning and development.

To illustrate, given the problem behaviors and learning difficulties characteristic of students with and at risk for EBD, Sutherland and Oswald (2005) encouraged researchers to use transactional theory to better understand interactions between teachers and students and how these interactions relate to subsequent student outcomes. These researchers highlighted that transactional processes between teacher and student behavior might impede the exposure of students with and at risk for EBD to evidence-based instruction, or might be leveraged to support students’ learning and development. Sutherland et al. (2022) further note that student responsiveness to classroom-based interventions that target students’ behavioral, social, or emotional outcomes should be assessed, since behaviors consistent with EBDs (e.g., off-task behavior, disruptive problem behaviors, defiance) may interfere with intervention attempts. For example, if a teacher attempts to provide corrective feedback to a student (i.e., explaining the error while also providing information about the appropriate behavior to help the student respond correctly in the future) and the student ignores the teacher’s attempts, the student is unlikely to benefit from this practice to the degree they would if they were responsive. Thus, within the transactional model, student responsiveness may be essential to assess in order to understand whether this construct influences intervention effectiveness.

Student Responsiveness as a Dimension of Treatment Fidelity

Conceptual models of treatment fidelity (also referred to as treatment integrity or implementation fidelity) typically describe treatment fidelity as multidimensional (Berkel et al., 2011; Sanetti et al., 2021; Sutherland et al., 2022). Components of treatment fidelity most frequently assessed in the school-based literature include adherence (i.e., the extent to which an intervention was delivered as designed), quality/competence (i.e., how well an intervention was delivered), and dosage (i.e., how much of an intervention was delivered).

Dane and Schneider (1998) were among the first to define participant responsiveness, describing it as “levels of participation and enthusiasm” (p. 45). Depending upon the modality of the intervention, Berkel et al. (2011) defined participant responsiveness via several indicators, including the number of intervention sessions attended, active participation in the intervention, satisfaction with the intervention, and home practice completion. Goncy et al. (2015) defined student responsiveness via two observational items, engagement and following rules. More recently, Sutherland and colleagues (2022) described student responsiveness as “how responsive a student(s) is to an interventionist’s (e.g., teacher or mental health professional) attempt to deliver a practice or intervention” (p. 2). Although student engagement is one of the behaviors comprising responsiveness, the conceptualization of responsiveness in this study extends beyond a student’s engagement in an activity to include the student’s level of responsiveness to a teacher’s attempts to deliver an intervention directly to the student.

While some researchers have collected student responsiveness data within a treatment fidelity framework (e.g., Combs et al., 2022; Goncy et al., 2015), student responsiveness is understudied in the education intervention literature: only 14% of studies published in four school psychology journals between 2009 and 2016 reported student responsiveness data (Sanetti et al., 2020). In the existing research, student responsiveness has typically been examined as a dependent variable. For example, Goncy et al. (2015) examined student responsiveness to teachers’ delivery of the classroom meetings of the Olweus Bullying Prevention Program (Olweus & Limber, 2007) and found that teachers’ competence in delivering lesson content was significantly related to students’ responsiveness during classroom meetings.

A few studies have also shown that students’ responsiveness to interventions is associated with better student outcomes (Low et al., 2014; Ringwalt et al., 2009). For example, Humphrey et al. (2018) conducted a factor analysis of observational fidelity data during a randomized trial of the Promoting Alternative Thinking Strategies (PATHS) curriculum, identifying two dimensions: (a) quality and participant responsiveness and (b) procedural fidelity. Interestingly, the quality and participant responsiveness factor was associated with significantly lower ratings of students’ externalizing behavior at 12-month follow-up, while procedural fidelity was not. Similarly, Low et al. (2014) found that higher levels of student engagement with the Steps to Respect bullying prevention program were related to fewer bullying problems, improved school climate, and attitudes less supportive of bullying behavior. In a large-scale study of the Botvin LifeSkills Training program, Combs et al. (2022) used teacher reports to examine the influence of several classroom factors on dimensions of treatment fidelity, including student responsiveness; three classroom factors were inversely related to student responsiveness: problem behavior, modifications to the intervention, and a shortage of time to deliver lessons. Together, this research suggests that student responsiveness may be an essential dimension of treatment fidelity, one that could represent a change mechanism helping to explain treatment effects.

Current Study

The current study seeks to understand the role student responsiveness plays in interventions delivered to students who exhibit problem behaviors that place them at risk for EBD. Multi-tiered systems of support, including Positive Behavior Interventions and Supports (PBIS; Sugai et al., 2000), typically adopt a response-to-intervention approach whereby schools attempt to differentiate between students who have a disability and those whose learning problems may instead stem from a lack of exposure to evidence-based practices (Fuchs et al., 2004). Tier 1 includes students receiving instruction within general education classrooms; Tier 2 supports are provided for students with more significant behavioral and learning needs (i.e., students who are not responsive to instructional approaches provided at Tier 1); and Tier 3 supports tend to be highly individualized and more intensive, for students who are not responsive to Tier 2 supports. Due to the relational nature of teacher-delivered interventions that target students’ social, emotional, or behavioral outcomes, it is important to assess student responsiveness to Tier 2 and Tier 3 interventions, as the behaviors of these students (e.g., off-task behavior, disruptive problem behaviors, defiance) may interfere with teachers’ attempts to engage them.

While student responsiveness has been examined within treatment fidelity models, our understanding of student responsiveness as a construct remains elusive, due in part to definitional inconsistencies in the literature. Thus, we conceptualize student responsiveness to teachers’ attempts to deliver an intervention as a critical component for understanding the efficacy of interventions that target students’ behavioral, social, or emotional outcomes, particularly at Tiers 2 and 3. We combine samples from four randomized controlled trials to examine the relationship between student responsiveness to teacher attempts to deliver BEST in CLASS, a Tier 2 intervention, and student outcomes. An essential aspect of BEST in CLASS is increasing the amount and quality of teacher interactions with focal students; thus, student responsiveness to teachers’ attempts to deliver the BEST in CLASS practices provides an opportunity to investigate student responsiveness as a potential change mechanism for student outcomes. We hypothesized that teacher delivery of an evidence-based program, BEST in CLASS (adherence and competence), would be indirectly related to decreases in student problem behavior via an influence on student responsiveness. Although our model was tested at a single timepoint (i.e., post-test) and therefore does not support causal inference, our investigation reflects a novel contribution to the study of treatment fidelity, particularly the relationship between student responsiveness and student outcomes.

Method

Sample

The present study includes teacher and student participants from four federally funded research studies testing an intervention designed to address the behavioral, social, or emotional needs of young students who demonstrate persistent and intensive challenging behaviors in classroom settings (Conroy et al., 2022a; Sutherland et al., 2018a, 2020a, b), including an ongoing multi-site randomized controlled trial examining the efficacy of BEST in CLASS. The first study was a large 4-year randomized controlled trial of BEST in CLASS with children at risk for EBD in early childhood classrooms. Across the 4-year study, teacher and child participants were recruited from early childhood programs in a Mid-Atlantic and a Southern state. The second study was a 1-year randomized controlled trial in which teacher and child participants were recruited from early childhood programs in Mid-Atlantic and Southern states; this trial examined BEST in CLASS in both in-person and web-based conditions in early childhood classrooms. Both of these studies were conducted in early childhood programs serving children from income-eligible families (e.g., Head Start), with over 96% of the programs either federally or state-funded. The third study was a 1-year randomized controlled trial in which teacher and student participants were recruited from elementary schools in a Mid-Atlantic state; this trial tested BEST in CLASS, adapted from the early childhood intervention, to support elementary school teachers’ use of evidence-based practices with students with and at risk for EBD. The final study included teachers and students from a Mid-Atlantic and a Southern state recruited to participate in a larger randomized controlled trial of the elementary version of BEST in CLASS. All studies used the same student screening procedures, overlapped in the evidence-based practices on which teachers were trained (detailed below), and collected teacher reports and observational data during the same time frames each year. In addition, the annual timelines for each study were similar (i.e., child screening 1 month after school began, pretest data collection, random assignment to condition, teacher training, practice-based coaching, post-test assessments), and the principal investigators conducted all intervention and staff training for each study. The associated human participants’ protection boards approved all study activities.

Teachers

In all four studies, teachers were eligible to participate if they (a) taught in early childhood or Kindergarten to third-grade classrooms, (b) served at least one child identified as being at risk of EBD, and (c) consented to participate. The present study includes 156 teachers who predominantly (98%) identified as female (n = 111 early childhood teachers; n = 45 elementary school teachers). Among the early childhood teachers, 43% self-identified as African American/Black, 46% as White, 3% as Asian/Pacific Islander, 5% as Hispanic/Latino, and 2% as another race; 1% did not report race/ethnicity. The early childhood teachers ranged in age from 18 to 25 (6%) to over 55 (14%), with others between the ages of 26 and 35 (32%), 36 and 45 (25%), and 46 and 55 (18%); 5% preferred not to report their age. The early childhood teachers had an average of 11.62 years of teaching experience (SD = 9.44; range = 0–43; teachers reported years of teaching experience not counting the current year). In terms of education, 1% held a high school diploma, 31% an associate’s degree, 39% a bachelor’s degree, 25% a master’s degree, and 1% a doctoral degree; 4% reported other educational levels. Eleven elementary teachers taught in Kindergarten, 16 in 1st-grade, 9 in 2nd-grade, and 9 in 3rd-grade classrooms. These teachers ranged in age from 18 to 25 (18%) to over 55 (6%), with others between the ages of 26 and 35 (38%), 36 and 45 (18%), and 46 and 55 (20%). They self-identified as 25% African American/Black, 67% White, 4% Asian/Pacific Islander, and 4% another race; 7% also self-identified as Hispanic/Latino. Most elementary school teachers held a bachelor’s degree (44%) or a master’s degree (53%). Finally, the elementary school teachers had an average of 9.85 years of teaching experience (SD = 9.76; range = 0–38; again not counting the current year). Teachers were given $100–$400 for their participation (the amount varied by study).

Students

In all four studies, teachers selected up to three focal students in their classrooms who exhibited externalizing problem behavior. Eligible students (a) were enrolled in a participating teacher’s classroom, (b) exhibited problem behaviors that interfered with participation in the classroom as indicated by systematic screening, and (c) had parent/guardian consent to participate. The study sample included 355 students, of whom 65% were African American/Black, 20% White, 6% other ethnicities, and 5% Hispanic/Latino; 4% did not report race/ethnicity. Most participating students were male (64%). One student was dropped from analyses because they had missing data on all study variables.

Student Screening

Across studies, screening began approximately 1 month after the start of school. Teachers nominated up to five students who engaged in problem behavior, and caregiver consent was obtained. Systematic screening for the risk of EBDs was conducted using the Early Screening Project (ESP; in early childhood classrooms; Feil et al., 1995) and the Systematic Screening for Behavior Disorders (SSBD; in Kindergarten to 3rd-grade classrooms; Walker et al., 2014). The ESP and SSBD are both multi-gate screening systems designed to identify students at risk of adverse developmental outcomes associated with their behavior patterns. The first two stages (used given the scope of the intervention) combine teacher ratings of the frequency and intensity of student adjustment problems in the classroom. The risk assessment involved computing raw scores and applying risk criteria to those scores (see Feil et al. (1995) for ESP scoring criteria and Walker et al. (2014) for the SSBD). Students were screened for critical events and for aggressive, adaptive, and maladaptive behavior; see Table 1 for descriptive data on these four scales for each of the four studies. After the screening, up to three students per classroom were selected to participate, and study measures were collected at time point 1 in October–December and again at time point 2 in April–June.

Table 1 Descriptive data for child/student SSBD scores across the present study samples

Measures

Student Problem Behavior

Student problem behavior was assessed with the Social Skills Improvement System-Rating Scales (SSIS-RS; Gresham & Elliott, 2008). The SSIS-RS is a 76-item teacher-report measure evaluating young students’ social skills and problem behaviors. The Problem Behaviors scale consists of five subscales: externalizing, bullying, hyperactivity/inattention, internalizing, and autism spectrum. Teachers rate items on a 4-point Likert scale (0 = never to 3 = almost always), indicating how frequently students exhibit behaviors; higher scores indicate more problem behavior. Cronbach’s alphas were comparable across studies, ranging from 0.89 to 0.95 for Problem Behavior at pretest and post-test (the early childhood BEST in CLASS Web project did not report alphas due to its small sample size; Conroy et al., 2022a). The present study used standardized scores (standardized by child age and gender) and integrated the data across all four studies.

Treatment Fidelity

The thoroughness, frequency, and quality of teacher delivery of practices to the focal student (or a group including the focal student) were measured in early childhood classrooms with the BEST in CLASS Adherence and Competence Scale (BiCACS; Sutherland et al., 2014) and in elementary school classrooms with the Treatment Integrity Instrument for Elementary School Classrooms (TIES; Sutherland et al., 2017). The BiCACS and TIES are observational measures in which raters assess the extensiveness (i.e., adherence) and quality (i.e., competence) of teachers’ delivery of evidence-based practices using a 7-point Likert-type scale. Of particular importance for the current study, the TIES is an updated version of the BiCACS with new items added; the data collection procedures (e.g., scale, anchors, item definitions) were the same for both measures. The thoroughness and frequency of teacher delivery of evidence-based practices were measured with the adherence dimension of the BiCACS and TIES, whose anchors range from not at all to very extensive. The quality of teacher delivery was measured with the competence dimension, whose anchors range from very poor to excellent (see Sutherland et al. (2014) for a detailed description of the full measure). Coders are trained to begin scoring competence from an anchor of “4” (i.e., the practice delivery was “ok”) and to adjust upward or downward based on quality indicators (e.g., timing of delivery, developmental appropriateness of language, teacher affect). Of note, the adherence and competence scales share the same items within each measure, with coders assigning adherence and competence scores for each item. The items correspond with the BEST in CLASS-Pre-K practices (i.e., Rules, Precorrection, Opportunities to Respond, Behavior Specific Praise, Corrective Feedback, Instructive Feedback) on the BiCACS and the BEST in CLASS-Elementary practices (i.e., Supportive Relationships, Rules, Precorrection, Opportunities to Respond, Praise) on the TIES. Reliability was assessed using secondary observers for approximately 20% of observations, with ICCs computed for each item on each scale. Across the four studies, the mean ICC for the adherence scale ranged from 0.63 to 0.82, and the mean ICC for the competence scale ranged from 0.49 to 0.61 (see Table 2 for the ICCs by study). Because the BiCACS and TIES both include ratings of the BEST in CLASS practices and use the same anchors for each subscale, data from these observational measures were integrated across the four studies.

Table 2 ICCs for observational data across the present study samples
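For readers who want to reproduce comparable reliability estimates, the sketch below computes an item-level ICC from double-coded observations using the pingouin package in Python. The data and the choice of ICC form are assumptions (the form used in the original studies is not stated here); ICC2 (two-way random effects, absolute agreement, single rater) is shown for illustration.

```python
import pandas as pd
import pingouin as pg

# Hypothetical double-coded observations: each observation was scored by a
# primary and a secondary coder on the same fidelity item (7-point scale).
df = pd.DataFrame({
    "observation": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "coder":       ["primary", "secondary"] * 5,
    "adherence":   [5, 5, 3, 4, 6, 6, 2, 3, 7, 6],
})

# pingouin reports all standard ICC forms (ICC1-ICC3 and their average-rater
# versions); ICC2 below is the two-way random-effects, absolute-agreement,
# single-rater coefficient -- an assumption, since the form is not specified.
icc = pg.intraclass_corr(
    data=df, targets="observation", raters="coder", ratings="adherence"
)
print(icc.set_index("Type").loc["ICC2", ["ICC", "CI95%"]])
```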

Student Responsiveness

Student responsiveness was assessed as part of the BiCACS (Sutherland et al., 2014) and the TIES (Sutherland et al., 2017). Student responsiveness is defined as the extent to which the focal student responds to the teacher’s attempts to engage the student (e.g., in a lesson, activity, or conversation). Characteristics of focal student responsiveness include participating in conversations with the teacher, participating in a group discussion or activity, physically orienting toward the teacher while the teacher is leading an activity or lesson and delivering the intervention, responding to feedback or questions from the teacher in a positive manner, demonstrating enthusiasm in responding to teacher requests, elaborating on or asking about points made by the teacher, and demonstrating understanding via correct responses and/or compliance. We operationalized student responsiveness in terms of observable behaviors to mitigate developmental differences that might exist in student responsiveness across preschool and early elementary settings. Across all four studies, observers recorded the extent to which the student’s behavior indicated responsiveness to the teacher’s attempts to engage the student when delivering the intervention. Responsiveness was measured on a 7-point scale, with 7 reflecting a student who demonstrated characteristics of responsiveness every time (or almost every time) the teacher attempted to engage the student and 1 reflecting a student who was not at all responsive to the teacher’s engagement attempts. Across the four studies, the mean ICC for responsiveness ranged from 0.62 to 0.70 (see Table 2 for the ICCs by study). Data for student responsiveness were integrated across all four studies. Notably, student responsiveness was not assessed in the first 2 years of the BEST in CLASS-Pre-K efficacy study (Sutherland et al., 2018a), and the COVID-19 pandemic and ensuing school closures meant that student responsiveness data were not collected in year 2 of the BEST in CLASS-Elementary RCT.

BEST in CLASS

BEST in CLASS is a Tier 2 intervention designed to improve the interactions and relationships between teachers and students with or at risk for EBD in order to reduce problem behaviors (Sutherland et al., 2020a, b). Teachers receive a 1-day workshop, a teacher resource manual, and practice-based coaching (14–16 weeks) to increase the quantity and quality of their delivery of evidence-informed practices with focal students. In the early childhood version, teachers are trained and coached to use six practices (i.e., Rules, Precorrection, Opportunities to Respond, Behavior Specific Praise, Instructive Feedback, and Corrective Feedback). In the elementary model, teachers are trained and coached to use five practices (i.e., Supportive Relationships, Rules, Precorrection, Opportunities to Respond, and Praise); teachers are also trained in the BEST in CLASS Home-School Partnership model, and coaches support teachers’ efforts in delivering practices and partnering with caregivers. Through the practice-based coaching component, coaches meet with teachers weekly to provide supportive and constructive feedback and to share data on the teacher’s use of BEST in CLASS practices with focal students in their classroom.

Data Analyses

To examine whether student responsiveness indirectly influenced the relation between the treatment fidelity dimensions and student problem behavior, we performed a series of regression analyses testing direct and indirect models using Mplus version 8.0 (Muthén & Muthén, 2007). For the direct models, we evaluated parameter estimates against an alpha of 0.05. To evaluate indirect effect estimates, we calculated 95% confidence intervals (CIs) using 10,000 bias-corrected bootstrap samples, which accounts for the nonnormality and inflated Type I error rates that can result when multiplying two parameter estimates (MacKinnon et al., 2004). We considered an indirect effect statistically supported at the 0.05 level when the 95% CI did not contain zero. In all models, we regressed adherence and competence on teacher education level and grade (coded as 0 = early childhood; 1 = Kindergarten; 2 = 1st grade; 3 = 2nd grade; 4 = 3rd grade), and we regressed student responsiveness and student problem behavior on student gender and race/ethnicity. These covariates were included because each is known to influence teaching practices and because they control for potential differences between samples. Additionally, we controlled for the pretest assessment of each variable in predicting its respective post-test value.
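The study models were estimated in Mplus; to make the bootstrap procedure concrete, the following Python sketch illustrates a product-of-coefficients indirect effect with a bias-corrected bootstrap CI (cf. MacKinnon et al., 2004). It is a minimal illustration under simplifying assumptions (single mediator, no covariates or pretest controls, no clustering, complete data), and all variable names are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def indirect_effect(x, m, y):
    """Product-of-coefficients indirect effect: a (x -> m) times b (m -> y | x)."""
    Xa = np.column_stack([np.ones_like(x), x])
    a = np.linalg.lstsq(Xa, m, rcond=None)[0][1]      # slope of x in the mediator model
    Xb = np.column_stack([np.ones_like(x), m, x])
    b = np.linalg.lstsq(Xb, y, rcond=None)[0][1]      # slope of m in the outcome model
    return a * b

def bc_bootstrap_ci(x, m, y, n_boot=10_000, alpha=0.05, seed=1):
    """Bias-corrected percentile bootstrap CI for the indirect effect."""
    rng = np.random.default_rng(seed)
    est = indirect_effect(x, m, y)
    n = len(x)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                   # resample cases with replacement
        boots[i] = indirect_effect(x[idx], m[idx], y[idx])
    z0 = norm.ppf(np.mean(boots < est))               # bias-correction constant
    lo = norm.cdf(2 * z0 + norm.ppf(alpha / 2))       # adjusted lower percentile
    hi = norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2))   # adjusted upper percentile
    return est, np.quantile(boots, [lo, hi])

# Hypothetical use: competence -> responsiveness -> problem behavior.
rng = np.random.default_rng(0)
competence = rng.normal(size=355)
responsiveness = 0.5 * competence + rng.normal(size=355)
problem_behavior = -2.0 * responsiveness + rng.normal(size=355)
est, ci = bc_bootstrap_ci(competence, responsiveness, problem_behavior)
print(f"indirect = {est:.2f}, 95% BC CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```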

It is important to note that the study data were two-level, with students nested within teachers/classrooms. Mplus does not currently allow bootstrapping of two-level models with indirect effects; thus, as a sensitivity test, we also estimated the models using linear regressions with cluster (teacher)-robust standard errors (Huber, 1967; White, 1980). This specification accounts for the nonindependence of students by adjusting the standard error estimates for the clustering of students within teachers (Asparouhov & Muthén, 2006). Without accounting for this nonindependence, the estimated standard errors would be underestimated, resulting in a greater chance of committing a Type I error. Study results using this correction were consistent with the original models. As an additional sensitivity test, we estimated the study models within a multilevel structural equation modeling (MLSEM) framework, which models the clustering directly. Study results did not change substantively: at the within level, mediation held for both adherence and competence; at the between level, mediation was not statistically significant.
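The cluster-robust sensitivity specification can be approximated outside Mplus as well; for illustration, the sketch below fits one outcome model in Python with statsmodels using the same Huber/White cluster adjustment. The data and variable names are simulated and hypothetical, not the study’s actual specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per student, with a teacher id
# identifying the cluster each student belongs to.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "problem_behavior": rng.normal(size=120),
    "responsiveness":   rng.normal(size=120),
    "competence":       rng.normal(size=120),
    "teacher_id":       np.repeat(np.arange(30), 4),  # 30 teachers, 4 students each
})

# Outcome model with cluster (teacher)-robust standard errors: point estimates
# match ordinary OLS, but the standard errors are adjusted for the
# nonindependence of students within teachers.
fit = smf.ols("problem_behavior ~ responsiveness + competence", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["teacher_id"]}
)
print(fit.summary())
```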

Results

First, we conducted preliminary analyses to examine the descriptive statistics, skewness, and kurtosis of all study variables. An absolute skew value larger than 2 or an absolute kurtosis value larger than 7 may indicate nonnormality (Kim, 2013). All variables fell within these bounds and therefore did not require transformation (see Table 3).

Table 3 Descriptives for all study variables
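A minimal sketch of this screening rule is shown below. Note that scipy’s default kurtosis is excess kurtosis (normal = 0); whether the cutoff of 7 applies to raw or excess kurtosis should be confirmed against Kim (2013).

```python
import numpy as np
from scipy.stats import skew, kurtosis

def flag_nonnormal(x, skew_cut=2.0, kurt_cut=7.0):
    """Flag a variable whose |skew| > 2 or |kurtosis| > 7 (cf. Kim, 2013)."""
    s = skew(x)
    k = kurtosis(x)  # Fisher's definition by default (excess kurtosis, normal = 0)
    return (abs(s) > skew_cut) or (abs(k) > kurt_cut), (s, k)

# Example: a roughly normal variable should not be flagged.
rng = np.random.default_rng(42)
flagged, (s, k) = flag_nonnormal(rng.normal(size=355))
print(f"flagged={flagged}, skew={s:.2f}, kurtosis={k:.2f}")
```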

Missing data patterns revealed that at pretest, 4 students were missing data on problem behavior, 74 on student responsiveness, 13 on competence, and 8 on adherence. At post-test, 76 students were missing data on problem behavior, 151 on student responsiveness, and 91 on adherence and competence, partly due to school closures associated with the COVID-19 pandemic in year 2 of the elementary RCT. Independent samples t-tests and a one-way ANOVA showed that students with incomplete data did not differ from students with complete data on demographics (e.g., race/ethnicity, gender). We accounted for missing data using the full information maximum likelihood (FIML) estimator, which retains the statistical power of the full analytic sample while minimizing bias in parameter estimates (Enders, 2001).

We conducted several sets of regression analyses to examine whether student responsiveness influenced the relation between each dimension of treatment integrity and student problem behavior. Our first model tested the direct paths between each dimension of treatment integrity and student problem behavior. We then tested the paths between each dimension of treatment integrity and student responsiveness. Finally, we tested the indirect model, in which we specified student responsiveness as an intervening variable between each dimension of treatment integrity and student problem behavior. Results from these regression analyses are presented in Table 4.

Table 4 Model results

The effect of adherence on student problem behavior was not significant (B = −0.12, p = 0.86), and the effect of competence on student problem behavior approached significance (B = −1.70, p = 0.06). The effect of adherence on student responsiveness was significant (B = 0.13, p < 0.05), as was the effect of competence on student responsiveness (B = 0.47, p < 0.001). Finally, the effect of student responsiveness on student problem behavior was significant (B = −2.43, p < 0.05).
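As an informal arithmetic check, multiplying each path to the mediator by the mediator-to-outcome path approximates the bootstrapped indirect estimates reported next (the small discrepancies reflect rounding of the displayed coefficients):

\[
\hat{a}_{\text{competence}} \times \hat{b} = 0.47 \times (-2.43) \approx -1.14,
\qquad
\hat{a}_{\text{adherence}} \times \hat{b} = 0.13 \times (-2.43) \approx -0.32 .
\]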

Bootstrapping analysis indicated that competence had an indirect effect on student problem behavior through student responsiveness (indirect estimate = −1.16, p < 0.05; 95% CI [−2.19, −0.38]). Bootstrapping analysis also indicated that adherence had an indirect effect on student problem behavior through student responsiveness (indirect estimate = −0.33, p = 0.13; 95% CI [−0.80, −0.07]) (Table 5).

Table 5 Indirect effects

Discussion

This study aimed to examine how teachers’ fidelity to implementing a Tier 2 intervention indirectly influenced students’ treatment outcomes through student responsiveness, using data from four randomized controlled trials of BEST in CLASS. We characterized intervention delivery via two dimensions of treatment fidelity, adherence and competence, and assessed student outcomes via teacher reports of student problem behavior. Adherence, competence, and student responsiveness were all assessed via direct observations. Findings suggest that teacher adherence and competence in delivering BEST in CLASS practices were associated with reductions in problem behavior from pretest to post-test via student responsiveness. After discussing these findings, we consider the current study’s limitations and implications for future research.

Our hypothesis that teacher delivery of BEST in CLASS (adherence and competence) would be indirectly related to decreases in student problem behavior via an influence on student responsiveness was supported. In the current study, adherence was not directly related to reductions in student problem behavior, and the direct effect of teacher competence in delivering BEST in CLASS practices only approached significance; instead, the associations between these fidelity dimensions and reductions in problem behavior appear to operate via student responsiveness to teachers’ delivery of the intervention practices. This is not surprising, as adherence has been found to be associated with positive child outcomes (see Durlak & DuPre, 2008), and previous research on BEST in CLASS (Sutherland et al., 2018b) has found that teacher competence in delivery is associated with positive child treatment outcomes. Our findings thus highlight the potential importance of student responsiveness in intervention effectiveness, alongside the roles played by adherence and teacher competence in delivery.

While interpreting these findings, it is important to note that the confidence interval for adherence (−0.80 to −0.07) is close to including zero, so these results should be viewed with some caution. That said, it is also essential to understand how adherence and competence are measured by the BiCACS and TIES, as these procedures may have influenced the results. Competence of delivery (how well a practice is delivered) is only scored when an observer codes a practice as occurring (i.e., when adherence is noted). Thus, while the findings indicate that adherence alone was not associated with student outcomes, this does not mean that adherence is unimportant: because competence can only be assessed when adherence is coded, competence is dependent upon adherence. At the same time, many of the quality indicators of BEST in CLASS practices on which teachers are trained and coached, and which are reflected in how observers score competence, may be particularly conducive to eliciting student responsiveness.

To illustrate, when scoring competence for the practice element Supportive Relationships, observers are trained on quality indicators such as warmth in voice tone and affect; these indicators reflect a teacher being responsive to student characteristics or needs. In contrast, high-competence Opportunities to Respond are characterized by the skillfulness and timing of their delivery, including being developmentally appropriate and allowing adequate wait time. This approach to treatment fidelity measurement goes beyond simple dichotomous checklists that indicate whether a teacher delivered an intervention component (Goncy et al., 2015). Findings from the current study therefore suggest that how a teacher delivers intervention components may be as important as, if not more important than, whether they deliver them, and that how a teacher delivers the intervention may influence the responsiveness of the students receiving it.

How a teacher delivers intervention components may be especially important for young children and students with and at risk for EBD. As mentioned earlier, behaviors characteristic of students with EBD (e.g., off-task behavior, disruptive problem behaviors, defiance) often signal non-responsiveness to intervention attempts. If characteristics of teacher delivery are associated with increased student responsiveness, this would have significant implications for improving outcomes for these students: the more responsive students are to teachers’ attempts to engage them, the more effective intervention efforts are likely to be.

Limitations and Implications for Future Research

While the current study’s findings are intriguing, several limitations should be considered when interpreting the results. First, our model was tested at a single timepoint (i.e., post-test) and therefore does not support causal inference. Within both the BiCACS and TIES, student responsiveness was collected only at pretest and post-test, concurrent with the other study variables; this timing precluded examining the student responsiveness item as a mediator in the current study. Future work should strive to establish temporal ordering of the predictor, mediator, and outcome variables to investigate student responsiveness as a true mediator of treatment effects. Second, while the data integration used in the current study is novel, allowing us to examine our research question across several study samples, we were limited to examining variables present in each study. Third, our sample size was smaller than desired due to the lack of student responsiveness data in the first 2 years of the BEST in CLASS-Pre-K efficacy trial (Sutherland et al., 2018a). Fourth, because the COVID-19 pandemic prevented post-test data collection in year 2 of the elementary study, data were imputed for participants for whom we had pretest data. Finally, the mean ICC for the competence scale was low (ranging from 0.49 to 0.61 across studies). Inter-rater reliability for competence tends to be lower than that for adherence (see Carroll et al., 2000; Hogue et al., 2008), and while the ICC estimates for competence in the current study are consistent with previous studies (e.g., Barber et al., 1996; Hogue et al., 2008), findings should nonetheless be interpreted in light of this reliability.

Within multi-tiered systems of support such as PBIS, responsiveness to intervention is used to determine the appropriate level of support a student needs (McIntosh et al., 2009). At the same time, the field has struggled to identify what constitutes a student’s responsiveness to intervention (Lacks & Watson, 2018; Speece & Walker, 2007), particularly in relation to students’ social, emotional, or behavioral outcomes. Future work may consider observational assessments of student responsiveness, such as those used in the current study, to better understand the role of responsiveness both in determining student needs within PBIS models and in intervention effectiveness. In addition, those who prepare and support teachers may need to better understand how to build teacher capacity in both the use of practices (adherence) and the quality of practice delivery (competence) through practice-based coaching (i.e., Snyder et al., 2022), since these dimensions of treatment fidelity appear particularly important to both student responsiveness and intervention outcomes.

Conclusion

Given the challenges teachers face in providing effective instruction and intervention to students with and at risk for EBD, and these students’ poor outcomes relative to their peers both with and without disabilities, identifying mechanisms that teachers can leverage to improve both intervention delivery and student outcomes is a critical goal for our field. This study adds to a growing literature on the important roles of teacher adherence and competence within treatment fidelity models and of student responsiveness in intervention effects. Student responsiveness appears to be an understudied but potentially significant variable that researchers should consider when developing and evaluating interventions for students with and at risk for EBD.