# Contextual Effects in Teaching and Learning Research

Steffen Zitzmann

*Teaching and learning research attempts to identify connections between the determinants of successful learning and learning success. Learning processes can be optimized based on these findings. The data collected for this purpose often have a multi-level structure, in which pupils are nested in school classes. Contextual features (e.g. quality of teaching) play an important role among the determinants of successful learning.*

A popular method of capturing the effect of a context feature is to first aggregate student variables to a higher level of analysis. For example, students rate the quality of teaching on several items, and their ratings are averaged across both the items and the students in a class. The class average is then included as an additional predictor in the multi-level analysis for predicting learning success.
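The aggregation step described above can be sketched in a few lines of Python. This is only an illustration: the toy ratings, the class labels, and the function name `class_means` are hypothetical and do not come from any study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (class_id, student_id, item ratings of teaching quality)
ratings = [
    ("A", 1, [4, 3, 4]),
    ("A", 2, [2, 3, 3]),
    ("B", 3, [1, 2, 1]),
    ("B", 4, [2, 2, 3]),
]

def class_means(rows):
    """Average over items within each student, then over students within each class."""
    per_class = defaultdict(list)
    for class_id, _student_id, items in rows:
        per_class[class_id].append(mean(items))  # student-level score
    return {c: mean(scores) for c, scores in per_class.items()}

# One value per class; this is the aggregated predictor at the class level.
context = class_means(ratings)
print(context)
```

Each class ends up with a single value that would enter the multi-level model as a class-level predictor.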

The Department of Educational Measurement at the IPN is dedicated to the question of how to achieve a precise estimation of the influence of contextual characteristics on learning success (or other school outcomes).

In my dissertation, I have shown that the described aggregation generally results in an unreliable measure. When this measure is used as a predictor, it can lead to a strongly distorted estimate (bias) of the effect of the context characteristic on a dependent variable (e.g. learning success). To illustrate the bias problem, one can imagine a game of darts in which a player tries to hit the center of the dartboard (the true effect in the population), but the darts systematically miss the target by a wide margin. The individual throws correspond to the results of individual studies.
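To get a feeling for how unreliable a class average can be, one can use the standard Spearman-Brown-type formula for the reliability of a group mean. The sketch below is an assumption added for illustration, not the dissertation's own derivation. It considers only the sampling error that arises because few students rate each class, and it assumes equal class sizes. The function name `class_mean_reliability` is hypothetical; `icc` stands for the intraclass correlation and `n` for the number of students per class.

```python
def class_mean_reliability(icc, n):
    """Reliability of a class mean based on n students (equal class sizes assumed).

    Standard group-mean reliability: n * ICC / (1 + (n - 1) * ICC).
    Only sampling error from the limited number of students is considered.
    """
    return n * icc / (1 + (n - 1) * icc)

# Small ICC with few students per class versus the same ICC with larger classes
small_classes = class_mean_reliability(0.05, 5)
large_classes = class_mean_reliability(0.05, 25)
print(round(small_classes, 3), round(large_classes, 3))
```

With an intraclass correlation of .05 and five students per class, the formula gives a reliability of roughly 0.21, so most of the variation in the class averages would be noise under these assumptions.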

One way of solving the bias problem is to use a multi-level structural equation model (ML-SEM). These models are very popular in teaching and learning research. However, it is frequently pointed out that they place comparatively high demands on the data and that the context effect can be estimated inaccurately if these requirements are not met. If we stick with the image of the dart player, this means that the darts hit "the middle" on average (no bias), but individual darts can still miss the target by a lot. In other words, the outcome of a single study may lead to wrong conclusions about the direction or strength of the effect of the lesson characteristic on the learning outcome.

In my dissertation, I have developed an approach that is based on a suitable weighting of the (estimated) variance of the context characteristic in order to obtain a more accurate estimate. This approach has been combined with various methods for estimating ML-SEM. The resulting procedures are a compromise between bias and accuracy, where the darts do not miss the target by quite as much. For those who want to know more, the methods include multi-step methods such as factor score regression with regularized variance, a Bayesian method with an appropriate prior distribution for the variance, and the Maximum Likelihood Method with a suitable lower limit for the variance.

The limits of the Maximum Likelihood Method for estimating ML-SEM were explored in several simulations under conditions that occur in research practice but are challenging for the traditional method. Its performance was then compared with that of the newly developed approach.

A simulation is an evaluation study in which one or more methods are compared on the basis of a large number (e.g. 1000) of artificial data sets. Because the data-generating model (population model) is known, the results can be directly compared with the true effect in the population. A typical measure of estimation accuracy is the root mean squared error (RMSE), that is, the square root of the mean squared deviation of the results for the 1000 artificial data sets from the true effect in the population.
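The logic of such a simulation can be sketched as follows. This is a deliberately simplified illustration and not the simulation design of the dissertation: the data-generating model, the true effect of 0.5, the number of classes, the number of replications, and all function names are assumptions. Only the naive approach is evaluated here, in which the class-level outcome is regressed on the observed class average.

```python
import random
from math import sqrt
from statistics import mean

def bias_and_rmse(estimates, true_effect):
    """Bias is the mean error; RMSE is the root of the mean squared error."""
    errors = [est - true_effect for est in estimates]
    return mean(errors), sqrt(mean(e * e for e in errors))

def slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_naive_estimate(rng, beta, icc, n, n_classes):
    """Generate one artificial data set and return the naive estimate."""
    xbar, y = [], []
    for _ in range(n_classes):
        u = rng.gauss(0, sqrt(icc))  # true class-level characteristic
        ratings = [u + rng.gauss(0, sqrt(1 - icc)) for _ in range(n)]
        xbar.append(mean(ratings))  # unreliable class average
        y.append(beta * u + rng.gauss(0, 0.1))  # class-level outcome (assumed noise SD)
    return slope(xbar, y)

rng = random.Random(1)  # fixed seed so the sketch is reproducible
true_effect = 0.5       # assumed population value
estimates = [
    simulate_naive_estimate(rng, true_effect, icc=0.05, n=5, n_classes=100)
    for _ in range(200)
]
bias, rmse = bias_and_rmse(estimates, true_effect)
print(f"bias = {bias:.3f}, RMSE = {rmse:.3f}")
```

Because the true effect is set by the simulation itself, every estimate can be compared with it directly. Under these assumptions, the naive estimates should fall well below the true effect, which is the systematic miss described in the darts analogy.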

The greatest inaccuracy occurs under the most challenging condition (intraclass correlation ICC = .05, class size n = 5). Here the measure of estimation accuracy, the RMSE, of the Maximum Likelihood Method (ML) is twice as high as that of the approach developed to reduce it, which is represented here by a Bayesian method (Bayes) with an appropriate prior distribution for the variance. However, the extent of the reduction decreases with increasing class size or growing ICC. The main finding can thus be summarized as follows: The newly developed approach can lead to a substantial gain in accuracy (i.e., lower RMSE) in estimating the effect of the teaching feature on the learning outcome. For those who would like to know more: The gain in accuracy is similar for the remaining methods (factor score regression with regularized variance, Maximum Likelihood Method with a suitable lower limit for the variance).

*Two take-home messages for research practice can be derived from my dissertation.*

Firstly, an unreliable measure (the mean across items and students in a school class) can lead to a strongly distorted estimate of the effect of the context feature on a dependent variable (e.g. learning success). Therefore, a method that is less prone to bias (e.g. ML-SEM) should be used to estimate context effects.

Secondly, the use of an ML-SEM may be associated with a low estimation accuracy (i.e. high RMSE) if the Maximum Likelihood Method is used. To reduce the RMSE, an approach should therefore be used which is based on a suitable weighting of the (estimated) variance of the context characteristic.

And what is the practical significance of these findings? Such questions about methods of analysis need to be taken into account in teaching and learning research, especially when the results of individual studies are used to draw conclusions that lead to politically relevant decisions.