What exactly does the term “quasi-experimental” mean?
Quasi-experiments can be thought of as the real-world version of experiments. In true experiments (randomized controlled trials), participants are randomly assigned to one of two groups: one group receives a specific treatment while the other (the control group) does not. Because assignment is random, we can usually assume that the groups are equivalent to one another, on average, before treatment happens. That is, the groups are not significantly different from one another in characteristics such as demographics or prior educational experiences, and they share roughly the same average on any pre-treatment measure of the outcome. For example, if we were interested in the impact of virtual exchange program participation on global perspective-taking, then we could assume that before one group participates in virtual exchange, the two groups start at roughly the same place on whatever measure of global perspective-taking we use in our study.
True experiments are rare in the real world, however, and virtual exchange programs are designed, implemented, and evaluated in the real world. We usually aren’t able to randomly assign participants to one group that takes part in virtual exchange and another that does not, so we can’t assume that participants and non-participants are “starting in the same place” before the program begins. In this situation, researchers can use one of several quasi-experimental techniques to approximate the conditions of a true experiment.
One common quasi-experimental design option is to use demographic, academic, and other available information about participants to weight data based on the likelihood that an individual will participate in virtual exchange (often referred to as propensity score weighting or inverse probability weighting in the research literature). Weighting statistically corrects for differences between treatment and control groups in situations where randomization isn’t possible. These weights are only as good as the individual characteristics used to create them, and they can’t account for information that isn’t available. For example, if we do not have a pre-treatment measure of participants’ global perspective-taking, then we can’t account for it in an analysis. This approach is useful in virtual exchange research because researchers often work with data that they did not collect themselves, such as institutional or classroom records. In that situation, researchers do the best they can with data that were collected for purposes other than virtual exchange research.
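The weighting idea described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the records are invented, `prior_travel` is a hypothetical background characteristic, and the likelihood of participating is estimated simply as the share of participants within each `prior_travel` group. Applied studies typically estimate that likelihood with a model such as logistic regression over many characteristics.

```python
from collections import defaultdict


def make_records():
    """Build hypothetical (prior_travel, participated, score) records.

    Students with prior travel are more likely to participate AND score
    higher to begin with, so the raw comparison is misleading.
    The built-in participation effect is 0.5 points.
    """
    records = []
    for prior_travel, n_treated, n_control in [(1, 6, 2), (0, 2, 6)]:
        base = 3.0 + 1.0 * prior_travel
        records += [(prior_travel, 1, base + 0.5)] * n_treated
        records += [(prior_travel, 0, base)] * n_control
    return records


def propensity_by_stratum(records):
    """Estimate the participation likelihood as the share treated per stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [treated, total]
    for stratum, participated, _ in records:
        counts[stratum][0] += participated
        counts[stratum][1] += 1
    return {s: treated / total for s, (treated, total) in counts.items()}


def weighted_mean(pairs):
    """Mean of the values in (weight, value) pairs, weighted by weight."""
    total_weight = sum(w for w, _ in pairs)
    return sum(w * y for w, y in pairs) / total_weight


def naive_difference(records):
    """Unweighted gap between participant and non-participant means."""
    treated = [(1.0, y) for _, t, y in records if t == 1]
    control = [(1.0, y) for _, t, y in records if t == 0]
    return weighted_mean(treated) - weighted_mean(control)


def ipw_difference(records):
    """Gap after inverse probability weighting.

    Participants get weight 1/p and non-participants 1/(1 - p),
    where p is the estimated likelihood of participating.
    """
    p = propensity_by_stratum(records)
    treated = [(1 / p[s], y) for s, t, y in records if t == 1]
    control = [(1 / (1 - p[s]), y) for s, t, y in records if t == 0]
    return weighted_mean(treated) - weighted_mean(control)


records = make_records()
print("naive gap:", naive_difference(records))
print("weighted gap:", ipw_difference(records))
```

In this invented data set the unweighted gap is about 1.0, double the built-in effect, while the weighted gap is about 0.5. The correction works here only because `prior_travel`, the one characteristic driving the imbalance, is available to build the weights.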
A second quasi-experimental design option is to collect a pre-treatment measure of the study’s outcome of interest in addition to the post-treatment measure, creating what researchers refer to as a pre-post design. For example, if we are interested in virtual exchange’s impact on global perspective-taking, then we could measure this outcome for all participants both before and after the treatment group participates in virtual exchange. If we have a pre-treatment measure of our outcome, then we can include it directly in statistical analyses (such as in a regression model) to account for the fact that virtual exchange participants likely start from a different place on this measure than non-participants. In this design, we explore whether virtual exchange affects changes in our outcome (e.g., global perspective-taking) rather than just the post-treatment measure. This approach is useful when researchers are collecting their own data and have access to research participants before virtual exchange takes place for the treatment group. It is, however, much more time-consuming and organizationally complex than the weighting approach described previously.
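As a rough illustration of including a pre-treatment measure in a regression model, the sketch below fits post-treatment score on a participation indicator plus the pre-treatment score. The scores are invented, the function names are hypothetical, and the closed-form calculation is just one way to obtain the least-squares participation coefficient when there is a single pre-treatment covariate. A real analysis would normally use a statistics package and report uncertainty as well.

```python
def mean(values):
    return sum(values) / len(values)


def make_scores():
    """Return hypothetical (pre, post) score pairs for each group.

    Participants start about one point higher on the pre-treatment
    measure; the built-in participation effect is 0.3 points.
    """
    treated_pre = [3.0, 3.5, 4.0, 4.5]
    control_pre = [2.0, 2.5, 3.0, 3.5]
    treated = [(pre, 1.0 + 0.8 * pre + 0.3) for pre in treated_pre]
    control = [(pre, 1.0 + 0.8 * pre) for pre in control_pre]
    return treated, control


def naive_gap(treated, control):
    """Difference in average post-treatment scores, ignoring baselines."""
    return mean([post for _, post in treated]) - mean([post for _, post in control])


def adjusted_gap(treated, control):
    """Participation coefficient from regressing post on participation and pre.

    With one covariate, the least-squares slope on the pre score is the
    pooled within-group slope, and the participation coefficient is the
    post-score gap minus that slope times the pre-score gap.
    """
    sxy = sxx = 0.0
    for group in (treated, control):
        pre_bar = mean([pre for pre, _ in group])
        post_bar = mean([post for _, post in group])
        sxy += sum((pre - pre_bar) * (post - post_bar) for pre, post in group)
        sxx += sum((pre - pre_bar) ** 2 for pre, _ in group)
    slope = sxy / sxx
    pre_gap = mean([pre for pre, _ in treated]) - mean([pre for pre, _ in control])
    return naive_gap(treated, control) - slope * pre_gap


treated, control = make_scores()
print("post-only gap:", naive_gap(treated, control))
print("gap adjusted for pre score:", adjusted_gap(treated, control))
```

With these invented scores, comparing post-treatment averages alone gives a gap of about 1.1, most of which reflects where the groups started. Adjusting for the pre-treatment score brings the estimate back to roughly the built-in 0.3.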
Why do some studies show a positive impact of virtual exchange and others do not?
The two Stevens Initiative reports mentioned above, both using quasi-experimental design methods, find what seem to be divergent results. The first, the 2022 Virtual Exchange Impact and Learning Report, finds that virtual exchange positively impacts measures such as students’ knowledge of the other, perspective taking, cross-cultural collaboration, self-other overlap, and warm feelings. The second, the Stevens Initiative-sponsored Assessing Access and Outcomes in Community College International Virtual Exchange, finds no significant impact of virtual exchange on students’ self-efficacy, global perspective-taking, or cultural humility. Why do these two studies present such divergent results?
The first reason may be obvious: the outcomes explored in these two studies are different. It is certainly possible that virtual exchange has an impact on cross-cultural collaboration while not moving the needle at all on participants’ self-efficacy.
A second explanation may have to do with the programmatic characteristics of the virtual exchanges themselves – the implementation of the treatment. As educators who work with virtual exchange programs know well, these programs can vary widely in terms of how long they last, how much contact participants have across cultural contexts, and the depth of their collaborative experiences. Differences we see in research results may have to do with the nature of the programs themselves.
A third reason might have to do with who participates in virtual exchange. Virtual exchange might produce strongly positive effects for postsecondary students but not for elementary school students, for example.
And finally, results may differ because of the way the data were analyzed. For example, a study that uses the weighting approach described earlier may not be able to account for important information that helps explain how treatment and control groups differ from one another before virtual exchange begins, simply because this information is not available. A study that can account for students’ pre-treatment scores, by contrast, is in a much better position to claim that its analysis accounts for baseline differences between treatment and control participants.