Randomised and quasi-experimental studies in the public sector: a mapping study
The Norwegian Institute of Public Health (FHI) was commissioned to coordinate a survey of how relevant government agencies use randomized controlled trials (RCTs) and quasi-experimental methods to strengthen the evidence base for public policy decisions. The mapping exercise aims to identify strengths and weaknesses of these evaluation methods, and barriers and facilitators to their use.
Most government agencies that contributed to this survey have carried out or commissioned randomized and quasi-experimental studies. Organizational, educational, and financial interventions are examples of measures that have been evaluated using such methods.
Close dialogue between implementing agencies, ministries and political leadership has facilitated better impact evaluations. There is great potential for sharing knowledge between sectors about how impact evaluations can be prepared and carried out. However, several agencies experience a political inclination toward large-scale rollout of interventions with uncertain effects, rather than gradual implementation that would allow effectiveness to be evaluated with randomized or quasi-experimental methods before scale-up.
Randomized and quasi-experimental studies face many legal, ethical, political and practical challenges, and they may be resource intensive. Several agencies point out the need for legislation that facilitates the conduct of impact evaluations, with clearer criteria for a) when differential treatment may be considered justifiable, b) when informed consent requirements can be waived, and c) managing privacy concerns related to data access.
Background and process
In the allocation letter for 2023, the Norwegian Institute of Public Health (FHI) was commissioned to coordinate a survey of how relevant government agencies use randomized controlled trials (RCTs) and quasi-experimental methods to strengthen the evidence base for public policy decisions. The mapping exercise aims to identify strengths and weaknesses of these evaluation methods, and barriers and facilitators to their use. The main contributors to this mapping were the Directorate for Children, Youth and Family Affairs (Bufdir), the Norwegian Labour and Welfare Administration (NAV), the Norwegian Agency for Development Cooperation (Norad), the Norwegian Tax Administration, the Directorate for Education and Training (Udir) and FHI.
The agencies that participated in this survey emphasized that RCTs and quasi-experimental studies have strengths and weaknesses that affect the extent to which the methods provide reliable conclusions. For RCTs, it is pointed out that random allocation to intervention or control group minimizes systematic differences between those who receive and those who do not receive the intervention. Thus, reliable conclusions can be drawn about whether observed differences between the groups can be attributed to the effect of the measure that is being studied. At the same time, it is emphasized that quasi-experimental methods can also provide valid effect estimates given certain assumptions, such as variation in the implementation of the intervention and access to detailed data and linking of different data sources.
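The role of these assumptions can be illustrated with a minimal difference-in-differences sketch, a standard quasi-experimental technique, on synthetic data (all numbers below are invented for illustration and do not come from the agencies' evaluations). When the treated and control groups differ at baseline but share a common trend over time, differencing out the baseline gap recovers the intervention effect that a naive post-period comparison would overstate:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000  # units per group (synthetic)

# Typical quasi-experimental situation: groups differ at baseline,
# but are assumed to share a common trend ("parallel trends").
baseline_control = rng.normal(50, 5, n)
baseline_treated = rng.normal(55, 5, n)   # pre-existing group difference
trend = 2.0                                # common change over time
true_effect = 3.0                          # effect of the intervention

post_control = baseline_control + trend + rng.normal(0, 1, n)
post_treated = baseline_treated + trend + true_effect + rng.normal(0, 1, n)

# A naive post-period comparison is biased by the baseline difference:
naive = post_treated.mean() - post_control.mean()   # ≈ 8, not 3

# Difference-in-differences removes the stable group difference,
# valid only under the parallel-trends assumption:
did = (post_treated.mean() - baseline_treated.mean()) \
    - (post_control.mean() - baseline_control.mean())  # ≈ 3
```

The sketch also shows why the assumptions listed above matter: if the groups did not share a common trend, the differencing step would no longer isolate the effect of the intervention.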
One of the main methodological challenges with RCTs is that the results are not necessarily transferable beyond the context in which the study was carried out. The agencies have also experienced that the effect of an intervention aimed at groups (e.g. schools or municipalities) may "leak" between the intervention and control groups, and thus produce misleading results. The primary methodological limitation of quasi-experimental studies is the risk of biased results due to differences between the groups that are unrelated to the intervention being studied.
Most of the agencies that contributed to the survey have experience conducting or commissioning randomized and quasi-experimental studies. Organizational, educational, and financial interventions, as well as interventions for public policy governance (e.g. measures to improve adherence to laws and regulations), have been evaluated using such methods. Over the last five years, the estimated number of such evaluations ranged from 1 to 5 studies for some agencies, and exceeded 15 for others. The agencies that had completed more than 15 were those that usually designed and carried out studies themselves.
The agencies report similar barriers to conducting RCTs and quasi-experimental studies, such as access to necessary data and political acceptance of gradual roll-out and differential treatment when introducing new measures. In some cases, there may be specific challenges associated with one of the methods. Implementing RCTs often involves legal, ethical, and practical challenges. A significant practical challenge, especially for many RCTs, is to include a large enough sample to ensure the necessary statistical power. Quasi-experimental studies are often more feasible, which highlights the importance of considering RCTs and quasi-experimental methods as complementary approaches.
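To illustrate why sample size is such a binding constraint, a textbook normal-approximation sample-size calculation (not drawn from the agencies' reports) shows how quickly the required number of participants grows as the expected standardized effect shrinks:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.8) -> int:
    """Approximate n per group for a two-sample comparison of means,
    using the standard normal-approximation formula.
    effect_size is the standardized mean difference (Cohen's d)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

# A 'small' effect (d = 0.2), common for public policy interventions,
# already requires roughly 400 participants per group:
print(sample_size_per_group(0.2))  # → 393
print(sample_size_per_group(0.5))  # → 63
```

Policy interventions often have modest effects on broad outcomes, which is why population-scale recruitment, and hence registry-based data access, is frequently a precondition for an informative RCT.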
For both methods, access to registry data and linkage of registries are key challenges. Well-conducted quasi-experimental studies often require detailed data and extensive linking of data sources, and many such studies never materialize because accessing these data is time-consuming and difficult. When surveys are used as a data source, achieving sufficient response rates is a common problem. Furthermore, the absence of comparable and precise data that can serve as outcome measures is also a challenge, especially if frequent and continuous measurements are lacking or if existing data sources are incomplete or not developed for research purposes.
Legal challenges include
- informed consent requirements, which can mean that population-based studies are impossible to carry out
- that studies where the participants are divided into two differentially treated groups may come into conflict with principles of equal treatment and legal provisions that protect basic rights and services
- sector-specific challenges that prevent funding or the practical implementation of impact evaluations
Ethical considerations include
- resistance to not offering the control group a measure that is expected to have positive effects
- that studies may entail disproportionate risk or burden for the participants
- objections to studies where the participants do not have the opportunity to withdraw their participation, either because they are not aware that they are participating in a study, or because it is not practically possible to withdraw
One and the same intervention is currently assessed differently depending on which legislation is applied, e.g. with regard to informed consent requirements. Well-defined legal and ethical frameworks are needed to provide guidance for weighing the societal benefit of a study against its potential risks, burdens, and privacy implications.
Political challenges include
- the decision-makers' desire to appear efficient and action-oriented, which leaves insufficient time and space to carry out thorough impact evaluations before a measure is implemented on a large scale
- the need for heightened political awareness about how systematic testing, using randomized or quasi-experimental studies, can be an important tool for reducing the uncertainty about the effects of a measure
- political processes, such as changes in government, budget settlements and political decisions, that can change the context in which the measure is evaluated and may lead to control groups being exposed to the measure under study
Planning and carrying out more and better impact evaluations can be done through close dialogue between agencies, ministries, and researchers. Enhanced exchange of information, along with promoting changes in attitudes, can lead to greater cross-party acceptance for a time-limited, systematic, and gradual – and if possible, randomized – rollout of reforms and measures.
Cross-party agreement can also facilitate longer-term funding, which is necessary for major research initiatives that can generate knowledge about public policy interventions and programs. The survey findings highlight positive examples, such as the LÆREEFFEKT program funded by the Norwegian Research Council and the Competencies Program led by the Directorate for Higher Education and Skills, where resources for systematic evaluation have been prioritized.
Especially for RCTs, other challenges of a practical, economic, and psychological nature also apply, e.g.
- that implementation in complex organizations and services can be practically challenging for the staff involved in service delivery, and may affect day-to-day operations
- that conducting the studies can be expensive
- scepticism among service providers and in the general population towards research studies that involve differential treatment between those who receive and those who do not receive the measure
The cost of carrying out randomized and quasi-experimental studies should be assessed against the savings achieved by not implementing measures that have uncertain or undesirable effects. The participation of those who are affected by the study, such as teaching staff, health care workers, case managers or the general population, is necessary to ensure acceptance for time-limited differential treatment to gain more and better knowledge about the effect of interventions. Appropriate user involvement supported by implementation studies can help ensure that studies do not entail a disproportionate burden on the services affected. Use of available data infrastructure, e.g. national registers, can make studies more administratively and practically feasible.
To limit the use of resources on studies from which reliable conclusions cannot be drawn, it is important that the planning phase includes a review of the existing evidence base, and emphasis on sound scientific standards. This includes development and pre-registration of protocols with pre-specified analyses, in line with international guidelines. Furthermore, it is important that investments are made in the development of data sources and that access to and linkage across data sources is made possible.
The expertise needed to carry out impact evaluations varies across agencies, but views align on two areas. First, higher competency in research methodology is required, including the design of impact studies. Second, more legal expertise is needed to understand the current regulations, clarify the legal basis for studies, and to make use of the room to maneuver within the existing legal framework.
Our mapping exercise has identified several examples of how close dialogue between agencies, ministries and the political leadership has facilitated better impact evaluations. There is potential for improved sharing of knowledge between sectors about how impact evaluations can be prepared and carried out. However, several agencies experience a political inclination toward large-scale rollout of interventions with uncertain effects rather than gradual introduction, which could allow for evaluating effectiveness with randomized or quasi-experimental methods before scale-up. Randomized and quasi-experimental studies face many legal, ethical, political, and practical challenges, and they can be resource-intensive. For many interventions, quasi-experimental evaluation is more feasible than conducting an RCT. To achieve credible results with quasi-experimental methods, several prerequisites must be met, including variation in the implementation of the interventions, access to detailed data and linkages across data sources, and sufficient methodological expertise. Several agencies point out the need for legislation that facilitates the conduct of impact evaluations, with clearer criteria for a) when differential treatment may be considered justifiable, b) when informed consent requirements can be waived, and c) managing privacy concerns related to data access.
A greater political acceptance for systematic, gradual and – if possible – randomized roll-out of reforms would facilitate learning about the effects of interventions. This could lead to more effective and targeted policy making and increase the chances that implemented measures lead to the desired results, and that ineffective measures are discontinued.