
Friday, August 23, 2019

Uncovering Causality III: Regression Discontinuity Designs

One of the nice things about running a blog is that you get to see which of your posts struck a chord and got more views than you’d anticipated. One of the less nice things is that you’re also able to see which posts went down like a lead balloon. Perhaps unsurprisingly, into this latter category fall my Uncovering Causality posts, where I try to explain various statistical techniques in a way that’s easy to understand. Who would have thought Econometrics would be so unappealing?!

Not to be deterred, I’ve decided to continue my Uncovering Causality series because a) I think making statistics accessible is important and b) the nerd in me enjoys explaining statistics in a way that's intuitive. So, without further ado, here is a post on Regression Discontinuity Designs (RDDs) – one of the cleverest and most interesting statistical techniques you can come across.

The Selection Problem


Before understanding RDDs, we need to understand the selection problem they seek to address. Suppose we wish to understand the average ‘treatment effect’ of going to hospital – do hospital visits make us better or worse? One way we could do this is to compare the health status of people who have recently visited hospital vs. those who haven’t. We can rank health status on a scale of 0-9, where 9 = perfect health and 0 = death’s door. Suppose we observe the following:


In words, the group of people that hasn’t been to hospital is healthier than the group that has. From this, can we conclude that hospitals are harmful to our health?

Thankfully, the answer is no. Almost by definition, people who visit hospital are less healthy than those who don’t, so it is unfair to compare these groups ex post. In an ideal world, we would want to compare the health status of the treated group (4.4) to what it would have been had they not been treated. Unfortunately, we cannot observe this ‘counterfactual’ – so we have to resort to clever statistical techniques to tease out what it might have been.
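A toy simulation makes the point concrete. All the numbers below are made up: I assume hospital visits genuinely *improve* health by one point, yet because sicker people are more likely to visit, the naive comparison still suggests hospitals are harmful.

```python
import numpy as np

rng = np.random.default_rng(0)

# Latent health on the 0-9 scale used above (illustrative numbers only).
health = rng.uniform(0, 9, 10_000)

# Selection: the sicker you are, the more likely you are to visit hospital.
visits = rng.random(10_000) < 1 / (1 + np.exp(health - 4))

# Assume visits genuinely IMPROVE health by 1 point (capped at 9).
observed = np.where(visits, np.minimum(health + 1, 9), health)

# The naive ex-post comparison of group means...
naive_effect = observed[visits].mean() - observed[~visits].mean()
print(f"naive 'treatment effect': {naive_effect:.2f}")
```

Despite a true effect of +1, the printed ‘effect’ comes out strongly negative – the selection problem in two lines of arithmetic.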

Overcoming the Selection Problem


The best way to do this is to ensure that treatment is randomised. Randomised Controlled Trials offer one option: by randomly deciding who gets treatment, we ensure that both treatment and control groups are as similar as possible ex ante, allowing them to be reliably compared ex post. However, for practical and ethical reasons, RCTs are not always viable (the ethical issues with randomising hospital treatment hardly need stating). Fortunately, there are other ways to generate randomness outside of an experimental setting.

Regression Discontinuity Designs


In October, thousands of 10-year-old kids will line up outside my old school to take the 11+: a gruelling 3-hour exam based on non-verbal and verbal reasoning. If they score above a certain mark, they are admitted to grammar school; if not, they go to comprehensive school[1] (for simplicity, assume there are no private schools in this world). To estimate the treatment effect of selective education, we might plot children’s 11+ mark against the number of UCAS points[2] they achieve upon completing secondary school.


From this set-up, how can we calculate the treatment effect of going to grammar school? It would be unfair to simply compare the UCAS scores of kids who got into grammar school vs. those who didn’t. After all, kids who get into grammar school at age 11 are likely more intelligent and affluent to begin with – a classic selection problem.

However, what about the kids who just missed out vs. the kids who just about made it? Presumably these kids are pretty similar. Whether they got in or not may have depended on whether the right questions came up, whether they slept well the night before or whether they guessed correctly on a number of multiple-choice questions. These factors are largely random, enabling us to identify a ‘treatment effect’ of selective education – at least for the kids that scored near the cut-off.[3]
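The standard way to turn this idea into an estimate is to fit a regression line on each side of the cut-off, using only observations within some bandwidth, and measure the jump between the two lines *at* the cut-off. Here is a minimal sketch with invented numbers: a pass mark of 60 and a true grammar-school effect of +20 UCAS points, both pure assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical 11+ marks (the 'running variable') and a pass mark of 60.
mark = rng.uniform(0, 100, n)
cutoff = 60
grammar = mark >= cutoff

# UCAS points rise smoothly with ability (proxied by the mark), plus a
# true +20-point grammar-school effect and some noise -- all made up.
ucas = 50 + 1.5 * mark + 20 * grammar + rng.normal(0, 10, n)

# Local linear regression: fit a straight line on each side of the
# cut-off within a bandwidth, then compare predictions AT the cut-off.
bw = 10
left = (mark >= cutoff - bw) & (mark < cutoff)
right = (mark >= cutoff) & (mark < cutoff + bw)

fit_left = np.polyfit(mark[left], ucas[left], 1)
fit_right = np.polyfit(mark[right], ucas[right], 1)

rdd_estimate = np.polyval(fit_right, cutoff) - np.polyval(fit_left, cutoff)
print(f"estimated treatment effect at the cutoff: {rdd_estimate:.1f}")
```

The estimate recovers something close to the true +20, and – crucially – it only uses kids near the cut-off, which is why the result is a *local* average treatment effect.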


Early Childhood Development


Early childhood development matters – and assessing the effect of postnatal interventions is vital in informing effective health policy. However, in doing this, it is unfair to compare outcomes of babies that received postnatal treatment vs. those that didn’t – this is the classic healthcare selection problem outlined at the start.

However, what if there were an arbitrary cut-off that determined whether or not they got treated? For example, in Norway and Chile, new-borns below a 1500g cut-off are given additional respiratory and surfactant treatment, while those above it are not. Using the assumption that a 1490g baby is virtually identical to a 1510g baby, Bharadwaj et al. exploit this cut-off to estimate the ‘treatment’ effect of additional postnatal care. The results are displayed below: in both Chile and Norway, babies just below the 1500g cut-off perform significantly better in Maths exams that take place almost a decade later.


Gaming the System


The biggest risk to RDDs is people gaming the system. The more influence people can exert over which side of the cut-off they fall, the less we can assume that treatment is random. While mothers cannot meaningfully control their child's birthweight to ensure it falls below a certain cut-off, they might nonetheless be able to control which birthweight is recorded. Doctors in some hospitals may be more easily persuaded to record a false measurement so that the baby gets extra care. If this happens, kids recorded just below the cut-off are no longer born in a random sub-section of hospitals – compromising our ability to infer a reliable treatment effect.
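Researchers test for exactly this kind of manipulation by checking whether the distribution of the running variable is smooth at the cut-off (the idea behind the McCrary density test): if people are gaming the system, observations ‘bunch’ on the favourable side. Here is a crude sketch of that logic, with an entirely made-up world in which some babies just above 1500g get recorded just below it.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Smooth underlying birthweight distribution (illustrative numbers).
true_weight = rng.normal(3000, 700, n)

# Manipulation: suppose 30% of babies truly weighing 1500-1520g are
# recorded just below the 1500g cut-off (a made-up manipulation rate).
recorded = true_weight.copy()
shifted = (true_weight >= 1500) & (true_weight < 1520) & (rng.random(n) < 0.3)
recorded[shifted] = rng.uniform(1480, 1500, shifted.sum())

# Density check in the spirit of the McCrary test: compare counts in
# equal-width bins just below vs. just above the cut-off. Under a smooth
# distribution these should be roughly equal.
below = ((recorded >= 1480) & (recorded < 1500)).sum()
above = ((recorded >= 1500) & (recorded < 1520)).sum()
print(f"just below cut-off: {below}, just above: {above}")
```

The excess mass just below 1500g is the statistical fingerprint of gaming – and finding it would cast doubt on any treatment effect estimated from this cut-off.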

Despite this flaw, RDDs provide a clever way to untangle causality outside of an expensive experimental context. Their surge in popularity has no doubt contributed to the recent ‘credibility revolution’ in empirical economics. Valid RDDs are not easy to come by, however – like instrumental variables, identifying a good ‘cut-off’ requires imagination and ingenuity. This is partly why I like them: they force the researcher to shut the Econometrics textbook and think creatively about the world around them.  

Word count: 993 words

References

  • Bharadwaj et al. (2013), ‘Early Life Health Interventions and Academic Achievement’

For an example of a really creative RDD, check out this paper by Melissa Dell about the long-run impact of forced mining systems in Peru and Bolivia: https://scholar.harvard.edu/files/dell/files/ecta8121_0.pdf


[1] For my non-British readers: grammar schools are selective state schools (i.e. you have to pass a test to get in); comprehensive schools are non-selective state schools
[2] Again, for my non-British readers: UCAS points are what our grades are converted to when we apply to University, sort of like a GPA
[3] In the jargon, this is known as a Local Average Treatment Effect, since the effect is only estimated on a certain sub-section of the population i.e. those who scored near the cut-off
