Instrumental Variables
As is often the case, instrumental variables are best
explained by means of example. Suppose I hope to estimate the causal effect of
class sizes on student test scores. If I collect data on both, I may get a scatterplot that looks something like this:
What is the problem with inferring a causal relationship
from this result? As before, there may be ‘confounding’ variables that
influence both X and Y, obscuring the true relationship between the two. For
example, low class sizes may be correlated with school wealth (e.g. richer
schools can afford lower student-teacher ratios), which also affects test
scores through other channels (e.g. better textbooks and non-classroom facilities).
One way to avoid this problem is to identify a variable/instrument
(Z) which i) affects class size (X) and ii) is not correlated with anything else that affects test scores (Y). In
econometrics jargon, i) is known as the relevance requirement and ii) is known
as the exogeneity requirement. If both criteria are met, we can effectively
‘isolate’ the random variation in X and identify a true causal effect.
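To make the two requirements concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the data are simulated, the true effect of class size on scores is set to -0.5, and the names (wealth, z, class_size, score, slope) are hypothetical. It compares a naive regression of Y on X with the simple IV (Wald) estimator, which divides the effect of Z on Y by the effect of Z on X.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_effect = -0.5  # assumed causal effect of class size (X) on scores (Y)

wealth = rng.normal(size=n)  # unobserved confounder
z = rng.normal(size=n)       # instrument, drawn independently of wealth

# Relevance: class size moves with z. It also falls with school wealth.
class_size = 1.0 * z - 1.0 * wealth + rng.normal(size=n)

# Exogeneity: z does not appear in the score equation directly.
score = true_effect * class_size + 2.0 * wealth + rng.normal(size=n)


def slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


# Naive regression of Y on X: contaminated by the wealth confounder.
ols = slope(class_size, score)

# IV (Wald) estimator: reduced form (Z -> Y) divided by first stage (Z -> X).
iv = slope(z, score) / slope(z, class_size)

print(f"true effect: {true_effect:.2f}, OLS: {ols:.2f}, IV: {iv:.2f}")
```

In this setup the naive estimate overstates the harm of large classes, because wealthy schools have both small classes and high scores, while the IV estimate should land close to the assumed true effect.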
For example, in the context of class sizes and test scores,
a candidate instrument could be the occurrence of earthquakes. In places like
California, schools sometimes shut down due to infrastructure damage caused by
periodic earthquakes. While the schools are being repaired, affected students
are temporarily transferred to undamaged nearby schools, causing a sudden
increase in class sizes.
We could also argue that earthquakes do not affect test scores (Y) in any way
other than through their effect on class sizes (X). They do not change the school syllabus,
they do not change the timing of the exams and they do not change the
underlying ability of students. Assuming this argument is correct, we are left
with a nice natural experiment: we have a random, isolated increase in class
sizes (X) that allows us to identify its independent contribution to test
scores (Y).
The Exogeneity Requirement: An Achilles Heel
Unfortunately, plausible instruments are difficult to identify.
The relevance requirement is fairly easy to meet: it is not difficult to find variables
that are correlated with the treatment (e.g. class sizes). However, the
exogeneity requirement represents a perennial stumbling block. In my previous
example, the exogeneity requirement is almost certainly not met. We can
theorise many potential effects of earthquakes on test scores besides their impact
on class sizes: for instance, earthquakes can have localised income effects as
well as psychological effects on affected students, both of which affect test
results. Alas, academic stardom will have to wait.
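A short simulation sketch shows why this matters. The setup is entirely hypothetical: earthquake exposure is treated as a continuous intensity measure, the true effect of class size is assumed to be -0.5, and the function name iv_estimate and its parameters are made up for illustration. The only thing that changes between the two runs is whether earthquakes are allowed to affect scores directly.

```python
import numpy as np


def iv_estimate(direct_effect, n=100_000, true_effect=-0.5, seed=1):
    """Simple IV (Wald) estimate of the class-size effect on simulated data.

    direct_effect is the assumed effect of the earthquake on scores that
    does NOT run through class size; zero means exogeneity holds.
    """
    rng = np.random.default_rng(seed)
    wealth = rng.normal(size=n)  # unobserved confounder
    quake = rng.normal(size=n)   # instrument: earthquake intensity
    class_size = quake - wealth + rng.normal(size=n)
    score = (true_effect * class_size
             + 2.0 * wealth
             + direct_effect * quake  # the exogeneity violation, if nonzero
             + rng.normal(size=n))
    # Ratio of the Z-Y covariance to the Z-X covariance
    return np.cov(quake, score)[0, 1] / np.cov(quake, class_size)[0, 1]


valid = iv_estimate(direct_effect=0.0)      # exogeneity holds
violated = iv_estimate(direct_effect=-0.3)  # e.g. stress lowers scores

print(f"IV with valid instrument:    {valid:.2f}")
print(f"IV with violated exogeneity: {violated:.2f}")
```

With a first-stage coefficient of one, the direct effect is added straight onto the estimate, so the second figure should sit near -0.8 rather than -0.5. No amount of data fixes this: the bias comes from the instrument, not from sampling noise.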
To make matters worse, the standards for an acceptable IV
have generally become a lot more stringent. This is partly due to the emergence
of new causal identification techniques that have raised the bar when it comes
to credible identification strategies. The graph below gives a nice
illustration of these new econometric techniques, some of which I may discuss
in future posts.
Source: The Economist
As a result, some of the most iconic IVs appear unable to
bear the weight of contemporary scrutiny. For example, perhaps the most famous
IV in all of Economics is the use of settler (colonial) mortality rates to
isolate the causal effect of institutions on growth (Acemoglu, Johnson and
Robinson, 2001). The authors theorised that in places where settler mortality rates were
high (e.g. Africa, South America), settlers were more likely to set up
‘extractive’ institutions in which a small group of individuals exploited the
rest of the population. Conversely, in places where settler mortality rates
were low (e.g. North America), settlers were more likely to establish trade
relationships and ‘inclusive’ institutions, which included many people in the
process of governing.
The authors go on to argue that these initial conditions
greatly influenced the quality of economic and political arrangements today,
due to the ‘sticky’ nature of institutions. Hence, the relevance requirement is
met: settler mortality rates (Z) affect modern institutions (X). Then comes the
leap of faith: the authors claim that historic settler mortality rates (Z) have
not affected modern day growth (Y) except
via their effect on the institutions they brought about (X).
Needless to say, the validity of this instrument has
received a hammering in recent years. For instance, settler mortality rates (Z)
seem correlated with the historic disease environment, which is itself correlated
with modern day growth (Y). These criticisms appear unanswerable: I remember a senior
faculty member at Oxford once telling me that the IV would probably not get
published today. Still, the above example serves as a testament to human
creativity and to the rapid advancement of scientific standards.
All Hope is Not Lost
With this being said, instrumental variables are not
completely defunct. One of the best IVs I have seen concerns estimating the
effect of incarceration length (X) on subsequent employment outcomes (Y)
(Kling, 2006). The immediate difficulty in estimating this effect is that
people who get longer prison sentences are likely to be different from those
who get shorter sentences, in ways that matter for future labour earnings.
To get around this, we can use the fact that judges are
randomly assigned to cases, and that some judges appear systematically harsher
at sentencing than others. This appears to meet the conditions of a valid
instrument: the particular judge (Z) a defendant gets certainly affects his/her
likely incarceration length (X), but is unlikely to be correlated with any other
factor which affects his/her future earnings (Y).
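The logic of the judge design can be sketched with simulated data. This is not Kling's actual specification: the numbers, the variable names and the assumed true effect of -1.0 are all invented for illustration. The instrument used here is a ‘leave-one-out’ measure of harshness, i.e. the average sentence the same judge handed down in all of his/her other cases, which is one common way of turning judge identity into a single number.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cases, n_judges = 50_000, 50
true_effect = -1.0  # assumed effect of one extra month in prison on earnings

judge = rng.integers(0, n_judges, size=n_cases)  # random assignment to judges
harshness = rng.normal(size=n_judges)            # each judge's tendency
risk = rng.normal(size=n_cases)                  # unobserved defendant trait

# Sentences depend on the judge drawn AND on the unobserved trait.
sentence = 12 + 2.0 * harshness[judge] + 3.0 * risk + rng.normal(size=n_cases)

# Earnings depend on the sentence and, separately, on the same trait.
earnings = 100 + true_effect * sentence - 5.0 * risk + rng.normal(size=n_cases)

# Leave-one-out instrument: mean sentence in the judge's OTHER cases.
judge_sum = np.bincount(judge, weights=sentence, minlength=n_judges)
judge_cnt = np.bincount(judge, minlength=n_judges)
z = (judge_sum[judge] - sentence) / (judge_cnt[judge] - 1)

# Naive regression slope versus the simple IV (Wald) ratio.
ols = np.cov(sentence, earnings)[0, 1] / np.var(sentence, ddof=1)
iv = np.cov(z, earnings)[0, 1] / np.cov(z, sentence)[0, 1]

print(f"true effect: {true_effect:.2f}, OLS: {ols:.2f}, IV: {iv:.2f}")
```

Leaving out the defendant's own case stops his/her own unobserved traits from leaking into the instrument. In this simulation the naive estimate roughly doubles the apparent damage of a longer sentence, while the IV estimate should come out close to the assumed -1.0.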
This, I think, represents a plausible identification
strategy with many avenues for future research (see McKenzie's blog post in the
references). Though standards have risen, instrumental variables are certainly not obsolete. But
remember: IV regression is only as good as the instrument itself!
References
- Acemoglu, Johnson & Robinson (2001): ‘The Colonial Origins of Comparative Development: An Empirical Investigation’
- Kling (2006): ‘Incarceration Length, Employment, and Earnings’
- McKenzie has a nice recent blog post on the promise of judge leniency IV-designs: https://blogs.worldbank.org/impactevaluations/judge-leniency-iv-designs-now-not-just-crime-studies?cid=SHR_BlogSiteShare_XX_EXT