Causality
For the most part, empirical Economics (or ‘Econometrics’)
is concerned with identifying causal relationships in the social world. We
often think of causation in terms of the possibility of manipulation. For
instance, X can be said to cause Y if, when we manipulate X, Y responds in the
predicted fashion.
In Economics, we propose causal relationships via
hypotheses, which are tested using data and standardised statistical
techniques. This deference to an objective scientific method is used to justify
Economics’ label as a social ‘science’. It is also used to distinguish
Economics from humanities subjects like History, which may speculate about causal
relationships via theory but make little attempt to validate them in a way that
is both objective and generalisable.
This is not to say that Economics is the same as Physics or
Chemistry. Of course, physicists and chemists share economists’ commitment to
making meaningful causal statements about the world. However, in dealing with
more fundamental and discernible laws of nature, natural scientists are able to
make near-infallible ‘if–then’ statements such as ‘if water reaches 0°C, then
it will turn to ice’. Moreover, since their objects of interest are atoms
rather than humans, scientists can conduct carefully controlled laboratory experiments
to make causal identification easier and more replicable.
In contrast, economists study a messy social world containing
a maelstrom of different actors, all of whom interact and react with one
another in chaotic and unpredictable ways. For every cause we propose to
explain an observed outcome there are myriad other potential causes pushing in
the same and opposite directions. Isolating the sole impact of our speculated
cause can be very difficult, which makes it a lot harder to make robust causal
statements.
Nonetheless, economists have a variety of tools to help them do
this. To understand these, a useful starting point is to distinguish between
correlation and causation.
Correlation ≠ Causation
Correlation ≠ causation. To see this, suppose I collect data
on textbooks per classroom and mean exam scores for 30 schools. The scatterplot
might look something like this:
[Scatterplot: textbooks per classroom vs. mean exam scores for 30 schools. Source: made-up figures]
Can we conclude from this that more textbooks cause higher exam scores? No – it may
simply be that schools with more textbooks also have lower student–teacher ratios
(STRs), better classrooms and more engaged teachers – all of which could also
improve exam performance. We need some way to control for these confounding
factors.
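To make the confounding story concrete, here is a minimal simulation (hypothetical numbers, using NumPy): a single confounder – school resources – drives both textbook counts and exam scores, so the two end up strongly correlated even though textbooks have no causal effect at all in this setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30  # schools

# Hypothetical confounder: school resources drive BOTH textbook counts
# and exam scores. Textbooks have zero direct effect on scores here.
resources = rng.normal(size=n)
textbooks = 50 + 10 * resources + rng.normal(scale=3, size=n)
scores = 60 + 8 * resources + rng.normal(scale=3, size=n)

# Despite no causal link, textbooks and scores are strongly correlated.
r = np.corrcoef(textbooks, scores)[0, 1]
print(f"correlation: {r:.2f}")
```

A regression of scores on textbooks alone would happily report a large ‘effect’ – the correlation is real, but the causal interpretation is not.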
Randomised Controlled Trials
Perhaps the simplest way to do this is to conduct a
Randomised Controlled Trial (RCT). The basic idea is to take a sample of
schools and randomly split them into a treatment and control group. We split
the groups randomly to help ensure they are ‘balanced’, meaning the treatment
and control group are as similar as possible in terms of STRs, desks, and any
other confounding variables we can observe. If the two groups are imbalanced,
we can always re-randomise until balance is achieved.
The treatment group is then ‘treated’ while the control
group is left untreated (or given a placebo). In this instance, a treatment may
be to provide schools with free textbooks for each child. After giving the
treatment time to take effect, we then examine children in both groups to test
for a ‘treatment effect’. The outcome of interest is the difference in exam
score between groups: since the two groups were as similar as possible ex ante,
we can assume any difference in exam performance post-treatment is caused by the treatment itself.
The diagram below may help conceptualise this:
[Diagram: random assignment of a sample into treatment and control groups. Source: Sydney Morning Herald, http://www.smh.com.au/]
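The logic of the comparison can also be sketched in a few lines of code. This is a hypothetical simulation, not real trial data: because assignment to treatment is random, the simple difference in mean exam scores between the two groups recovers the treatment effect we built in.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200  # schools in the sample

# Hypothetical data: baseline exam scores plus a true treatment effect of 5.
baseline = rng.normal(loc=60, scale=10, size=n)
treated = rng.random(n) < 0.5          # random assignment to treatment
scores = baseline + 5.0 * treated + rng.normal(scale=2, size=n)

# Random assignment balances the groups in expectation, so the simple
# difference in mean scores estimates the treatment effect.
estimate = scores[treated].mean() - scores[~treated].mean()
print(f"estimated treatment effect: {estimate:.1f}")
```

With only 30 schools the estimate would be noisier; the sample size here is inflated to make the point cleanly.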
This is an RCT in its simplest form. We can add twists to
this basic framework: for example, if we are worried about baseline imbalance,
we can examine both groups before the
intervention and then again at the end of the treatment window. We then test to
see if there is a ‘difference-in-difference’ between the two groups, which
effectively nets out any ex ante differences between the two.
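Continuing the hypothetical simulation, a difference-in-difference estimate can be computed directly: even when the treatment group starts from a higher baseline and both groups improve over time, differencing twice nets out both the baseline gap and the common trend.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Hypothetical baseline imbalance: the treatment group starts 3 points higher.
treated = rng.random(n) < 0.5
pre = 60 + 3 * treated + rng.normal(scale=5, size=n)

# Everyone gains 2 points over time (common trend); treated gain 5 more.
post = pre + 2 + 5 * treated + rng.normal(scale=2, size=n)

# Difference-in-differences: the change for treated minus the change for
# controls removes both the baseline gap and the common trend.
did = ((post[treated] - pre[treated]).mean()
       - (post[~treated] - pre[~treated]).mean())
print(f"diff-in-diff estimate: {did:.1f}")
```

A naive post-treatment comparison would overstate the effect (it would pick up the 3-point baseline gap); the double difference isolates the 5-point treatment effect.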
RCTs take their intellectual heritage from the field of medical
science. When testing a new drug, pharmaceutical companies compare the outcomes
of a treatment vs. a control group made to be as similar as possible ex ante.
While the first published RCT in medicine took place in 1948, it took until the
1990s for RCTs to become popular in Economics. In just 20 years, RCTs have gone
from being a novel econometric technique to being considered the ‘gold standard’
in conducting causal work in social science.
Strengths
There are a number of reasons for this. Most importantly, by
controlling for confounding factors at the outset, we get arguably the
‘cleanest’ identification of a causal relationship between X and Y. For me,
this is where Economics most lives up to its social science moniker. If we
conduct the same RCT in a variety of contexts and get the same result each
time, this, I think, is the closest we can get to objective scientific
knowledge about the social world. Owing to the simplicity of their methodology,
the results from RCTs are also easy to communicate, aiding effective
policymaking.
Limitations
However, RCTs are also expensive and time-consuming, and they often raise
a host of ethical questions. Also, as my friend Hannah comments on my first
post, there have been difficulties in scaling up the findings from RCTs to
meaningful policy change – an issue I discuss further in my next post. These points
are all important, but for me the most compelling limitation of RCTs is their
inability to answer the really big questions. For example, consider the claim
that education causes people to earn higher wages. We cannot, practically or
morally speaking, randomly allocate different amounts of education to children.
In the same way, we cannot uncover the effect of education on growth by
randomising education delivery across different countries. To this extent, it
is best to view RCTs as a powerful tool with a well-defined purpose, rather
than a ‘silver bullet’ which solves all the issues of causal identification.
References
To learn more about RCTs, a good place to start is ‘Running
Randomised Evaluations: A Practical Guide’ by Rachel Glennerster and Kudzai
Takavarasha.
In my opinion, there is a gap in the market for a book that
explains the principles of econometrics in an intuitive and engaging fashion
(the best I can think of is ‘Mostly Harmless Econometrics’, but even that gets
quite technical). If you know of any, please let me know.