Understanding Causal Effects
Review and Summary of a 5-part Series
In this article, I review and summarize key points from a 5-part series on causal effects. While only a high-level overview is given here, further details and references can be found in the articles linked below. Additionally, each post has an associated video walkthrough on YouTube.
1) Introduction
What is a causal effect? Causal effects quantify the impact of a treatment by comparing outcomes under different treatment levels. The term is synonymous with treatment effect, and the two are used interchangeably throughout this series. Put another way, causal effects go beyond questions like “What caused Y?” and aim to answer questions like “How much would a change in X impact Y?”
🔗 Article Link | YouTube Video
Background
When thinking about questions of cause and effect, it is helpful to distinguish 3 types of variables: outcomes, treatments, and covariates.
- Outcome: the variable we are ultimately interested in, e.g. headache status
- Treatment: the variable we change in order to influence the outcome, e.g. pill dosage
- Covariate: any other measured variable, e.g. age, weight, height, etc.
Potential Outcomes Framework
- This is an approach to estimating causal effects.
- The framework is built upon counterfactual (i.e. “what if?”) questions. For example, what if I didn’t take a Tylenol? Would my headache still have gone away?
- While we can never observe both scenarios (i.e. took the pill and didn’t take the pill), these what-if questions play a central role in helping us formulate different types of causal effects.
3 Types of Causal Effects
- Individual Treatment Effects (ITE) — quantifies the impact of treatment for a particular individual.
- Average Treatment Effects (ATE) — estimates the expected impact of treatment for a population.
- Average Treatment Effect for the Treated (ATT) — estimates the expected treatment effect among those who actually received the treatment.
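To make the three definitions concrete, here is a minimal simulation sketch. Everything in it is hypothetical (the headache-severity numbers, the effect size, and the rule for who takes the pill), and it relies on something real data never gives us: both potential outcomes for every person.

```python
import random

random.seed(0)

# Hypothetical simulation: each person has two potential outcomes,
# y0 (headache severity without the pill) and y1 (severity with the pill).
people = []
for _ in range(10_000):
    y0 = random.gauss(5.0, 1.0)                  # outcome if untreated
    effect = -0.4 * y0 + random.gauss(0.0, 0.2)  # worse headaches benefit more
    y1 = y0 + effect                             # outcome if treated
    # Assumption: people with worse headaches are more likely to take the pill.
    treated = random.random() < (0.8 if y0 > 5.0 else 0.2)
    people.append((y0, y1, treated))

# ITE: one number per individual (only computable inside a simulation).
ite = [y1 - y0 for y0, y1, _ in people]

# ATE: average of the ITEs over the whole population.
ate = sum(ite) / len(ite)

# ATT: average of the ITEs over those who actually received treatment.
ite_treated = [y1 - y0 for y0, y1, t in people if t]
att = sum(ite_treated) / len(ite_treated)

# Naive comparison of observed group means, biased because the treated
# group started out with worse headaches.
mean_treated = sum(y1 for _, y1, t in people if t) / len(ite_treated)
controls = [y0 for y0, _, t in people if not t]
naive = mean_treated - sum(controls) / len(controls)

print(f"ATE={ate:.2f}  ATT={att:.2f}  naive difference={naive:.2f}")
```

In this setup the ATT is larger in magnitude than the ATE because the people who chose treatment are the ones who benefit most, and the naive difference in group means matches neither.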
2) Propensity Score-based Methods
Typically, the best way to estimate causal effects is through interventional studies that employ randomization. However, sometimes such studies are not feasible. In such cases, we may want to estimate causal effects from observational data. One way we can do this is by using propensity scores.
🔗 Article Link | YouTube Video | Example Code
Observational vs Interventional Data
- Observational study — Passively measuring a system of interest without any intervention in the data-generating process
- Interventional study — Intentionally influencing the data-generating process for a particular goal, e.g. Randomized Controlled Trials (RCTs).
- Key point: When using observational data, we must look out for systematic differences between treatment and control groups. Randomization, a tool central to interventional studies, helps combat this issue.
What are propensity scores?
- A propensity score estimates the probability that a subject receives treatment based on its observed characteristics (covariates)
- We can use propensity score-based methods to remove bias due to systematic differences between sub-populations based on observed covariates
- A common way to estimate propensity scores is via logistic regression
3 Propensity Score-based approaches
- 1:1 Matching — create treated-control pairs with similar propensity scores
- Stratification — split subjects into groups with similar propensity scores
- Inverse probability of treatment weighting (IPTW) — use propensity scores to derive weights for each subject which can be used to compute treatment effects directly
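The sketch below illustrates the IPTW approach on simulated data. The data-generating process, the variable names, and the hand-rolled one-covariate logistic regression are all assumptions made to keep the example self-contained. In practice, a library implementation such as scikit-learn's LogisticRegression would typically be used to fit the propensity model.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical observational data: older subjects are more likely to be
# treated, and age also raises the outcome, so age confounds the comparison.
n = 4000
age = [random.gauss(0.0, 1.0) for _ in range(n)]  # standardized covariate
treat = [1 if random.random() < sigmoid(a) else 0 for a in age]
y = [1.0 * t + 1.0 * a + random.gauss(0.0, 1.0)   # true effect = 1.0
     for t, a in zip(treat, age)]

# Fit P(treat = 1 | age) with logistic regression via gradient ascent
# on the mean log-likelihood.
b0, b1 = 0.0, 0.0
for _ in range(200):
    g0 = g1 = 0.0
    for a, t in zip(age, treat):
        err = t - sigmoid(b0 + b1 * a)
        g0 += err
        g1 += err * a
    b0 += g0 / n
    b1 += g1 / n

propensity = [sigmoid(b0 + b1 * a) for a in age]

# IPTW: weight treated subjects by 1/e and controls by 1/(1 - e),
# then compare the weighted outcome means.
w_treated = [t / e for t, e in zip(treat, propensity)]
w_control = [(1 - t) / (1 - e) for t, e in zip(treat, propensity)]
mean_y1 = sum(w * yi for w, yi in zip(w_treated, y)) / sum(w_treated)
mean_y0 = sum(w * yi for w, yi in zip(w_control, y)) / sum(w_control)
iptw_ate = mean_y1 - mean_y0

# Unweighted difference in group means, for comparison.
n_treated = sum(treat)
naive = (sum(yi for yi, t in zip(y, treat) if t) / n_treated
         - sum(yi for yi, t in zip(y, treat) if not t) / (n - n_treated))

print(f"naive={naive:.2f}  IPTW={iptw_ate:.2f}  (simulated truth: 1.0)")
```

The weighted estimate should land much closer to the simulated effect than the naive difference, but only because the single confounder here is observed and included in the propensity model.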
3) The do-operator
While propensity score-based methods help estimate unbiased causal effects in the face of observed confounders, they do not account for unmeasured confounders. In these cases, we can turn to Judea Pearl’s Structural Causal Framework. This allows us to cope with unmeasured confounders and more, given we construct a causal model (i.e. a DAG). Additionally, in this framework, we can reformulate causal effects using something called the do-operator.
🔗 Article Link | YouTube Video
What is the do-operator?
- The do-operator is a mathematical representation of an intervention
- We can use the do-operator to generalize our definition of the Average Treatment Effect
- This is a central tool in Judea Pearl’s Structural Causal Framework
Observational vs Interventional Distributions
- Observational distribution — A distribution that does not contain the do-operator, e.g. P(Y | X = x)
- Interventional distribution — A distribution that contains the do-operator, e.g. P(Y | do(X = x))
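The difference between conditioning and intervening can be sketched with a small simulation. The model below (a binary confounder Z that influences both X and Y) and all of its probabilities are made up for illustration. The `do_x` argument is a hypothetical way of overriding X's usual mechanism, which is what an intervention amounts to.

```python
import random

random.seed(0)

def sample(do_x=None):
    """Draw (z, x, y) from a hypothetical model with Z -> X, Z -> Y, X -> Y.

    Passing do_x replaces X's usual mechanism with a fixed value.
    """
    z = random.random() < 0.5
    if do_x is None:
        x = random.random() < (0.8 if z else 0.2)  # X depends on Z
    else:
        x = do_x                                   # X is set by intervention
    y = random.random() < 0.1 + 0.2 * x + 0.5 * z
    return z, x, y

def mean(values):
    values = list(values)
    return sum(values) / len(values)

n = 50_000
observed = [sample() for _ in range(n)]

# Observational: condition on X in passively collected data.
p_y_given_x1 = mean(y for _, x, y in observed if x)
p_y_given_x0 = mean(y for _, x, y in observed if not x)

# Interventional: force X for everyone, then measure Y.
p_y_do_x1 = mean(y for _, _, y in (sample(do_x=True) for _ in range(n)))
p_y_do_x0 = mean(y for _, _, y in (sample(do_x=False) for _ in range(n)))

print(f"P(Y|X=1) - P(Y|X=0)         = {p_y_given_x1 - p_y_given_x0:.2f}")
print(f"P(Y|do(X=1)) - P(Y|do(X=0)) = {p_y_do_x1 - p_y_do_x0:.2f}")
```

Under these made-up probabilities the observational contrast comes out near 0.5 while the interventional contrast comes out near 0.2, the direct contribution of X in the simulated model.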
Identifiability
- Answering the question, “Can the causal effect be obtained from the given data?”
- This can mean using data from an interventional study to estimate causal effects
- Or, estimating causal effects using observational data with some additional assumptions
4) DAGs and Graphical Criteria
Continuing the exploration of Pearl’s Structural Causal Framework, here we discussed a general treatment of identifiability, i.e. answering the question, “Can the causal effect be obtained from the given data?” In general, this question can be answered using the 3 Rules of Do-calculus. However, applying these rules can be algebraically demanding. Before resorting to them, we can use 2 quick-and-easy graphical criteria to evaluate identifiability.
🔗 Article Link | YouTube Video
What’s a DAG?
- DAG is short for Directed Acyclic Graph, which is a graph where every edge has an arrowhead and following the arrows never leads back to the starting node (i.e. there are no cycles)
- DAGs can be used to represent Markovian Causal Models, in which causal effects are always identifiable, meaning the causal effect can be obtained from the given data
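As a small illustration of the "directed" and "acyclic" parts, here is one way to check whether a graph is a DAG using Python's standard-library `graphlib` module (Python 3.9+). The `is_dag` helper and the example graphs are hypothetical. The second graph includes an implausible arrow purely to create a cycle.

```python
from graphlib import CycleError, TopologicalSorter

def is_dag(parents):
    """Return True if the directed graph has no cycles.

    `parents` maps each node to the set of nodes with an arrow into it.
    """
    try:
        # A topological order exists only if the graph is acyclic.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

# Hypothetical example: age influences both pill dosage and headache status.
confounded = {"dosage": {"age"}, "headache": {"age", "dosage"}}

# Adding an arrow from headache back to age creates a cycle.
cyclic = {"dosage": {"age"}, "headache": {"dosage"}, "age": {"headache"}}

print(is_dag(confounded), is_dag(cyclic))
```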
3 Rules of Do-calculus
- A complete set of rules we can use to manipulate interventional distributions
- In other words, a set of operations we can use to answer any question of identifiability
Graphical Criteria
- The Back Door Criterion and Front Door Criterion are 2 quick-and-easy tests we can apply to a DAG to evaluate identifiability
- Sufficient Sets tell us which variables we need to measure to calculate an unbiased causal effect of X on Y.
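When a set of variables Z satisfies the Back Door Criterion, the interventional distribution can be computed from observational data as P(y | do(x)) = Σ_z P(y | x, z) P(z). The sketch below applies that formula to a made-up table of counts in which a single binary variable Z is assumed to be a sufficient set. The counts and function names are hypothetical.

```python
# Hypothetical counts of (z, x, y) records. Z is assumed to be the only
# confounder, so {Z} satisfies the Back Door Criterion for X -> Y.
counts = {
    (0, 0, 0): 360, (0, 0, 1): 40,
    (0, 1, 0): 70,  (0, 1, 1): 30,
    (1, 0, 0): 40,  (1, 0, 1): 60,
    (1, 1, 0): 80,  (1, 1, 1): 320,
}
total = sum(counts.values())

def p_y1_given_x(x):
    """Unadjusted P(Y=1 | X=x)."""
    n_x = sum(c for (_, xi, _), c in counts.items() if xi == x)
    n_xy = sum(c for (_, xi, yi), c in counts.items() if xi == x and yi == 1)
    return n_xy / n_x

def p_y1_do_x(x):
    """Back-door adjustment: sum over z of P(Y=1 | x, z) * P(z)."""
    result = 0.0
    for z in (0, 1):
        n_z = sum(c for (zi, _, _), c in counts.items() if zi == z)
        n_xz = counts[(z, x, 0)] + counts[(z, x, 1)]
        result += (counts[(z, x, 1)] / n_xz) * (n_z / total)
    return result

naive_diff = p_y1_given_x(1) - p_y1_given_x(0)
adjusted_ate = p_y1_do_x(1) - p_y1_do_x(0)

print(f"unadjusted difference: {naive_diff:.2f}")
print(f"back-door adjusted:    {adjusted_ate:.2f}")
```

With these counts the unadjusted difference is about 0.5, while the adjusted estimate is about 0.2, because most of the raw association flows through Z.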
5) Regression-based Methods
An alternative flavor of causal effect estimation to Pearl’s Structural Causal Framework is regression-based methods. These methods employ regression techniques to quantify causal effects from data. Here we discussed 3 popular ways to do this.
🔗 Article Link | YouTube Video | Example Code
What is regression?
- Regression is a way to learn the relationships between variables using data
3 Popular Regression-based approaches
- Linear Regression: we define the causal effect as the coefficient for the treatment variable in the regression model estimating the outcome.
- Double Machine Learning: we estimate the causal effect using a 3-step process involving 2 machine learning models. Note: there is no restriction on the ML models used
- Meta-learners: we use regression models to simulate unobserved outcomes and estimate causal effects. 3 types: T-learner, S-learner, and X-learner.
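As one illustration of the meta-learner idea, below is a minimal T-learner sketch: fit one outcome model per treatment arm, predict both potential outcomes for every subject, and average the difference. The simulated data, the `fit_line` helper, and the choice of a simple one-covariate linear model are all assumptions for the sake of a self-contained example. Any regression model could take their place.

```python
import random

random.seed(0)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Hypothetical data: x raises both the chance of treatment and the outcome,
# and the treatment effect (1.0 + 0.5 * x) varies with x. Simulated ATE = 1.0.
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
t = [1 if random.random() < (0.7 if xi > 0 else 0.3) else 0 for xi in x]
y = [(1.0 + 0.5 * xi) * ti + 2.0 * xi + random.gauss(0.0, 1.0)
     for xi, ti in zip(x, t)]

# T-learner step 1: one outcome model per treatment arm.
a1, b1 = fit_line([xi for xi, ti in zip(x, t) if ti],
                  [yi for yi, ti in zip(y, t) if ti])
a0, b0 = fit_line([xi for xi, ti in zip(x, t) if not ti],
                  [yi for yi, ti in zip(y, t) if not ti])

# Step 2: predict both potential outcomes for everyone and take the difference.
cate = [(a1 + b1 * xi) - (a0 + b0 * xi) for xi in x]

# Step 3: average the per-subject estimates to get the ATE.
ate = sum(cate) / n

print(f"estimated ATE = {ate:.2f}  (simulated truth: 1.0)")
```

The two fitted models play the role of "simulating unobserved outcomes": for a treated subject, the control-arm model supplies the outcome we never observed, and vice versa.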