
Causality
Introduction to Causal Inference
Divisions of Biostatistics & Epidemiology
UC Berkeley
Summary
This course presents a general framework for causal inference. Directed acyclic graphs and non-parametric structural equation models (NPSEM) are used to define the causal model. Target causal parameters are defined using counterfactuals and marginal structural models. G-computation estimators, inverse probability weighted estimators, and targeted maximum likelihood estimators are introduced. Non-parametric and semi-parametric approaches to nuisance parameter estimation, with an emphasis on Super Learning, are presented. Students gain practical experience implementing these estimators and interpreting results through discussion assignments, R labs, and R assignments.
Course Learning Objectives
By the end of this course, students should be able to:
1) Translate a scientific question and background knowledge into a causal model and target causal parameter using the Structural Causal Model (SCM)/counterfactual frameworks.
2) Assess identifiability of the target causal parameter and express it as a parameter of the observed data distribution.
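3) Understand the challenge posed by the curse of dimensionality; be familiar with and able to apply data-adaptive (machine learning) approaches.
4) Understand the properties of and be able to apply three classes of causal effect estimators: G-computation (simple substitution) estimators, inverse probability weighted estimators, and targeted maximum likelihood estimators.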
5) Begin to develop familiarity with the uses of a formal causal framework for investigating a wide range of questions about how the world works.
Part I: From causal questions to the statistical estimation problem
Lecture 1: A General Roadmap for Tackling Causal Questions
Lecture 2: Pearl’s Structural Causal Model (SCM)
Lecture 3: Defining Target Causal Quantities: Link between SCM and Counterfactuals
Lecture 4: Defining the Observed Data and its link to the SCM
Lecture 5: Identifying Causal Effects
Part II: Statistical estimation and interpretation
Lecture 6: Introduction to Estimation
Lecture 7: Introduction to Data-Adaptive Estimation and Super Learning
Lecture 8: Estimation of Causal Effects with Data-Adaptive Methods
Lecture 9: The Propensity Score and Inverse Probability of Treatment Weighting (IPTW)
Lecture 10: IPTW for Marginal Structural Model (MSM) parameters
Lecture 11: Introduction to Targeted Maximum Likelihood Estimation (TMLE)
Lecture 12: Interpretation, Wrap up and Where Next?
Discussion Assignments:
Assignment 1: For two redacted real studies, apply the first steps of the roadmap to (i) specify the scientific question, (ii) represent knowledge with a SCM, and (iii) specify the target causal parameter.
Assignment 2: For the same studies, specify the observed data, assess identifiability, specify the statistical estimand, and discuss the needed positivity assumption.
R Labs & Corresponding Homework:
Lab & Hw 1: Defining the causal parameter and introduction to simulations in R
Lab & Hw 2: Identifiability, linking the observed data to the causal model, and implementation of the simple substitution estimator based on the G-computation formula
Lab & Hw 3: Cross-validation and data-adaptive methods for prediction
Lab & Hw 4: Inverse probability of treatment weighting (IPTW) estimators and the impact of positivity violations
Lab 5: Targeted maximum likelihood estimation (TMLE)
Lab 6: Inference with the non-parametric bootstrap and with influence curves for TMLE
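To give a flavor of what the labs involve, here is a minimal sketch of a simulation with a simple substitution (G-computation) estimator and an IPTW estimator of the average treatment effect. It is not taken from the course materials: the labs use R, while this sketch uses Python, and the data-generating process, function names, and sample size are all hypothetical choices made for illustration. With one binary covariate W, both estimators can be computed non-parametrically from stratum-specific means, and the true effect in this made-up model is 0.3 by construction.

```python
import random


def simulate(n, seed=0):
    """Draw n i.i.d. observations (W, A, Y) from a hypothetical causal model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0                   # baseline covariate
        a = 1 if rng.random() < (0.7 if w else 0.3) else 0   # treatment depends on W
        p_y = 0.2 + 0.3 * a + 0.4 * w                        # outcome depends on A and W
        y = 1 if rng.random() < p_y else 0
        data.append((w, a, y))
    return data


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


def unadjusted(data):
    """Naive contrast E[Y | A=1] - E[Y | A=0]; confounded by W in this model."""
    treated = mean(y for (_, a, y) in data if a == 1)
    control = mean(y for (_, a, y) in data if a == 0)
    return treated - control


def g_computation(data):
    """Simple substitution estimator: average the stratum-specific
    outcome contrasts over the empirical distribution of W."""
    n = len(data)
    est = 0.0
    for w in (0, 1):
        p_w = sum(1 for (wi, _, _) in data if wi == w) / n
        q1 = mean(y for (wi, a, y) in data if wi == w and a == 1)
        q0 = mean(y for (wi, a, y) in data if wi == w and a == 0)
        est += (q1 - q0) * p_w
    return est


def iptw(data):
    """IPTW estimator: weight each outcome by the inverse of the
    estimated probability of the treatment actually received."""
    g = {w: mean(a for (wi, a, _) in data if wi == w) for w in (0, 1)}
    t1 = mean(a * y / g[w] for (w, a, y) in data)
    t0 = mean((1 - a) * y / (1 - g[w]) for (w, a, y) in data)
    return t1 - t0


if __name__ == "__main__":
    data = simulate(20000, seed=1)
    print("unadjusted:   ", round(unadjusted(data), 3))
    print("G-computation:", round(g_computation(data), 3))
    print("IPTW:         ", round(iptw(data), 3))
```

In this saturated setting the G-computation and IPTW estimates coincide up to floating-point error, while the unadjusted contrast is biased away from 0.3. With higher-dimensional covariates the two estimators rely on different nuisance parameter estimates and generally differ, which is where the data-adaptive methods from Lab 3 come in.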
Final Project:
Fully apply each step of the causal roadmap to a real-world problem
Suggested background readings for each topic/section of the course are provided. Helpful references are also provided at appropriate points in the lecture slides. Please note that the listed references are NOT intended as a complete bibliography, but only as helpful entry points to the material.