Session 5: Causation vs. Correlation — Thinking Like an Economist
Note
Date: Thursday, 21 May 2026, 16:00–19:00
Business question
Is there a gender wage gap among US workers — and what would it take to interpret a regression coefficient as evidence of discrimination?
Learning goals
- Distinguish causal from associational claims — and articulate why the difference matters in business and policy contexts
- Read and draw a simple DAG: identify confounders, mediators, and colliders
- Explain the bad control problem: why controlling for a mediator answers a different question
- Interpret a sequence of regression models on the gender wage gap given a stated causal model
- Critically evaluate causal claims in business reports and media using DAG reasoning
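To make the collider idea concrete, here is a minimal simulation sketch in base R. The variable names (`skill`, `connections`, `hired`) and the hiring rule are hypothetical and are not taken from the course materials. The two traits are independent by construction, yet they look negatively related once the sample is restricted to people who were hired, because `hired` is a common effect of both.

```r
# Hypothetical illustration: conditioning on a collider creates an association.
set.seed(42)
n <- 10000

skill       <- rnorm(n)   # trait 1
connections <- rnorm(n)   # trait 2, drawn independently of skill

# Collider: being hired depends on BOTH traits (assumed rule, for illustration)
hired <- as.integer(skill + connections > 1)

# Correlation in the full population vs. among the hired only
cor_all   <- cor(skill, connections)
cor_hired <- cor(skill[hired == 1], connections[hired == 1])

round(c(all = cor_all, hired_only = cor_hired), 2)
```

The same logic applies whenever a sample is selected on an outcome of the variables being studied, for example when wages are observed only for people who are employed.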
Dataset
CPS earnings 2014 — US Current Population Survey, Monthly Outgoing Rotation Group.
| Variable | Description |
|---|---|
| `wage` | Hourly wage (USD) = weekly earnings / usual weekly hours |
| `female` | 1 = female, 0 = male |
| `age` | Age in years |
| `educ` | Educational attainment (CPS grade92 scale: 31–46) |
| `occ` | Occupation group (Census 2010 codes) |
Source: Békés & Kézdi (2021). Full dataset documentation: https://gabors-data-analysis.com/datasets/cps-earnings.
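The hourly wage definition in the table can be sketched as follows. This is a toy example: the three rows are made up, and the raw column names `earnwke` and `uhours` are assumptions about how weekly earnings and usual weekly hours are stored, so check them against the dataset documentation before reusing the code.

```r
# Toy rows standing in for CPS records (values are invented for illustration)
cps <- data.frame(
  earnwke = c(800, 1200, 450),  # weekly earnings in USD (assumed column name)
  uhours  = c(40, 50, 0)        # usual weekly hours (assumed column name)
)

# Hourly wage = weekly earnings / usual weekly hours;
# rows with zero hours get NA instead of Inf
cps$wage <- ifelse(cps$uhours > 0, cps$earnwke / cps$uhours, NA_real_)

cps
```

Setting zero-hour rows to `NA` is a design choice made here for safety; the session materials may handle such rows differently.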
Session outline
- Take-home Task 2 debrief (~30 min)
- Input: the identification question; DAGs; confounders, mediators, colliders
- Live demo: three-model wage gap progression with DAG interpretation
- Break
- In-session exercise
- Debrief + Quarto skill: structuring an analytical narrative
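The three-model progression from the live demo can be previewed on simulated data where the true effects are known. This is a sketch under stated assumptions, not the demo itself: the data are generated rather than taken from the CPS, `high_pay_occ` is a hypothetical binary stand-in for occupation, and all effect sizes are invented. In this setup occupation is a mediator, so adding it moves the `female` coefficient from the total effect towards the direct effect.

```r
# Simulated data with a known causal structure (all numbers are assumptions)
set.seed(1)
n <- 5000

female <- rbinom(n, 1, 0.5)
age    <- round(runif(n, 25, 60))

# Mediator: gender shifts the probability of a high-paying occupation
high_pay_occ <- rbinom(n, 1, ifelse(female == 1, 0.3, 0.5))

# Hourly wage: direct effect of female = -1 USD, occupation premium = +6 USD
wage <- 20 + 0.1 * age - 1 * female + 6 * high_pay_occ + rnorm(n, sd = 5)
sim  <- data.frame(wage, female, age, high_pay_occ)

m1 <- lm(wage ~ female, data = sim)                       # raw gap
m2 <- lm(wage ~ female + age, data = sim)                 # + demographics
m3 <- lm(wage ~ female + age + high_pay_occ, data = sim)  # + occupation

# Coefficient on female in each model
gaps <- sapply(list(m1 = m1, m2 = m2, m3 = m3),
               function(m) coef(m)[["female"]])
round(gaps, 2)
```

By construction the total effect is -1 + 6 × (0.3 − 0.5) = -2.2 USD and the direct effect is -1 USD, so `m1` and `m3` answer different questions rather than one being a corrected version of the other.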
Materials
| File | Description |
|---|---|
| Slides | Lecture slides — open in browser, press F for fullscreen |
| Live demo | Coding document built during the session — three models, DAG interpretation |
| Exercise | In-session exercise: interpreting the gender wage gap (via GitHub Classroom) |
| Exercise solution | Example solution (added after session) |
Quarto skill introduced this session
Structuring an analytical narrative: how to move from exploratory output to a coherent written argument.
```{r}
#| label: tbl-models
#| tbl-cap: "Three models of the gender wage gap."
library(modelsummary)

# m1, m2, m3 are the three wage regressions fitted in the live demo
modelsummary(
  list("Raw gap" = m1, "+ Demographics" = m2, "+ Occupation" = m3),
  coef_map = c("female" = "Female"),  # report only the female coefficient
  stars = TRUE
)
```

- Section headers signal the logical structure of your analysis: claim, evidence, interpretation, caveat
- Callout boxes isolate key assumptions: `{.callout-important}` for critical caveats, `{.callout-note}` for context
- Inline R values anchor written claims to actual output: `r round(coef(m1)["female"], 3)`