Session 5: Causation vs. Correlation — Thinking Like an Economist

Published

21 May 2026

Modified

18 May 2026

Note

Date: Thursday, 21 May 2026, 16:00–19:00

Business question

Is there a gender wage gap among US workers — and what would it take to interpret a regression coefficient as evidence of discrimination?

Learning goals

  • Distinguish causal from associational claims — and articulate why the difference matters in business and policy contexts
  • Read and draw a simple DAG: identify confounders, mediators, and colliders
  • Explain the bad control problem: why controlling for a mediator answers a different question
  • Interpret a sequence of regression models on the gender wage gap given a stated causal model
  • Critically evaluate causal claims in business reports and media using DAG reasoning
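
The bad control problem named in the goals above can be made concrete with a small simulation. The sketch below is illustrative only: the data are simulated, the variable `high_pay_occ` is hypothetical, and the effect sizes (a direct effect of -0.10 and an occupation premium of 0.40) are made up rather than estimated from the CPS. It assumes a DAG in which gender affects wages both directly and through occupation (a mediator).

```r
# Simulated illustration only: all numbers are invented, not CPS estimates.
set.seed(42)
n <- 10000

female <- rbinom(n, 1, 0.5)

# Mediator: in this toy model, women are less likely to be in a high-paying occupation
high_pay_occ <- rbinom(n, 1, ifelse(female == 1, 0.3, 0.5))

# Log wage: direct effect of -0.10 plus an occupation premium of 0.40
log_wage <- 3 - 0.10 * female + 0.40 * high_pay_occ + rnorm(n, sd = 0.3)

# Without the mediator: the coefficient captures the total effect
total <- lm(log_wage ~ female)

# With the mediator: the coefficient captures only the direct effect
direct <- lm(log_wage ~ female + high_pay_occ)

b_total  <- unname(coef(total)["female"])
b_direct <- unname(coef(direct)["female"])
round(c(total = b_total, direct = b_direct), 3)
```

Under these assumptions the total effect is -0.10 + 0.40 × (0.3 - 0.5) = -0.18, so the first coefficient should land near -0.18 and the second near -0.10. Neither estimate is wrong; they answer different questions, which is the point of the bad control discussion.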

Dataset

CPS earnings 2014 — US Current Population Survey, Monthly Outgoing Rotation Group.

| Variable | Description |
|----------|-------------|
| wage | Hourly wage (USD) = weekly earnings / usual weekly hours |
| female | 1 = female, 0 = male |
| age | Age in years |
| educ | Educational attainment (CPS grade92 scale: 31–46) |
| occ | Occupation group (Census 2010 codes) |

Source: Békés & Kézdi (2021). Full dataset documentation: https://gabors-data-analysis.com/datasets/cps-earnings.
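
As a rough sketch of how the `wage` and `female` variables described above might be derived, the example below uses three made-up rows. The raw column names `earnwke`, `uhours`, and `sex`, and the coding 2 = female, are assumptions for illustration; check them against the dataset documentation before reusing this.

```r
# Toy rows for illustration; raw column names and the sex coding are assumptions.
raw <- data.frame(
  earnwke = c(800, 1200, 450),  # weekly earnings in USD
  uhours  = c(40, 50, 30),      # usual weekly hours
  sex     = c(1, 2, 2)          # assumed coding: 1 = male, 2 = female
)

cps <- transform(
  raw,
  wage   = earnwke / uhours,        # hourly wage = weekly earnings / usual weekly hours
  female = as.integer(sex == 2)     # 1 = female, 0 = male
)
cps
```

With real data, rows with zero or missing usual hours would need to be handled before dividing.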

Session outline

  • Take-home Task 2 debrief (~30 min)
  • Input: the identification question; DAGs; confounders, mediators, colliders
  • Live demo: three-model wage gap progression with DAG interpretation
  • Break
  • In-session exercise
  • Debrief + Quarto skill: structuring an analytical narrative

Materials

| File | Description |
|------|-------------|
| Slides | Lecture slides: open in browser, press F for fullscreen |
| Live demo | Coding document built during the session: three models, DAG interpretation |
| Exercise | In-session exercise: interpreting the gender wage gap (via GitHub Classroom) |
| Exercise solution | Example solution (added after session) |

Quarto skill introduced this session

Structuring an analytical narrative: how to move from exploratory output to a coherent written argument.

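```{r}
#| label: tbl-models
#| tbl-cap: "Three models of the gender wage gap."

library(modelsummary)

# m1, m2, m3 are assumed to be fitted earlier in the document
# (the three wage gap models from the live demo)
modelsummary(
  list("Raw gap" = m1, "+ Demographics" = m2, "+ Occupation" = m3),
  coef_map = c("female" = "Female"),  # show only the female coefficient, relabelled
  stars = TRUE                        # append significance stars
)
```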
  • Section headers signal the logical structure of your analysis — claim, evidence, interpretation, caveat
  • Callout boxes isolate key assumptions: {.callout-important} for critical caveats, {.callout-note} for context
  • Inline R values anchor written claims to actual output: `r round(coef(m1)["female"], 3)`
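
The table chunk above assumes that `m1`, `m2`, and `m3` already exist. The sketch below shows one way such a three-model progression could be fitted. It is not the live demo code: the data are simulated so the block runs on its own, the log-wage specification is an assumption, and the occupation groups "A", "B", "C" are placeholders.

```r
# Simulated stand-in for the CPS sample; coefficients below are invented.
set.seed(1)
n <- 2000

cps <- data.frame(
  female = rbinom(n, 1, 0.5),
  age    = sample(25:64, n, replace = TRUE),
  educ   = sample(31:46, n, replace = TRUE),           # grade92-style scale
  occ    = sample(c("A", "B", "C"), n, replace = TRUE)  # placeholder occupation groups
)

cps$wage <- exp(
  2.5 - 0.15 * cps$female + 0.01 * cps$age +
    0.03 * (cps$educ - 31) + rnorm(n, sd = 0.4)
)

# Assumed specifications: log hourly wage as the outcome
m1 <- lm(log(wage) ~ female, data = cps)                            # raw gap
m2 <- lm(log(wage) ~ female + age + educ, data = cps)               # + demographics
m3 <- lm(log(wage) ~ female + age + educ + factor(occ), data = cps) # + occupation

# Female coefficient across the three models
sapply(list(m1 = m1, m2 = m2, m3 = m3), function(m) coef(m)["female"])
```

In the simulated data occupation is unrelated to gender, so the coefficient barely moves between `m2` and `m3`. With real data, how it moves and how that movement should be read depends on the causal model stated alongside the table.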