Session 5: Causation vs. Correlation — Thinking Like an Economist

Author: Claudius Gräbner-Radkowitsch

Affiliation: EUF and JKU

Published: 21 May 2026

Modified: 18 May 2026

Note

Date: Thursday, 21 May 2026, 16:00–19:00

Business question

Is there a gender wage gap among US workers — and what would it take to interpret a regression coefficient as evidence of discrimination?

Learning goals

Distinguish causal from associational claims — and articulate why the difference matters in business and policy contexts
Read and draw a simple DAG: identify confounders, mediators, and colliders
Explain the bad control problem: why controlling for a mediator answers a different question
Interpret a sequence of regression models on the gender wage gap given a stated causal model
Critically evaluate causal claims in business reports and media using DAG reasoning
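
The bad control problem named in the learning goals can be illustrated with a small simulation. The sketch below is not taken from the session materials: it uses synthetic data in which gender affects wages both directly and through occupation, and every number in it (sample size, probabilities, effect sizes) is an invented assumption. The variable `high_pay_occ` is a hypothetical stand-in for the occupation mediator.

```r
# Synthetic data only: all effect sizes are invented for illustration,
# not estimates from the CPS.
set.seed(42)
n <- 10000

female <- rbinom(n, 1, 0.5)

# Mediator: gender shifts the probability of working in a high-paying occupation.
high_pay_occ <- rbinom(n, 1, ifelse(female == 1, 0.3, 0.5))

# Outcome: a direct effect of gender (-2) plus an effect of occupation (+10).
wage <- 20 - 2 * female + 10 * high_pay_occ + rnorm(n, sd = 5)

# Total effect: the direct path plus the path through occupation.
# Implied by the setup: -2 + 10 * (0.3 - 0.5) = -4
m_total <- lm(wage ~ female)

# Conditioning on the mediator blocks the indirect path,
# so the coefficient now targets the direct effect only (-2 by construction).
m_direct <- lm(wage ~ female + high_pay_occ)

gap_total  <- unname(coef(m_total)["female"])
gap_direct <- unname(coef(m_direct)["female"])

print(round(c(total = gap_total, direct = gap_direct), 2))
```

Both regressions are "correct" in this setup. They answer different questions: the first estimates the overall gap, the second the gap among workers in the same occupation group.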

Dataset

CPS earnings 2014 — US Current Population Survey, Monthly Outgoing Rotation Group.

Variable	Description
`wage`	Hourly wage (USD) = weekly earnings / usual weekly hours
`female`	1 = female, 0 = male
`age`	Age in years
`educ`	Educational attainment (CPS grade92 scale: 31–46)
`occ`	Occupation group (Census 2010 codes)

Source: Békés & Kézdi (2021). Full dataset documentation: https://gabors-data-analysis.com/datasets/cps-earnings.
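
The definition of `wage` in the table above can be sketched as follows. The raw column names `earnwke` (weekly earnings) and `uhours` (usual weekly hours) are assumptions for illustration and should be checked against the dataset documentation; the four rows are invented values, not CPS records.

```r
# Hypothetical raw columns: `earnwke` (weekly earnings, USD) and
# `uhours` (usual weekly hours). The values are invented for illustration.
raw <- data.frame(
  earnwke = c(800, 1200, 450, 0),
  uhours  = c(40, 50, 30, 0)
)

# Hourly wage is undefined when usual hours are zero, so set it to NA
# instead of letting the division produce NaN or Inf.
raw$wage <- ifelse(raw$uhours > 0, raw$earnwke / raw$uhours, NA_real_)

print(raw)
```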

Session outline

Take-home Task 2 debrief (~30 min)
Input: the identification question; DAGs; confounders, mediators, colliders
Live demo: three-model wage gap progression with DAG interpretation
Break
In-session exercise
Debrief + Quarto skill: structuring an analytical narrative

Materials

File	Description
Slides	Lecture slides — open in browser, press `F` for fullscreen
Live demo	Coding document built during the session — three models, DAG interpretation
Exercise	In-session exercise: interpreting the gender wage gap (via GitHub Classroom)
Exercise solution	Example solution (added after session)

Quarto skill introduced this session

Structuring an analytical narrative: how to move from exploratory output to a coherent written argument.

```{r}
#| label: tbl-models
#| tbl-cap: "Three models of the gender wage gap."

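# Assumes m1, m2, and m3 are model objects fitted in an earlier chunk.
library(modelsummary)

modelsummary(
  list("Raw gap" = m1, "+ Demographics" = m2, "+ Occupation" = m3),
  coef_map = c("female" = "Female"),  # show only the gender coefficient, relabelled
  stars = TRUE                        # add significance stars
)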
```
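
The table chunk above takes `m1`, `m2`, and `m3` as given. A minimal sketch of how three such models might be fitted in an earlier chunk is shown below. The specifications are assumptions inferred from the column labels ("Raw gap", "+ Demographics", "+ Occupation") rather than the session's actual code, which may differ, for example by using log wages. The `cps` data frame here is a synthetic stand-in with invented values and hypothetical occupation labels, so the sketch runs on its own.

```r
# Synthetic stand-in for the CPS extract. Column names follow the variable
# table; all values and the occupation labels are invented.
set.seed(1)
n <- 500
cps <- data.frame(
  female = rbinom(n, 1, 0.5),
  age    = sample(25:64, n, replace = TRUE),
  educ   = sample(39:46, n, replace = TRUE),
  occ    = sample(c("management", "service", "sales"), n, replace = TRUE)
)
cps$wage <- 10 + 0.1 * cps$age + 0.8 * (cps$educ - 39) - 2 * cps$female +
  rnorm(n, sd = 4)

# Model 1: raw gap, no controls.
m1 <- lm(wage ~ female, data = cps)

# Model 2: adds age and education (assumed to be the "demographics").
m2 <- lm(wage ~ female + age + educ, data = cps)

# Model 3: adds occupation as a categorical control.
m3 <- lm(wage ~ female + age + educ + factor(occ), data = cps)

# Coefficient on `female` across the three specifications.
gaps <- sapply(list(m1 = m1, m2 = m2, m3 = m3), function(m) coef(m)[["female"]])
print(round(gaps, 2))
```

Treating `educ` as a numeric regressor is a simplification in this sketch, since the grade92 codes are categories and not years of schooling.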

Section headers signal the logical structure of your analysis — claim, evidence, interpretation, caveat
Callout boxes isolate key assumptions: {.callout-important} for critical caveats, {.callout-note} for context
Inline R values anchor written claims to actual output: `r round(coef(m1)["female"], 3)`