Session 6: Panel Data and Fixed Effects — Live Demo

Advanced Data Science · Europa-Universität Flensburg

Author

Claudius Gräbner-Radkowitsch

Published

28 05 2026

1 Setup

1.1 Business question

Does rising income make countries happier — and what would it take to interpret that relationship causally?

This question has a deceptively simple answer in cross-sectional data (yes, richer countries are happier) and a much less clear answer when we follow countries over time (the Easterlin paradox). Panel data methods let us be precise about which comparison we are making and what we can credibly claim.

1.2 Packages and data

Code
here::i_am("content/material/session06/lecture_notes.qmd")
library(here)
library(tidyverse)
library(fixest)
library(modelsummary)
library(flextable)
library(patchwork)

dat <- read_csv(
  here("content/material/session06/data/happiness_income.csv"), 
  show_col_types = FALSE)
glimpse(dat)
Rows: 1,869
Columns: 10
$ country      <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan…
$ iso2c        <chr> "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF…
$ iso3c        <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "…
$ year         <dbl> 2011, 2012, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 202…
$ GDP_ppp      <dbl> 80913174278, 91231454558, 98965950295, 100402257641, 1026…
$ population   <dbl> 29347708, 30560034, 32792523, 33831764, 34700612, 3568893…
$ unemployment <dbl> 7.830, 7.875, 7.915, 9.032, 10.116, 11.184, 11.192, 11.18…
$ GDP_pc       <dbl> 2757.053, 2985.319, 3017.943, 2967.692, 2958.785, 2952.99…
$ happiness    <dbl> 4.2580, 4.0400, 3.5750, 3.3600, 3.7940, 3.6320, 3.2030, 2…
$ gini_disp    <dbl> 31.7, 31.8, 31.9, 31.9, 31.9, 31.9, NA, NA, NA, NA, NA, N…

The dataset merges three sources:

Variable     | Source                      | Description
happiness    | World Happiness Report 2026 | Life satisfaction (0–10, 3-year average)
GDP_pc       | World Bank WDI              | GDP per capita, PPP (constant 2017 int’l $)
gini_disp    | SWIID 9.2                   | Disposable income Gini coefficient
unemployment | World Bank WDI              | Unemployment rate (% of labour force)
population   | World Bank WDI              | Total population

2 Exploring the Data

2.1 Panel structure

Code
dat |>
  summarise(
    n_countries = n_distinct(country),
    n_years     = n_distinct(year),
    year_min    = min(year),
    year_max    = max(year),
    n_obs       = n()
  ) |> 
  flextable() |> 
  theme_vanilla() |> 
  autofit()

n_countries | n_years | year_min | year_max | n_obs
146         | 14      | 2011     | 2025     | 1,869

Code
dat |>
  count(country) |>
  count(n, name = "n_countries") |>
  rename(years_observed = n)|> 
  flextable() |> 
  theme_vanilla() |> 
  autofit()

years_observed | n_countries
1              | 2
2              | 1
3              | 2
4              | 2
5              | 2
6              | 3
7              | 2
9              | 2
10             | 4
11             | 1
12             | 2
13             | 6
14             | 117

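The panel is unbalanced: not every country is observed in every year (117 of the 146 countries have all 14 years). This is typical for survey-based cross-country data and is unproblematic for fixed effects estimation as long as the missingness is unrelated to the idiosyncratic error. Countries observed only once contribute no within-country variation.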

2.2 Cross-country variation in happiness

Before running any regression, it helps to see the structure of the data. How much of the variation in happiness is between countries vs. within countries over time?

Code
dat |>
  group_by(country) |>
  summarise(
    happiness = mean(happiness, na.rm = TRUE),
    log_gdp   = mean(log(GDP_pc), na.rm = TRUE),
    .groups   = "drop"
  ) |>
  filter(!is.na(happiness), !is.na(log_gdp)) |>
  ggplot(aes(log_gdp, happiness)) +
  geom_point(colour = "#2C5F8A", alpha = 0.7, size = 2) +
  geom_smooth(method = "lm", colour = "#D04A2F", se = TRUE,
              fill = "#D04A2F", alpha = 0.12, linewidth = 1) +
  labs(
    x = "Log GDP per capita (PPP), country mean",
    y = "Life satisfaction (0–10), country mean"
  ) +
  theme_minimal(base_size = 13)
Figure 1: Average happiness and log GDP per capita by country (full-period means). Richer countries are clearly happier in cross-section.

The cross-sectional correlation is strong and positive. But is this evidence that income causes happiness?
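The between/within question raised at the start of this section can be made concrete with a sum-of-squares decomposition. The following is a minimal sketch on simulated toy values (the country labels, baselines and noise level are assumptions for illustration, not the session data); it only shows the mechanics of splitting total variation into a between-country and a within-country part.

```r
# Toy unbalanced panel (simulated values, not the session data)
set.seed(1)
n_obs   <- c(A = 5, B = 8, C = 3, D = 10)        # observations per hypothetical country
country <- rep(names(n_obs), times = n_obs)
level   <- c(A = 4.0, B = 5.5, C = 6.5, D = 7.5) # assumed country baselines
y       <- unname(level[country]) + rnorm(length(country), sd = 0.3)

country_mean <- ave(y, country)                  # each row gets its country's mean
ss_total   <- sum((y - mean(y))^2)
ss_between <- sum((country_mean - mean(y))^2)    # implicitly weighted by rows per country
ss_within  <- sum((y - country_mean)^2)

share_between <- ss_between / ss_total
share_within  <- ss_within / ss_total
round(c(between = share_between, within = share_within), 3)
```

With the session data, the same steps could be applied to `dat$happiness` and `dat$country` after dropping rows with missing happiness. The two shares add up to one, so a large between share signals that pooled comparisons are driven mostly by differences across countries.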

3 The Model Progression

3.1 Step 0 — Pooled OLS: the naive baseline

Pool all country-year observations and regress happiness on log income, ignoring panel structure entirely.

Code
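# Without an explicit vcov (or a global default set via setFixest_vcov()),
# feols() reports IID standard errors for models without fixed effects;
# HC1 standard errors would require vcov = "hetero".
m0 <- feols(happiness ~ log(GDP_pc), data = dat)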

modelsummary(
  list("Pooled OLS" = m0),
  coef_map  = c("log(GDP_pc)" = "Log GDP per capita"),
  statistic = "conf.int", conf_level = 0.95,
  gof_map   = c("nobs", "r.squared"),
  stars     = TRUE,
  output    = "flextable"
) |> autofit()

                   | Pooled OLS
Log GDP per capita | 0.813***
                   | [0.786, 0.840]
Num.Obs.           | 1729
R2                 | 0.672

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

The large positive coefficient is not surprising — it reflects the strong cross-sectional pattern we already saw. But it conflates between-country differences (richer country = happier country) with within-country change (rising income → rising happiness). The former is dominated by stable country characteristics that we have not controlled for.

3.2 The composite error — why pooled OLS is biased

The pooled OLS error \(\epsilon_{it}\) has two components:

\[\epsilon_{it} = \eta_i + v_{it}\]

  • \(\eta_i\): everything stable about country \(i\) that we do not observe — culture, institutions, social trust, history
  • \(v_{it}\): genuine year-to-year noise

If \(\eta_i\) correlates with \(\log(\text{GDP}_{it})\), pooled OLS is biased. It clearly does, since rich institutional environments produce both high income and high happiness. This is omitted variable bias (OVB) in which the omitted variable cannot be observed.
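A small simulation can illustrate the mechanism. This is a sketch under assumed values (true coefficient 1, a positive correlation of 0.8 times \(\eta_i\) built into the regressor); the direction of the bias follows the sign of that assumed correlation and need not match the session data.

```r
# Simulated panel: the stable country effect eta is correlated with x
set.seed(42)
n_countries <- 200
n_years     <- 10
id   <- rep(seq_len(n_countries), each = n_years)
eta  <- rnorm(n_countries)[id]             # unobserved, constant within country
x    <- 0.8 * eta + rnorm(length(id))      # regressor correlated with eta (assumption)
beta <- 1                                  # true effect (assumption)
y    <- beta * x + eta + rnorm(length(id))

b_pooled <- coef(lm(y ~ x))[["x"]]               # ignores eta: biased
b_within <- coef(lm(y ~ x + factor(id)))[["x"]]  # country dummies absorb eta

round(c(pooled = b_pooled, within = b_within), 3)
```

The pooled estimate absorbs part of \(\eta_i\) into the slope, while the estimate with country dummies stays close to the assumed true value.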

3.3 Step 1 — Country fixed effects: removing stable confounders

Country FE absorbs \(\eta_i\) entirely by subtracting each country’s time-mean from every variable:

\[\tilde{y}_{it} = y_{it} - \bar{y}_i \qquad \tilde{x}_{it} = x_{it} - \bar{x}_i\]

Because \(\eta_i - \bar{\eta}_i = 0\), the stable country component disappears. The coefficient is now identified from within-country variation only: when a country’s income rises above its own historical average, does its happiness rise too?

Code
m1 <- feols(happiness ~ log(GDP_pc) | country,
            vcov = ~country,
            data = dat)

modelsummary(
  list("Country FE" = m1),
  coef_map  = c("log(GDP_pc)" = "Log GDP per capita"),
  statistic = "conf.int", conf_level = 0.95,
  gof_map   = c("nobs", "r.squared", "r2.within"),
  stars     = TRUE,
  output    = "flextable"
) |> autofit()

                   | Country FE
Log GDP per capita | 1.259***
                   | [0.829, 1.688]
Num.Obs.           | 1727
R2                 | 0.928
R2 Within          | 0.150

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

Contrary to a naive expectation, the country FE coefficient (1.26) is larger than the pooled OLS coefficient (0.81); Section 3.5 discusses this in detail.
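The within-transformation can be checked by hand. The sketch below uses a simulated, unbalanced toy panel (all values are assumptions) and base R only: regressing demeaned \(y\) on demeaned \(x\) gives the same slope as a regression with one dummy per country.

```r
# Toy unbalanced panel with five hypothetical countries
set.seed(7)
n_obs <- c(5, 8, 3, 10, 6)
id    <- rep(seq_along(n_obs), times = n_obs)
alpha <- c(2, 4, 5, 6, 7)[id]                   # assumed country effects
x     <- 0.5 * alpha + rnorm(length(id))
y     <- alpha + 0.7 * x + rnorm(length(id), sd = 0.5)

# Within-transformation: subtract each country's own mean
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)

b_demeaned <- coef(lm(y_dm ~ 0 + x_dm))[["x_dm"]]   # no intercept after demeaning
b_lsdv     <- coef(lm(y ~ x + factor(id)))[["x"]]   # dummy-variable estimator

all.equal(b_demeaned, b_lsdv)
```

Only the point estimate carries over: the standard errors from the hand-demeaned `lm()` do not account for the estimated country means, which is one reason to rely on a dedicated estimator such as `feols()`.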

3.3.1 Visualising the within-transformation

To build intuition, let us manually demean one slice of the data and overlay the within-country slopes on the raw scatter.

Code
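# `highlight` (character vector of country names) and `col_hl` (named colour
# vector, one colour per highlighted country) must be defined before this chunk.
within_dat <- dat |>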
  filter(country %in% highlight) |>
  group_by(country) |>
  mutate(
    happiness_dm = happiness   - mean(happiness,    na.rm = TRUE),
    log_gdp_dm   = log(GDP_pc) - mean(log(GDP_pc),  na.rm = TRUE)
  ) |>
  ungroup()

p_raw <- dat |>
  filter(country %in% highlight) |>
  ggplot(aes(log(GDP_pc), happiness)) +
  geom_point(colour = "grey70", alpha = 0.4, size = 1.5) +
  geom_smooth(method = "lm", colour = "#D04A2F", se = FALSE,
              linewidth = 0.8, linetype = "dashed") +
  geom_smooth(aes(colour = country), method = "lm", se = FALSE, linewidth = 1) +
  scale_colour_manual(values = col_hl, guide = "none") +
  labs(x = "Log GDP per capita", y = "Happiness",
       title = "Raw data", subtitle = "Dashed = pooled OLS") +
  theme_minimal(base_size = 11)

p_demeaned <- within_dat |>
  ggplot(aes(log_gdp_dm, happiness_dm)) +
  geom_point(aes(colour = country), alpha = 0.6, size = 1.5) +
  geom_smooth(method = "lm", colour = "#D04A2F", se = FALSE, linewidth = 1) +
  scale_colour_manual(values = col_hl, name = NULL) +
  labs(x = "Log GDP (demeaned)", y = "Happiness (demeaned)",
       title = "Demeaned data", subtitle = "Within-country variation only") +
  theme_minimal(base_size = 11) +
  theme(legend.position = "bottom")

p_raw + p_demeaned
Figure 3: Raw data vs. demeaned data for selected countries. The dashed overall trend in the left panel is dominated by between-country differences; the coloured lines show within-country slopes only.

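For these selected countries, the between-country slope (dashed, left panel) is steep, while the within-country slope (right panel) is flatter and in some countries barely positive. This is the pattern behind the Easterlin paradox. It does not carry over to the full sample, however: the country FE coefficient estimated above (1.26) exceeds the pooled OLS coefficient (0.81).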

3.4 Step 2 — Two-way fixed effects: controlling for global time shocks

Country FE removes stable country differences. But what about shocks that hit all countries in the same year — a global recession, a pandemic, a commodity price surge? If such shocks correlate with income changes, country-only FE is not enough.

Two-way FE adds a separate intercept for each year:

\[y_{it} = \alpha_i + \gamma_t + \beta \log(\text{GDP}_{it}) + v_{it}\]

\(\hat\beta\) is now identified from variation in income and happiness that is explained neither by the country nor by the year.

Code
m2 <- feols(happiness ~ log(GDP_pc) | country + year,
            vcov = ~country,
            data = dat)

modelsummary(
  list("Two-way FE" = m2),
  coef_map  = c("log(GDP_pc)" = "Log GDP per capita"),
  statistic = "conf.int", conf_level = 0.95,
  gof_map   = c("nobs", "r.squared", "r2.within"),
  stars     = TRUE,
  output    = "flextable"
) |> autofit()

                   | Two-way FE
Log GDP per capita | 1.554***
                   | [0.943, 2.164]
Num.Obs.           | 1727
R2                 | 0.930
R2 Within          | 0.142

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
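To see what the two sets of intercepts do, the two-way within-transformation can be written out by hand. The sketch below assumes a balanced simulated panel (values are hypothetical); in that special case, subtracting country means and year means and adding back the grand mean reproduces the dummy-variable estimate exactly. In an unbalanced panel such as the session data this simple formula is no longer exact, so a dedicated estimator is needed there.

```r
# Balanced toy panel: 30 hypothetical countries observed in 8 years
set.seed(11)
n_i <- 30
n_t <- 8
id  <- rep(seq_len(n_i), each = n_t)
yr  <- rep(seq_len(n_t), times = n_i)
alpha <- rnorm(n_i)[id]                       # country effects
gamma <- rnorm(n_t)[yr]                       # year effects
x <- 0.5 * alpha + 0.5 * gamma + rnorm(n_i * n_t)
y <- alpha + gamma + 1.2 * x + rnorm(n_i * n_t, sd = 0.5)

# Two-way demeaning (exact only for balanced panels)
twoway_demean <- function(v) v - ave(v, id) - ave(v, yr) + mean(v)
x_dd <- twoway_demean(x)
y_dd <- twoway_demean(y)

b_demeaned <- coef(lm(y_dd ~ 0 + x_dd))[["x_dd"]]
b_lsdv     <- coef(lm(y ~ x + factor(id) + factor(yr)))[["x"]]

all.equal(b_demeaned, b_lsdv)
```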

3.5 The standard progression side by side

Code
modelsummary(
  list("Pooled OLS" = m0, "Country FE" = m1, "Two-way FE" = m2),
  coef_map  = c("log(GDP_pc)" = "Log GDP per capita"),
  statistic = "conf.int", conf_level = 0.95,
  gof_map   = c("nobs", "r.squared", "r2.within"),
  stars     = TRUE,
  notes     = "95% CIs in brackets. Pooled OLS: HC1 SEs. FE models: clustered at country level.",
  output    = "flextable"
) |> autofit()
Table 1: Three models of the income–happiness relationship. 95% confidence intervals in brackets. Within-R² is the relevant fit statistic for FE models.

                   | Pooled OLS     | Country FE     | Two-way FE
Log GDP per capita | 0.813***       | 1.259***       | 1.554***
                   | [0.786, 0.840] | [0.829, 1.688] | [0.943, 2.164]
Num.Obs.           | 1729           | 1727           | 1727
R2                 | 0.672          | 0.928          | 0.930
R2 Within          |                | 0.150          | 0.142

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

95% CIs in brackets. Pooled OLS: HC1 SEs. FE models: clustered at country level.

Reading the table:

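  • Pooled OLS → Country FE: The coefficient increases from 0.81 to 1.26. This is counter-intuitive but real: the within-country income–happiness relationship is stronger than the pooled comparison, which is dominated by differences between countries. Countries going through active income growth (e.g. emerging markets) show larger happiness gains than the gap between a rich and a poor country at one point in time would suggest. The pooled estimate may be compressed partly by diminishing returns at high income levels and by stable between-country differences in happiness baselines. The FE confidence interval ([0.83, 1.69]) is far wider than the pooled one, so the size of the increase is imprecisely estimated.
  • Country FE → Two-way FE: The coefficient increases further to 1.55. Year FE removes everything that moves all countries together in a given year. Common shocks that raised incomes and happiness at the same time would have inflated the country FE estimate, so removing them would lower it. The observed increase instead suggests that the common year components of income and happiness moved in opposite directions, for example incomes trending upward worldwide while average happiness did not. The two confidence intervals overlap widely, so the difference should not be over-interpreted.
  • Within-R² (0.14–0.15): income explains roughly 14–15% of the variation in happiness that remains after the fixed effects are removed. That is meaningful but modest: happiness has many determinants beyond income.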
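Because income enters in logs and happiness in levels, the coefficients translate into happiness points per proportional income change: a rise of \(p\) percent in GDP per capita is associated with a change of \(\hat\beta \cdot \log(1 + p/100)\) points. A short sketch using the three point estimates from Table 1 (the 10% and doubling scenarios are illustrative choices):

```r
# Point estimates from Table 1
beta <- c(pooled = 0.813, country_fe = 1.259, twoway_fe = 1.554)

# Implied change in life satisfaction (0-10 scale)
effect_10pct    <- beta * log(1.10)   # GDP per capita rises by 10%
effect_doubling <- beta * log(2)      # GDP per capita doubles

round(rbind(effect_10pct, effect_doubling), 3)
```

Under the two-way FE estimate, a 10% rise in income corresponds to roughly 0.15 points on the 0–10 scale, which helps put the size of the coefficient into perspective.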

4 Causal Interpretation

4.1 What the FE estimate claims — and does not claim

The two-way FE coefficient answers:

When a country’s income rises above its own historical average in a year that is not unusually good or bad for all countries, does its happiness tend to rise?

Country FE already controls for:

  • All time-invariant country characteristics: stable institutions, culture, geography, social trust, historical development path
  • This includes most of what makes Scandinavian countries both rich and happy — it is absorbed into \(\alpha_i\)

Year FE already controls for:

  • Global shocks affecting all countries in a given year: financial crises, pandemics, commodity cycles

What is still potentially uncontrolled:

Time-varying factors that change differently across countries and independently affect both income and happiness. Two obvious candidates in our data: income inequality and unemployment.

4.2 Three estimands — precision about what you want to estimate

Before adding more controls, it is worth being precise about which causal quantity you are actually after.

Estimand                              | What you condition on                  | What it answers
Total effect of income on happiness   | Nothing on the causal path             | “Does richer = happier, all channels included?”
Controlled direct effect (CDE)        | Mediators held fixed (e.g. healthcare) | “Does income matter beyond health, trust, etc.?”
Indirect / path-specific effect via M | Estimated via mediation analysis       | “How much of the income effect runs through health?”

Our main estimate (m2) is an attempt at the total within-country effect — it includes all pathways through which rising income may raise happiness: better healthcare, reduced stress, more autonomy, improved public services, and any direct effect.

4.3 A common temptation: controlling for WHR variables

The WHR reports several variables alongside happiness: social support, healthy life expectancy, freedom to make life choices, perceptions of corruption. The temptation is to add these as controls to “isolate the direct income effect” or “remove confounding.”

Warning: This is the bad-control problem again

Most of these WHR variables are mediators, not confounders:

  • Healthy life expectancy: income → health → happiness
  • Social support: income enables social networks → happiness
  • Freedom to choose: affluence enables autonomy → happiness

Conditioning on a mediator does two things simultaneously:

  1. It removes part of the income effect (blocking the mediated path)
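  2. If the mediator shares unobserved causes with happiness, conditioning on it opens a non-causal path through those causes (collider bias), which introduces bias rather than removing it. In a fixed effects model these causes must vary over time to matter, because stable ones such as culture or geography are already absorbed by \(\alpha_i\)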

The result is a controlled direct effect that answers a different question than the total effect — and does so with additional bias from the newly opened paths. This is precisely the “bad control” problem from Session 5, applied at the macro level.
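A stylised simulation shows both problems at once. It is a sketch with assumed path coefficients (direct effect 0.5, income-to-mediator 0.8, mediator-to-outcome 0.5) and an unobserved variable `u` that drives both the mediator and the outcome; none of these numbers come from the session data.

```r
set.seed(123)
n <- 20000
x <- rnorm(n)                           # "income"
u <- rnorm(n)                           # unobserved cause of mediator and outcome
m <- 0.8 * x + u + rnorm(n)             # mediator, e.g. a health measure
y <- 0.5 * x + 0.5 * m + u + rnorm(n)   # outcome

true_total  <- 0.5 + 0.8 * 0.5          # direct + mediated path = 0.9
true_direct <- 0.5

b_total <- coef(lm(y ~ x))[["x"]]       # recovers the total effect
b_cond  <- coef(lm(y ~ x + m))[["x"]]   # conditioning on the mediator

round(c(total = b_total, conditional_on_m = b_cond), 3)
```

Without the mediator the regression recovers the total effect. With it, the income coefficient matches neither the total effect nor the true direct effect, because conditioning on `m` lets the unobserved `u` contaminate the estimate.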

4.4 Stage 1 — Total effect with time-varying confounders

The appropriate additions are variables that independently affect both income and happiness and change over time within countries, without lying on the causal path from income to happiness.

Income inequality (gini_disp): The US Easterlin story is partly about rising inequality — aggregate GDP grew, but most of the gains went to the top, so median incomes stagnated. Gini is a time-varying confounder (or effect modifier): it captures distributional changes that affect happiness independently of average income.

Unemployment (unemployment): Labour market conditions affect wellbeing through channels (job security, social belonging) that are partly independent of aggregate income.

Code
m3a <- feols(happiness ~ log(GDP_pc) + gini_disp               | country + year,
             vcov = ~country, data = dat)

m3b <- feols(happiness ~ log(GDP_pc) + unemployment            | country + year,
             vcov = ~country, data = dat)

m3  <- feols(happiness ~ log(GDP_pc) + gini_disp + unemployment | country + year,
             vcov = ~country, data = dat)
Code
modelsummary(
  list(
    "Two-way FE"    = m2,
    "+ Gini"        = m3a,
    "+ Unemployment"= m3b,
    "+ Both"        = m3
  ),
  coef_map = c(
    "log(GDP_pc)"  = "Log GDP per capita",
    "gini_disp"    = "Gini (disposable income)",
    "unemployment" = "Unemployment rate"
  ),
  statistic = "conf.int", conf_level = 0.95,
  gof_map   = c("nobs", "r.squared", "r2.within"),
  stars     = TRUE,
  notes     = "Two-way FE (country + year) throughout. SEs clustered at country level.",
  output    = "flextable"
) |> autofit()
Table 2: Adding time-varying confounders to the two-way FE baseline. 95% confidence intervals in brackets. Note: the sample shrinks when Gini is included (missing data).

                         | Two-way FE     | + Gini          | + Unemployment   | + Both
Log GDP per capita       | 1.554***       | 1.740***        | 1.217***         | 1.376***
                         | [0.943, 2.164] | [1.044, 2.436]  | [0.602, 1.833]   | [0.666, 2.086]
Gini (disposable income) |                | -0.021          |                  | -0.026
                         |                | [-0.057, 0.016] |                  | [-0.059, 0.008]
Unemployment rate        |                |                 | -0.042***        | -0.036***
                         |                |                 | [-0.061, -0.024] | [-0.055, -0.017]
Num.Obs.                 | 1727           | 1308            | 1711             | 1297
R2                       | 0.930          | 0.942           | 0.935            | 0.945
R2 Within                | 0.142          | 0.156           | 0.186            | 0.194

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

Two-way FE (country + year) throughout. SEs clustered at country level.

Note: Interpreting the results

The income coefficient remains large and significant throughout (1.55 → 1.38 in the full model, 95% CI well above zero). The key findings from each column:

  • + Gini only: The Gini coefficient is negative but not statistically significant at conventional levels. The sample shrinks substantially due to missing Gini observations — results should be interpreted cautiously.
  • + Unemployment only: Unemployment is negative and highly significant: higher unemployment reduces happiness within countries even after controlling for income and fixed effects. Adding unemployment lowers the income coefficient from 1.55 to 1.22 (about 22%) on nearly the same sample (1711 vs. 1727 observations).
  • + Both: The income coefficient falls modestly (11% relative to the baseline). Most of the action comes from unemployment, not Gini. The income–happiness relationship is robust to these time-varying controls.

A note on sample composition: adding Gini reduces the sample by 419 observations (from 1727 to 1308). Part of the income coefficient shift when Gini is included may reflect this sample restriction rather than the Gini control itself. Compare columns (1) and (3) — the unemployment-only model uses nearly the same sample as the baseline — to isolate the cleaner comparison.
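One way to separate the effect of the control from the effect of the smaller sample is to re-estimate the baseline on exactly the rows used by the richer model. The sketch below demonstrates the idea on a simulated toy panel with base R (variable names `x`, `z`, `y` and the share of missing values are assumptions).

```r
# Toy panel: control z is missing for a quarter of the rows
set.seed(5)
n_i <- 40
n_t <- 10
toy <- data.frame(
  id = rep(seq_len(n_i), each = n_t),
  yr = rep(seq_len(n_t), times = n_i)
)
toy$x <- rnorm(nrow(toy))
toy$z <- 0.3 * toy$x + rnorm(nrow(toy))
toy$y <- toy$x - 0.5 * toy$z + rnorm(nrow(toy))
toy$z[sample(nrow(toy), 100)] <- NA

common <- toy[!is.na(toy$z), ]          # rows usable by every specification

fit_full        <- lm(y ~ x + factor(id) + factor(yr), data = toy)
fit_base_common <- lm(y ~ x + factor(id) + factor(yr), data = common)
fit_ctrl_common <- lm(y ~ x + z + factor(id) + factor(yr), data = common)

# Same sample in the last two fits: any change in the x coefficient
# between them is due to the control, not to dropped rows
c(full = nobs(fit_full), base_common = nobs(fit_base_common),
  ctrl_common = nobs(fit_ctrl_common))
```

With the session's objects, the analogous step would be to filter `dat` to rows where `gini_disp` and `unemployment` are both observed and re-estimate the two-way FE baseline on that subset before comparing it with the model that includes both controls.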

4.5 Stage 2 — Why mediation analysis, not more controls

The concern “maybe it is not income but good healthcare that makes people happier” is a question about indirect effects: how much of the total income–happiness relationship runs through healthcare?

That is a legitimate question — but it requires a different estimand than what we have been computing, and a different analytical strategy.

The wrong approach: add healthcare (or WHR healthy life expectancy) to the main regression. As explained above, this gives the controlled direct effect, introduces mediator bias, and does not actually answer the indirect effect question.

The right approach — causal mediation decomposition:

\[\underbrace{\hat\beta_{\text{total}}}_{\text{Total effect}} = \underbrace{\hat\beta_{\text{direct}}}_{\text{Income → Happiness, not via M}} + \underbrace{\hat\beta_{\text{indirect}}}_{\text{Income → M → Happiness}}\]

This requires:

  1. A model for the mediator (e.g., healthy life expectancy ~ income + country FE + year FE + confounders)
  2. A model for happiness including both income and the mediator
  3. Strong identification assumptions — crucially, no unmeasured confounders of the mediator–outcome relationship, which is demanding at the macro level

The practical upshot: if a mediation analysis found a large indirect effect through healthcare (i.e., most of the income effect runs through better health), that would support the view that policies improving health without income growth could also raise happiness — a different policy conclusion than if the income effect were mostly direct.
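The decomposition can be illustrated with the simplest linear version, the product-of-coefficients approach. This is a sketch on simulated data with assumed path coefficients and, by construction, no unobserved mediator-outcome confounding; it is not the Imai et al. procedure and it inherits the strong assumptions listed above.

```r
set.seed(99)
n <- 5000
x <- rnorm(n)                        # "income"
m <- 0.6 * x + rnorm(n)              # mediator model: x -> m
y <- 0.4 * x + 0.5 * m + rnorm(n)    # outcome model: x -> y and m -> y

a      <- coef(lm(m ~ x))[["x"]]     # effect of x on the mediator
fit_y  <- lm(y ~ x + m)
direct <- coef(fit_y)[["x"]]         # direct effect, mediator held fixed
b      <- coef(fit_y)[["m"]]         # effect of the mediator on y

indirect <- a * b                    # mediated path x -> m -> y
total    <- coef(lm(y ~ x))[["x"]]   # total effect

round(c(total = total, direct = direct, indirect = indirect), 3)
all.equal(total, direct + indirect)
```

In linear OLS the identity total = direct + indirect holds exactly in the sample. Whether the pieces have a causal interpretation depends entirely on the identification assumptions, not on the arithmetic.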

Note: Further reading

The World Happiness Report investigates mediation pathways in its statistical appendices — see the methods notes in recent editions for their approach to decomposing the cross-country happiness gaps. Note that findings on the relative importance of income vs. social variables differ across WHR editions and decomposition methods, and these are cross-sectional decompositions, not panel FE estimates.

Mediation analysis is beyond the scope of this session, but if you pursue this line of research, the mediation package in R implements the Imai et al. framework for causal mediation under the sequential ignorability assumption.


5 Reporting Panel Results in Quarto

5.1 What to include in every panel table

Code
modelsummary(
  list(
    "(1) Pooled OLS"      = m0,
    "(2) Country FE"      = m1,
    "(3) Two-way FE"      = m2,
    "(4) + Gini & Unemp." = m3
  ),
  coef_map  = c(
    "log(GDP_pc)"  = "Log GDP per capita",
    "gini_disp"    = "Gini coefficient",
    "unemployment" = "Unemployment rate"
  ),
  statistic = "conf.int", conf_level = 0.95,
  gof_map   = c("nobs", "r.squared", "r2.within"),
  stars     = TRUE,
  notes     = c(
    "95% CIs in brackets.",
    "Column (1): HC1 standard errors.",
    "Columns (2)–(4): clustered at country level.",
    "Country and year fixed effects in columns (3) and (4)."
  ),
  output    = "flextable"
) |> autofit()
Table 3: Income and happiness: pooled OLS, country FE, two-way FE, and two-way FE with time-varying controls. 95% confidence intervals in brackets. Preferred specification: column (4).

                   | (1) Pooled OLS | (2) Country FE | (3) Two-way FE | (4) + Gini & Unemp.
Log GDP per capita | 0.813***       | 1.259***       | 1.554***       | 1.376***
                   | [0.786, 0.840] | [0.829, 1.688] | [0.943, 2.164] | [0.666, 2.086]
Gini coefficient   |                |                |                | -0.026
                   |                |                |                | [-0.059, 0.008]
Unemployment rate  |                |                |                | -0.036***
                   |                |                |                | [-0.055, -0.017]
Num.Obs.           | 1729           | 1727           | 1727           | 1297
R2                 | 0.672          | 0.928          | 0.930          | 0.945
R2 Within          |                | 0.150          | 0.142          | 0.194

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

95% CIs in brackets.

Column (1): HC1 standard errors.

Columns (2)–(4): clustered at country level.

Country and year fixed effects in columns (3) and (4).

Five rules for panel tables:

  1. Show the progression — pooled OLS to your preferred specification. The reader needs to see what each layer of controls does.
  2. Report CIs, not SEs — confidence intervals communicate both precision and the range of compatible effect sizes more directly than standard errors.
  3. Report which FEs are included — a footnote is cleaner than coefficient rows: “Country and year fixed effects in columns (3) and (4).”
  4. Report the within-\(R^2\) — use gof_map = c("nobs", "r.squared", "r2.within"). The overall \(R^2\) is inflated by the fixed effects.
  5. State the clustering — always note clustering level in the table footer.

5.2 Reporting confidence intervals in modelsummary

By default modelsummary shows standard errors below point estimates. To show 95% confidence intervals instead:

Code
modelsummary(
  list("Model" = m2),        # any fitted model; here the two-way FE model
  statistic  = "conf.int",   # [lower, upper] replaces (SE)
  conf_level = 0.95
)

To show the estimate and CI on the same line (common in reports):

Code
modelsummary(
  list("Model" = m2),        # any fitted model; here the two-way FE model
  estimate   = "{estimate} [{conf.low}, {conf.high}]",
  statistic  = NULL           # suppress the second row entirely
)

For a publication-ready flextable, add output = "flextable" and pipe through autofit():

Code
modelsummary(
  list("Model" = m2),        # any fitted model; here the two-way FE model
  statistic = "conf.int",
  conf_level = 0.95,
  output    = "flextable"
) |>
  autofit()

5.3 Writing up the result

A clean analytical paragraph based on the actual results:

Table 3 shows the relationship between log GDP per capita and life satisfaction across countries and years. The pooled OLS estimate (column 1) suggests a positive association (β = 0.81, 95% CI [0.79, 0.84]), but this conflates stable between-country differences with within-country income changes. Contrary to a naive reading of the Easterlin paradox, country fixed effects (column 2) reveal a stronger within-country relationship (β = 1.26, 95% CI [0.83, 1.69]): countries experiencing income growth show larger happiness gains than the cross-sectional gap between rich and poor countries would predict. Adding year fixed effects (column 3) increases the coefficient further to β = 1.55 (95% CI [0.94, 2.16]), as global time trends that simultaneously affected incomes and happiness everywhere are removed. Column 4 adds time-varying controls for inequality and unemployment; the income coefficient falls modestly to β = 1.38 (95% CI [0.67, 2.09]), driven mainly by the significant negative effect of unemployment rather than by Gini (which is not significant). The evidence points to a robust, positive within-country income–happiness relationship, with unemployment emerging as an important independent determinant of within-country happiness variation.


6 Summary

What we did:

  1. Established the cross-sectional income–happiness correlation and the Easterlin challenge
  2. Ran the standard FE progression: pooled OLS → country FE → two-way FE
  3. Clarified what each model estimates and what confounders remain
  4. Distinguished three estimands: total effect, controlled direct effect, indirect effect
  5. Added time-varying confounders (Gini, unemployment) as Stage 1 extensions
  6. Explained why conditioning on WHR variables (health, trust) is the wrong way to address mediator concerns — and what the right way (mediation analysis) would look like

Key takeaways:

  • The within-country income–happiness relationship is stronger than the cross-sectional comparison — the FE coefficient exceeds pooled OLS
  • Country FE eliminates all stable confounders; year FE removes common time shocks
  • Unemployment is a significant time-varying confounder; Gini is not significant in this panel (possibly due to limited within-country variation and missing data)
  • WHR social/health variables are mediators — adding them gives a controlled direct effect, not a cleaner total effect, and may introduce new bias
  • Mediation analysis is the right framework for decomposing pathways; it requires stronger assumptions and is a separate analytical step

In R: fixest::feols(y ~ x + controls | country + year, data = dat, vcov = ~country)


7 Further Reading

  • Békés, G. & Kézdi, G. (2021). Data Analysis for Business, Economics, and Policy. Cambridge UP. Ch. 24 (Panel data).
  • Huntington-Klein, N. (2022). The Effect. Ch. 16 (Fixed Effects). theeffectbook.net
  • Cunningham, S. (2021). Causal Inference: The Mixtape. Ch. 8 (Panel data). mixtape.scunning.com
  • Pearl, J. & Mackenzie, D. (2018). The Book of Why. Ch. 9 (Mediation) — accessible treatment of direct vs. indirect effects
  • Imai, K., Keele, L. & Tingley, D. (2010). A general approach to causal mediation analysis. Psychological Methods, 15(4), 309–334 — the standard reference for the mediation R package
  • Killingsworth, M. A., Kahneman, D., & Mellers, B. (2023). Income and emotional well-being: A conflict resolved. Proceedings of the National Academy of Sciences, 120(10), e2208661120. https://doi.org/10.1073/pnas.2208661120
  • World Happiness Report (annual, worldhappiness.report) — statistical appendices document their approach to decomposing happiness gaps across countries