---
title: "Session 7: Using AI Tools — Live Demo"
subtitle: "Advanced Data Science · Europa-Universität Flensburg"
author: "Claudius Gräbner-Radkowitsch"
date: "2026-06-11"
format:
html:
toc: true
toc-depth: 3
number-sections: true
code-fold: true
code-tools: true
embed-resources: true
execute:
echo: true
warning: false
message: false
execute-dir: file
---
## Business question
> *How does management quality predict wages, and how does this relationship vary by industry — and can we build and communicate this evidence more efficiently using AI tools?*
This session has two interlocking goals: understanding what AI tools are actually good for in a data science workflow, and using that understanding to build a publication-ready figure. The analysis in Part 2 uses simulated firm-level data with the same structure as the World Management Survey.
---
# Setup
## Packages
```{r}
#| label: setup
library(tidyverse)
library(fixest)
library(modelsummary)
library(flextable)
library(broom)
library(patchwork)
library(ggrepel)
library(gghighlight)
library(scales)
library(gapminder)
```
## Simulated WMS-style data
The demo uses simulated panel data structured like the WMS: 100 firms observed across 5 years, with a management quality score, log wage, and industry.
```{r}
#| label: fake-data
set.seed(42)
fake_panel <- tibble(
firm_id = rep(1:100, each = 5),
year = rep(2010:2014, times = 100),
mgmt = rnorm(500, mean = 3, sd = 0.8),
ln_wage = 2.5 + 0.3 * mgmt + rnorm(500, sd = 0.5),
# Draw industry once per firm so it stays constant across that firm's years
industry = rep(sample(c("Mfg", "Retail", "Finance"), 100, replace = TRUE), each = 5)
)
```
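Before fitting anything, it can help to confirm that the panel really has the stated shape (every firm observed once in every year). The following is a minimal base-R sketch; `is_balanced()` and `toy_panel` are hypothetical names introduced here for illustration and are not part of the session code.

```{r}
#| label: check-balanced-sketch
# Hypothetical helper: TRUE if every unit is observed exactly once in every period
is_balanced <- function(data, id, time) {
  counts <- table(data[[id]], data[[time]])
  all(counts == 1)
}

# Toy panel with the same layout as the simulated data: 100 firms x 5 years
toy_panel <- data.frame(
  firm_id = rep(1:100, each = 5),
  year    = rep(2010:2014, times = 100)
)

is_balanced(toy_panel, "firm_id", "year")        # TRUE
is_balanced(toy_panel[-1, ], "firm_id", "year")  # FALSE: one firm-year is missing
```

Applied to the session data, the call would be `is_balanced(fake_panel, "firm_id", "year")`.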
## Fixed-effects model
```{r}
#| label: model-fe
model_fe <- feols(
ln_wage ~ mgmt | firm_id + year,
cluster = ~firm_id,
data = fake_panel
)
modelsummary(
list("Management → Log wage" = model_fe),
statistic = "conf.int",
conf_level = 0.95,
gof_map = c("nobs", "r2.within"),
notes = "Fixed effects: firm, year. SEs clustered by firm.",
output = "flextable"
) |> autofit()
```
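For intuition about what `| firm_id + year` does, the sketch below uses base R only. It assumes a balanced panel and hypothetical toy data (`toy` and `demean_two_way()` are illustrative names), and shows that a regression with firm and year dummies yields the same slope as a regression on two-way demeaned variables. It illustrates the idea and makes no claim about how `fixest` computes its estimates internally.

```{r}
#| label: within-transformation-sketch
set.seed(1)

# Toy balanced panel: 20 firms x 5 years, with a firm-specific wage level
toy <- data.frame(firm = rep(1:20, each = 5), year = rep(1:5, times = 20))
firm_effect <- rnorm(20)
toy$x <- rnorm(100)
toy$y <- 0.3 * toy$x + firm_effect[toy$firm] + rnorm(100, sd = 0.5)

# Two-way within transformation (valid as written for balanced panels only)
demean_two_way <- function(v, g1, g2) v - ave(v, g1) - ave(v, g2) + mean(v)
toy$x_w <- demean_two_way(toy$x, toy$firm, toy$year)
toy$y_w <- demean_two_way(toy$y, toy$firm, toy$year)

# Slope from explicit dummies vs. slope from the demeaned variables
b_dummies <- coef(lm(y ~ x + factor(firm) + factor(year), data = toy))[["x"]]
b_within  <- coef(lm(y_w ~ 0 + x_w, data = toy))[["x_w"]]

all.equal(b_dummies, b_within)  # TRUE
```

Only variation within a firm over time, net of common year shocks, identifies the coefficient; anything constant within a firm is absorbed.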
---
# Part 1: AI workflow in practice
The structure behind every AI interaction in this session:
```
prompt → inspect → run → diagnose
↑ │
└──────────────────────────────┘
```
Read the output before running it. Understand why something fails before re-prompting. This cycle applies regardless of which mode you are using.
## Thought partner mode
**Prompt used:**
> I have a panel dataset of firm-level data — management quality scores and wages — across industries and years. I estimated a fixed-effects regression with `feols()` from the `fixest` package in R. I want to show how management quality relates to wages. What visualization types should I consider, and which would be most appropriate for presenting regression results in an economics paper?
A good response in thought partner mode surfaces multiple options and may engage with the panel structure and `feols()` output. The evaluation criterion: we want to show **point estimates with uncertainty**, which rules out plain scatter plots and bar charts showing means. The literature uses a **coefficient plot** (`geom_pointrange()`); a good response will suggest it, but the decision is yours.
Thought partner mode surfaces options efficiently. The domain judgment — which chart type fits the research question — remains yours.
## Package hallucination
**Prompt used:**
> I have a `feols()` model from the `fixest` package with clustered standard errors. I want to create a coefficient plot in ggplot2 that automatically handles the clustered SEs and adds confidence intervals. Is there a helper function in `fixest` or `modelsummary` that makes this easier to plot directly?
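The AI may suggest a function such as `fixest::ggcoef()`, `fixest::coefplot2()`, or `modelsummary::modelplot_se()`. None of these exist, yet the response describes their arguments in confident, plausible-sounding detail. The invented names sit close to real ones: `fixest::coefplot()` (base graphics) and `modelsummary::modelplot()` (ggplot2) do exist, which is part of what makes the hallucinated variants believable.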
**Test:**
```{r}
#| label: hallucination-test
#| error: true
fixest::ggcoef(model_fe)
```
The error is immediate. Next step: verify whether the function exists at all.
```{r}
#| eval: false
?fixest::ggcoef # no documentation found
??ggcoef # searches every installed package; no hit in fixest
```
**The correct approach** — `broom::tidy()` extracts estimates and confidence intervals; `geom_pointrange()` plots them:
```{r}
#| label: correct-coefplot
tidy_model <- tidy(model_fe, conf.int = TRUE)
ggplot(tidy_model,
aes(x = estimate, y = term,
xmin = conf.low, xmax = conf.high)) +
geom_pointrange() +
geom_vline(xintercept = 0, linetype = "dashed", color = "grey50") +
theme_classic(base_size = 11) +
labs(
x = "Coefficient estimate (95% CI)",
y = NULL,
caption = "Fixed effects: firm and year. Clustered SEs by firm."
)
```
AI confidence is not evidence of a function's existence. Before running any suggested package or function, check with `?`.
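The same check can be scripted, which is handy when a response suggests several functions at once. This is a minimal sketch using base R only; `function_exists()` is a hypothetical helper name, not a function from any package.

```{r}
#| label: check-function-exists-sketch
# Hypothetical helper: TRUE only if `pkg` is installed and exports `fn`
function_exists <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

function_exists("stats", "lm")       # TRUE: a real function
function_exists("stats", "ggcoef")   # FALSE: not exported by stats
function_exists("fixest", "ggcoef")  # FALSE: the hallucinated helper tested above
```

The helper returns `FALSE` both when the package is missing and when the function is missing, so a `FALSE` always means "do not run this yet".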
## Stale API
**Prompt used:**
> In ggplot2, I want to draw a line plot with a line thickness of 1.5. What argument do I use in `geom_line()`?
**What AI emits:** `geom_line(size = 1.5)`. This is stale: the `size` aesthetic for lines was deprecated in ggplot2 3.4.0 in favour of `linewidth`.
```{r}
#| label: stale-api
#| warning: true
ggplot(gapminder |> filter(country == "Germany"),
aes(x = year, y = lifeExp)) +
geom_line(size = 1.5) # deprecated since ggplot2 3.4.0
```
The deprecation warning names the correct replacement exactly. The fix:
```{r}
#| label: correct-linewidth
ggplot(gapminder |> filter(country == "Germany"),
aes(x = year, y = lifeExp)) +
geom_line(linewidth = 1.5) # correct
```
The model was trained largely on code that pre-dates this API change, and it cannot know what changed after its training cutoff. Check recently changed arguments against `?geom_line` and the package changelog (`news(package = "ggplot2")`) before trusting them.
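Whether a suggestion is stale depends on the version installed on your machine, so it is worth checking that version directly. The sketch below is one possible way to do so; `line_thickness_arg()` is a hypothetical helper, and the 3.4.0 threshold is taken from the deprecation warning quoted above.

```{r}
#| label: check-version-sketch
# Hypothetical helper: which argument sets line thickness for a given ggplot2 version
line_thickness_arg <- function(version) {
  if (as.numeric_version(version) >= "3.4.0") "linewidth" else "size"
}

line_thickness_arg("3.3.6")  # "size"
line_thickness_arg("3.4.0")  # "linewidth"

# Ask the installed version, if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  line_thickness_arg(packageVersion("ggplot2"))
}
```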
## Code reviewer mode
**Flawed figure submitted for review:**
```{r}
#| label: flawed-figure
gapminder |>
filter(year == 2007) |>
ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
geom_point() +
geom_smooth()
```
**Prompt used:**
> Here is a figure I made for a paper in economics. What would a peer reviewer or journal editor say about it? Be specific.
**What good feedback looks like:** The AI correctly flags missing axis labels with units, no log scale on GDP per capita (which distorts the relationship), no source caption, the default grey theme, and no method specified in `geom_smooth()`. These map directly onto the 7-item checklist.
The AI may also critique the point alpha or size as stylistic preferences. Evaluate all feedback against the checklist rather than accepting it wholesale.
**The improved figure:**
```{r}
#| label: improved-figure
gapminder |>
filter(year == 2007) |>
ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
geom_point(alpha = 0.7, size = 2) +
geom_smooth(method = "lm", se = TRUE, linewidth = 0.8) +
scale_x_log10(labels = label_dollar()) +
scale_color_viridis_d(name = NULL) +
theme_classic(base_size = 11) +
labs(
x = "GDP per capita (USD, log scale)",
y = "Life expectancy (years)",
caption = "Source: Gapminder. Year: 2007."
)
```
Code reviewer mode gives useful, actionable feedback on well-established standards. Evaluate selectively: some feedback will be correct, some stylistic preference.
---
# Part 2: Publication-ready figure
## Research question and chart type decision
Decided *before* opening the AI:
**Research question:** How does management quality predict wages, and how does this relationship vary by industry?
**Chart type decision:**
- *Panel A* (the overall coefficient estimate with uncertainty): a coefficient plot (`geom_pointrange()`)
- *Panel B* (how the bivariate relationship varies across industries): a faceted scatter plot with per-industry regression lines
Two panels together tell the complete story: the aggregate effect and the heterogeneity underneath it.
## Panel A — Coefficient plot
```{r}
#| label: panel-a
p_a <- tidy_model |>
ggplot(aes(x = estimate, y = term,
xmin = conf.low, xmax = conf.high)) +
geom_pointrange(color = "#1e4a7b") +
geom_vline(xintercept = 0, linetype = "dashed", color = "grey60") +
theme_classic(base_size = 11) +
labs(
x = "Coefficient (95% CI)",
y = NULL,
title = "Panel A: Fixed-effects estimate"
)
p_a
```
::: {.callout-note}
Inspect the `tidy_model` data frame before plotting: check that `term` contains what you expect and that the confidence intervals look plausible. The inspect step applies to AI-generated code and to your own.
:::
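The inspection described in the note can be turned into explicit assertions. The following is a sketch under stated assumptions: `check_tidy()` is a hypothetical helper, and `example_tidy` holds illustrative values laid out like the output of `broom::tidy(conf.int = TRUE)`, not results from the session model.

```{r}
#| label: inspect-tidy-sketch
# Hypothetical helper: stop with a message if a tidy coefficient table looks wrong
check_tidy <- function(tidy_df, expected_terms) {
  cols <- c("estimate", "conf.low", "conf.high")
  stopifnot(
    "unexpected terms" = setequal(tidy_df$term, expected_terms),
    "missing estimates or intervals" = all(complete.cases(tidy_df[cols])),
    "estimate outside its interval" =
      all(tidy_df$conf.low <= tidy_df$estimate &
            tidy_df$estimate <= tidy_df$conf.high)
  )
  invisible(TRUE)
}

# Illustrative values only
example_tidy <- data.frame(
  term = "mgmt", estimate = 0.28, conf.low = 0.22, conf.high = 0.35
)
check_tidy(example_tidy, expected_terms = "mgmt")
```

In the session, the equivalent call would be `check_tidy(tidy_model, expected_terms = "mgmt")`, run before building Panel A.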
## Panel B — Relationship by industry
The first AI-scaffolded draft combined `gghighlight()` with `facet_wrap()`. Once the plot was run, the highlighting turned out to be visually redundant with the facets. That was a judgment call that required looking at the output, not re-prompting.
```{r}
#| label: panel-b
p_b <- fake_panel |>
ggplot(aes(x = mgmt, y = ln_wage)) +
geom_point(alpha = 0.4, size = 1.5, color = "#1e4a7b") +
geom_smooth(method = "lm", se = TRUE,
linewidth = 0.9, color = "#2d8a4e") +
facet_wrap(~industry) +
theme_classic(base_size = 11) +
theme(
strip.background = element_blank(),
strip.text = element_text(face = "bold")
) +
labs(
x = "Management score",
y = "Log wage",
title = "Panel B: Relationship by industry"
)
p_b
```
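The line that `geom_smooth(method = "lm")` draws in each facet is a separate OLS fit, so the per-industry slopes can also be tabulated with their confidence intervals. A minimal base-R sketch on hypothetical toy data follows (`toy` and `slope_ci()` are illustrative names). Note that these are plain OLS slopes without fixed effects, matching the facet lines rather than the `feols()` model.

```{r}
#| label: slopes-by-industry-sketch
set.seed(7)

# Toy data: three industries, same data-generating process in each
toy <- data.frame(
  industry = rep(c("Finance", "Mfg", "Retail"), each = 50),
  mgmt     = rnorm(150, mean = 3, sd = 0.8)
)
toy$ln_wage <- 2.5 + 0.3 * toy$mgmt + rnorm(150, sd = 0.5)

# Hypothetical helper: slope of ln_wage on mgmt with a 95% confidence interval
slope_ci <- function(d) {
  fit <- lm(ln_wage ~ mgmt, data = d)
  ci  <- confint(fit, "mgmt", level = 0.95)
  data.frame(estimate = coef(fit)[["mgmt"]],
             conf.low = ci[1, 1], conf.high = ci[1, 2])
}

# One row per industry
by_industry <- do.call(rbind, lapply(split(toy, toy$industry), slope_ci))
by_industry$industry <- rownames(by_industry)
by_industry
```

A table in this shape could be passed to `geom_pointrange()` with `y = industry` if the heterogeneity should be shown as estimates with uncertainty instead of fitted lines.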
## Composed figure — patchwork
```{r}
#| label: composed-figure
# Assign the annotated patchwork so that ggsave() exports the tagged, captioned version
p_combined <- (p_a | p_b) +
  plot_annotation(
    tag_levels = "A",
    caption = "Source: Simulated WMS-style data. Fixed effects: firm, year. SEs clustered by firm."
  ) &
  theme(plot.tag = element_text(face = "bold"))
p_combined
```
::: {.callout-note}
**`&` vs `+` in patchwork.** The `&` operator applies a theme element to *all* panels simultaneously. Using `+` instead applies it only to the last panel. AI often emits `+` here. Neither operator raises an error, so the mistake is caught only in the inspect step, by looking at the rendered figure.
:::
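The difference is easiest to see side by side. The sketch below uses the built-in `mtcars` data and two throwaway plots (`p1`, `p2`, `with_plus`, and `with_amp` are illustrative names), so it runs independently of the panels above.

```{r}
#| label: amp-vs-plus-sketch
library(ggplot2)
library(patchwork)

p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()

with_plus <- (p1 | p2) + theme_classic()  # restyles only the last panel (p2)
with_amp  <- (p1 | p2) & theme_classic()  # restyles both panels

with_plus / with_amp  # stack both versions to compare them
```

In the top row only the right-hand panel loses the grey background; in the bottom row both do.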
## Export
```{r}
#| label: export
#| eval: false
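# ggsave() does not create a missing output folder by default
dir.create("figures", showWarnings = FALSE)
ggsave(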
filename = "figures/session07_demo_figure.pdf",
plot = p_combined,
width = 8,
height = 4,
units = "in",
device = cairo_pdf
)
```
`cairo_pdf` embeds fonts correctly for most journal submission systems. Always specify `width`, `height`, and `units` explicitly; the RStudio Export button sizes the file from the current plot pane instead, so its output is not reproducible.
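Not every R installation is built with Cairo support, so a script shared with co-authors may need a fallback. The sketch below is one possible pattern, not a requirement of `ggsave()`: it writes a throwaway plot to a temporary file, and `pdf_device` and `out_file` are illustrative names.

```{r}
#| label: export-fallback-sketch
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Use cairo_pdf when this R build supports Cairo, otherwise the default pdf device
pdf_device <- if (unname(capabilities("cairo"))) cairo_pdf else "pdf"

out_file <- tempfile(fileext = ".pdf")
ggsave(out_file, plot = p, width = 8, height = 4, units = "in", device = pdf_device)

file.exists(out_file)  # confirm that the file was written
```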
---
# Key takeaways
**AI workflow**
- Thought partner mode surfaces options efficiently; the domain judgment is yours
- Package hallucination is structural — verify function existence with `?` before running
- Stale API knowledge is unavoidable — check recently-changed arguments against documentation
- Code reviewer mode gives actionable feedback on established standards; evaluate selectively
- The invariant habit across all modes: **prompt → inspect → run → diagnose**
**Publication-ready figures**
- Choose the chart type before prompting; without guidance, AI tends to default to bar charts
- As a rough rule of thumb, AI scaffolds about 70% of a publication-ready figure; the rest requires visual judgment
- `broom::tidy(conf.int = TRUE)` + `geom_pointrange()` is the standard coefficient plot workflow in R
- In `patchwork`: use `&` (not `+`) when applying theme elements across all panels
- `ggsave()` with explicit dimensions and `device = cairo_pdf` for submission-compatible output
---
# Further reading
- Wilke, C.O. (2019). *Fundamentals of Data Visualization*. O'Reilly. [clauswilke.com/dataviz](https://clauswilke.com/dataviz/) — basis of the ugly/bad/wrong taxonomy
- Healy, K. (2018). *Data Visualization: A Practical Introduction*. Princeton UP. [socviz.co](https://socviz.co/)
- `patchwork` documentation: [patchwork.data-imaginist.com](https://patchwork.data-imaginist.com)
- `ggrepel` documentation: [cran.r-project.org/package=ggrepel](https://cran.r-project.org/package=ggrepel)
- `broom` documentation: [broom.tidymodels.org](https://broom.tidymodels.org)