Session 7: Using AI Tools — Live Demo

Advanced Data Science · Europa-Universität Flensburg

Author

Claudius Gräbner-Radkowitsch

Published

11 06 2026

0.1 Business question

How does management quality predict wages, and how does this relationship vary by industry — and can we build and communicate this evidence more efficiently using AI tools?

This session has two interlocking goals: understanding what AI tools are actually good for in a data science workflow, and using that understanding to build a publication-ready figure. The analysis in Part 2 uses simulated firm-level data with the same structure as the World Management Survey.


1 Setup

1.1 Packages

Code
library(tidyverse)
library(fixest)
library(modelsummary)
library(flextable)
library(broom)
library(patchwork)
library(ggrepel)
library(gghighlight)
library(scales)
library(gapminder)

1.2 Simulated WMS-style data

The demo uses simulated panel data structured like the WMS: 100 firms observed across 5 years, with a management quality score, log wage, and industry.

Code
set.seed(42)
fake_panel <- tibble(
  firm_id  = rep(1:100, each = 5),
  year     = rep(2010:2014, times = 100),
  mgmt     = rnorm(500, mean = 3, sd = 0.8),
  ln_wage  = 2.5 + 0.3 * mgmt + rnorm(500, sd = 0.5),
  industry = sample(c("Mfg", "Retail", "Finance"), 500, replace = TRUE)
)
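
In the simulation above, industry is sampled once per row, so the same firm can land in different industries in different years. If a firm-constant industry is wanted, one option is to draw one industry per firm and repeat it across years. The sketch below is only an illustration: the names fake_panel_firm and firm_industry are hypothetical, and mgmt and ln_wage are drawn in the same order as above so that they should match the original simulation.

```r
library(tibble)
library(dplyr)

set.seed(42)

# Same draws as the original simulation, but without the industry column
fake_panel_firm <- tibble(
  firm_id = rep(1:100, each = 5),
  year    = rep(2010:2014, times = 100),
  mgmt    = rnorm(500, mean = 3, sd = 0.8),
  ln_wage = 2.5 + 0.3 * mgmt + rnorm(500, sd = 0.5)
)

# Hypothetical variant: one industry per firm, repeated over the 5 years
firm_industry <- sample(c("Mfg", "Retail", "Finance"), 100, replace = TRUE)
fake_panel_firm <- fake_panel_firm |>
  mutate(industry = firm_industry[firm_id])

# Check: every firm should now belong to exactly one industry
industries_per_firm <- fake_panel_firm |>
  group_by(firm_id) |>
  summarise(n_industries = n_distinct(industry), .groups = "drop")

all(industries_per_firm$n_industries == 1)
```

The rest of the session keeps the row-level draw; this variant only matters if industry is meant to be a fixed firm characteristic.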

1.3 Fixed-effects model

Code
model_fe <- feols(
  ln_wage ~ mgmt | firm_id + year,
  cluster = ~firm_id,
  data    = fake_panel
)

modelsummary(
  list("Management → Log wage" = model_fe),
  statistic  = "conf.int",
  conf_level = 0.95,
  gof_map    = c("nobs", "r2.within"),
  notes      = "Fixed effects: firm, year. SEs clustered by firm.",
  output     = "flextable"
) |> autofit()

               Management → Log wage
mgmt           0.284
               [0.222, 0.345]
Num.Obs.       500
R2 Within      0.159

Fixed effects: firm, year. SEs clustered by firm.


2 Part 1: AI workflow in practice

The structure behind every AI interaction in this session:

prompt  →  inspect  →  run  →  diagnose
   ↑                              │
   └──────────────────────────────┘

Read the output before running it. Understand why something fails before re-prompting. This cycle applies regardless of which mode you are using.

2.1 Thought partner mode

Prompt used:

I have a panel dataset of firm-level data — management quality scores and wages — across industries and years. I estimated a fixed-effects regression with feols() from the fixest package in R. I want to show how management quality relates to wages. What visualization types should I consider, and which would be most appropriate for presenting regression results in an economics paper?

A good response in thought partner mode surfaces multiple options and may engage with the panel structure and feols() output. The evaluation criterion: we want to show point estimates with uncertainty, which rules out plain scatter plots and bar charts showing means. The literature uses a coefficient plot (geom_pointrange()); a good response will suggest it, but the decision is yours.

Thought partner mode surfaces options efficiently. The domain judgment — which chart type fits the research question — remains yours.

2.2 Package hallucination

Prompt used:

I have a feols() model from the fixest package with clustered standard errors. I want to create a coefficient plot in ggplot2 that automatically handles the clustered SEs and adds confidence intervals. Is there a helper function in fixest or modelsummary that makes this easier to plot directly?

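The AI may suggest a function such as fixest::ggcoef(), fixest::coefplot2(), or modelsummary::modelplot_se(). None of these exist, although they resemble real functions such as fixest::coefplot() and modelsummary::modelplot(), which is part of what makes them convincing. The arguments are described in confident, plausible-sounding detail.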

Test:

Code
fixest::ggcoef(model_fe)
Error: 'ggcoef' is not an exported object from 'namespace:fixest'

The error is immediate. Next step: verify whether the function exists at all.

Code
?fixest::ggcoef   # no documentation found
??ggcoef          # not in fixest

The correct approach: broom::tidy() extracts estimates and confidence intervals; geom_pointrange() plots them:

Code
tidy_model <- tidy(model_fe, conf.int = TRUE)

ggplot(tidy_model,
       aes(x = estimate, y = term,
           xmin = conf.low, xmax = conf.high)) +
  geom_pointrange() +
  geom_vline(xintercept = 0, linetype = "dashed", color = "grey50") +
  theme_classic(base_size = 11) +
  labs(
    x       = "Coefficient estimate (95% CI)",
    y       = NULL,
    caption = "Fixed effects: firm and year. Clustered SEs by firm."
  )

AI confidence is not evidence of a function’s existence. Before running any suggested package or function, check with ?.
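
The same check can be done programmatically, which is handy when an AI response mentions several functions at once. The helper below is a minimal sketch: is_exported() is a hypothetical name, not part of any package, and it relies only on base R's requireNamespace() and getNamespaceExports().

```r
# Hypothetical helper: TRUE only if package `pkg` is installed
# and exports an object called `fn`
is_exported <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

# Demonstrated on a base package so the example runs anywhere
is_exported("stats", "lm")       # TRUE: lm() is exported by stats
is_exported("stats", "ggcoef")   # FALSE: no such export

# With fixest installed, the suggested function could be checked the same way:
# is_exported("fixest", "ggcoef")
```

A FALSE result only says the name is not exported by that package; a similarly named function may still exist elsewhere, so a ?? search is a sensible follow-up.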

2.3 Stale API

Prompt used:

In ggplot2, I want to draw a line plot with a line thickness of 1.5. What argument do I use in geom_line()?

What AI emits: geom_line(size = 1.5). This is stale: the size aesthetic for lines was deprecated in ggplot2 3.4.0 in favour of linewidth.

Code
ggplot(gapminder |> filter(country == "Germany"),
       aes(x = year, y = lifeExp)) +
  geom_line(size = 1.5)   # deprecated since ggplot2 3.4.0
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.

The deprecation warning names the correct replacement exactly. The fix:

Code
ggplot(gapminder |> filter(country == "Germany"),
       aes(x = year, y = lifeExp)) +
  geom_line(linewidth = 1.5)   # correct

Much of the code the model was trained on pre-dates this API change, so the old argument dominates its suggestions, and it cannot know what changed after its training cutoff. Check recently changed arguments against ?geom_line before trusting them.
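
When code has to run on machines with different ggplot2 versions, the installed version can be checked before choosing the argument. This is a minimal sketch under that assumption; has_linewidth and line_layer are illustrative names, and for a single up-to-date installation simply writing linewidth is enough.

```r
library(ggplot2)

# linewidth was introduced for lines in ggplot2 3.4.0
has_linewidth <- packageVersion("ggplot2") >= "3.4.0"

# Pick the argument that matches the installed version
line_layer <- if (has_linewidth) {
  geom_line(linewidth = 1.5)
} else {
  geom_line(size = 1.5)
}

# Built-in economics data, used here only to keep the example self-contained
p_line <- ggplot(economics, aes(x = date, y = unemploy)) + line_layer
```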

2.4 Code reviewer mode

Flawed figure submitted for review:

Code
gapminder |>
  filter(year == 2007) |>
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth()

Prompt used:

Here is a figure I made for a paper in economics. What would a peer reviewer or journal editor say about it? Be specific.

What good feedback looks like: The AI correctly flags missing axis labels with units, no log scale on GDP per capita (which distorts the relationship), no source caption, the default grey theme, and no method specified in geom_smooth(). These map directly onto the 7-item checklist.

The AI may also critique the point alpha or size as stylistic preferences. Evaluate all feedback against the checklist rather than accepting it wholesale.

The improved figure:

Code
gapminder |>
  filter(year == 2007) |>
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.7, size = 2) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 0.8) +
  scale_x_log10(labels = label_dollar()) +
  scale_color_viridis_d(name = NULL) +
  theme_classic(base_size = 11) +
  labs(
    x       = "GDP per capita (USD, log scale)",
    y       = "Life expectancy (years)",
    caption = "Source: Gapminder. Year: 2007."
  )

Code reviewer mode gives useful, actionable feedback on well-established standards. Evaluate selectively: some feedback will be correct, some stylistic preference.


3 Part 2: Publication-ready figure

3.1 Research question and chart type decision

Decided before opening the AI:

Research question: How does management quality predict wages, and how does this relationship vary by industry?

Chart type decision:

  • Panel A: the overall coefficient estimate with uncertainty, shown as a coefficient plot (geom_pointrange())
  • Panel B: how the bivariate relationship varies across industries, shown as a faceted scatter with per-industry regression lines

Two panels together tell the complete story: the aggregate effect and the heterogeneity underneath it.

3.2 Panel A — Coefficient plot

Code
p_a <- tidy_model |>
  ggplot(aes(x = estimate, y = term,
             xmin = conf.low, xmax = conf.high)) +
  geom_pointrange(color = "#1e4a7b") +
  geom_vline(xintercept = 0, linetype = "dashed", color = "grey60") +
  theme_classic(base_size = 11) +
  labs(
    x = "Coefficient (95% CI)",
    y = NULL,
    title = "Panel A: Fixed-effects estimate"
  )

p_a

Note

Inspect the tidy_model data frame before plotting: check that term contains what you expect and that the confidence intervals look plausible. The inspect step applies to AI-generated code and to your own.
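
A minimal sketch of what that inspection might look like. To keep the example self-contained it re-simulates the data and refits the model under the hypothetical names demo_panel, demo_fit, and demo_tidy; in the session itself the same checks would be applied to tidy_model.

```r
library(fixest)
library(broom)

set.seed(42)
demo_panel <- data.frame(
  firm_id = rep(1:100, each = 5),
  year    = rep(2010:2014, times = 100),
  mgmt    = rnorm(500, mean = 3, sd = 0.8)
)
demo_panel$ln_wage <- 2.5 + 0.3 * demo_panel$mgmt + rnorm(500, sd = 0.5)

demo_fit <- feols(
  ln_wage ~ mgmt | firm_id + year,
  cluster = ~firm_id,
  data    = demo_panel
)
demo_tidy <- tidy(demo_fit, conf.int = TRUE)

# Look at the data frame before plotting it
print(demo_tidy)

# Sanity checks: only the expected term, and intervals that bracket the estimate
stopifnot(
  all(demo_tidy$term == "mgmt"),
  all(demo_tidy$conf.low <= demo_tidy$estimate),
  all(demo_tidy$estimate <= demo_tidy$conf.high)
)
```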

3.3 Panel B — Relationship by industry

The first AI-scaffolded draft combined gghighlight() with facet_wrap(). After running it, the highlight turned out to be visually redundant with the facets. That was a judgment call that required looking at the output, not re-prompting.

Code
p_b <- fake_panel |>
  ggplot(aes(x = mgmt, y = ln_wage)) +
  geom_point(alpha = 0.4, size = 1.5, color = "#1e4a7b") +
  geom_smooth(method = "lm", se = TRUE,
              linewidth = 0.9, color = "#2d8a4e") +
  facet_wrap(~industry) +
  theme_classic(base_size = 11) +
  theme(
    strip.background = element_blank(),
    strip.text       = element_text(face = "bold")
  ) +
  labs(
    x = "Management score",
    y = "Log wage",
    title = "Panel B: Relationship by industry"
  )

p_b
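
The line in each facet is a separate OLS fit of log wage on management score, so the slopes behind Panel B can also be printed as numbers. The sketch below uses base R only and re-simulates the data under the hypothetical names demo_panel and slopes. Because the simulation uses the same slope (0.3) for every industry, the three estimates should differ only by sampling noise.

```r
set.seed(42)
demo_panel <- data.frame(
  mgmt     = rnorm(500, mean = 3, sd = 0.8),
  noise    = rnorm(500, sd = 0.5),
  industry = sample(c("Mfg", "Retail", "Finance"), 500, replace = TRUE)
)
demo_panel$ln_wage <- 2.5 + 0.3 * demo_panel$mgmt + demo_panel$noise

# One simple regression per industry, mirroring geom_smooth(method = "lm") per facet
slopes <- vapply(
  split(demo_panel, demo_panel$industry),
  function(d) unname(coef(lm(ln_wage ~ mgmt, data = d))["mgmt"]),
  numeric(1)
)

round(slopes, 3)
```

These are pooled bivariate slopes without firm or year fixed effects, so they are not directly comparable to the coefficient in Panel A.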

3.4 Composed figure — patchwork

Code
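# Assign the annotated patchwork so that ggsave() exports the tags and caption too.
# The parentheses matter: | binds less tightly than + and &.
p_combined <- (p_a | p_b) +
  plot_annotation(
    tag_levels = "A",
    caption    = "Source: Simulated WMS-style data. Fixed effects: firm, year. SEs clustered by firm."
  ) &
  theme(plot.tag = element_text(face = "bold"))

p_combined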

Note

& vs + in patchwork. The & operator applies a theme element to all panels simultaneously. Using + instead applies it only to the last panel. AI often emits + here. This is a common patchwork mistake, and it is caught in the inspect step, not from an error message.
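
The difference is easiest to see side by side. The sketch below uses the built-in mtcars data and the hypothetical names p1, p2, with_plus, and with_amp; printing the two objects shows which panels pick up the theme.

```r
library(ggplot2)
library(patchwork)

p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point()

# `+` adds the theme to the last panel only (p2); p1 keeps the default grey theme
with_plus <- (p1 | p2) + theme_classic()

# `&` adds the theme to every panel in the patchwork
with_amp <- (p1 | p2) & theme_classic()

# Print each one and compare the left-hand panel
with_plus
with_amp
```

No error or warning is raised in either case, which is why the mistake only shows up when the figure is actually inspected.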

3.5 Export

Code
# Make sure the output folder exists before saving
dir.create("figures", showWarnings = FALSE)

ggsave(
  filename = "figures/session07_demo_figure.pdf",
  plot     = p_combined,
  width    = 8,
  height   = 4,
  units    = "in",
  device   = cairo_pdf
)

cairo_pdf embeds fonts correctly for most journal submission systems. Always specify width, height, and units explicitly — the RStudio Export button does not.


4 Key takeaways

AI workflow

  • Thought partner mode surfaces options efficiently; the domain judgment is yours
  • Package hallucination is structural — verify function existence with ? before running
  • Stale API knowledge is unavoidable — check recently-changed arguments against documentation
  • Code reviewer mode gives actionable feedback on established standards; evaluate selectively
  • The invariant habit across all modes: prompt → inspect → run → diagnose

Publication-ready figures

  • Choose the chart type before prompting; AI often defaults to bar charts
  • AI scaffolds ~70% of a publication-ready figure; the remaining 30% requires visual judgment
  • broom::tidy(conf.int = TRUE) + geom_pointrange() is the standard coefficient plot workflow in R
  • In patchwork: use & (not +) when applying theme elements across all panels
  • ggsave() with explicit dimensions and device = cairo_pdf for submission-compatible output

5 Further reading