Schedule & Overview

General reference

The main textbook for this course is:

TBD

Complementary references for R and the tidyverse:

Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science (2nd ed.). O’Reilly. https://r4ds.hadley.nz/

Ismay, C., & Kim, A. Y.-S. (2020). Statistical inference via data science: A ModernDive into R and the tidyverse. CRC Press/Taylor & Francis Group. https://moderndive.com/index.html

For more advanced material on programming in R and on statistical learning, I recommend the following:

Wickham, H. (2019). Advanced R (2nd ed.). CRC Press/Taylor & Francis Group. https://adv-r.hadley.nz/

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning: With applications in R (2nd ed.). Springer. https://www.statlearning.com/


Full schedule

| # | Date | Title | Communication skill |
|---|------|-------|---------------------|
| 1 | Thu, 12 Mar | Welcome, recap & tooling upgrade | Document structure, YAML, BibTeX bibliography |
|   | Thu, 19 Mar | Take-home task 1 (part 1) | First full Quarto report submitted independently |
|   | Thu, 26 Mar | Take-home task 1 (part 2) | |
|   | Thu, 02 Apr | Easter break | |
| 2 | Thu, 09 Apr | Multiple regression: going beyond the basics | Regression tables with modelsummary |
|   | Thu, 16 Apr | Take-home task 2 (part 1) | Regression table formatting, model comparison |
|   | Thu, 23 Apr | Take-home task 2 (part 2) | |
| 3 | Thu, 30 Apr | Modelling binary and categorical outcomes | Inline R code for automatic result reporting |
| 4 | Thu, 07 May | What can go wrong: biases and diagnostics | Diagnostic plots, figure captions, cross-references |
|   | Thu, 14 May | No lecture | |
| 5 | Thu, 21 May | Causation vs. correlation: thinking like an economist | Structuring an analytical narrative |
| 6 | Thu, 28 May | Panel data and fixed effects | Panel model output with fixest |
| 7 | Thu, 04 Jun | Coding smarter: R and AI tools | AI-assisted Quarto: debugging, improving prose |
| 8 | Thu, 11 Jun | Communicating data: advanced visualization | Publication-ready figures, multi-format export, Word output |
| 9 | Thu, 18 Jun | Recap and looking ahead | |

Datasets

Most datasets used in this course come from the open-access repository of Békés & Kézdi (2021), available at gabors-data-analysis.com. Additional data is drawn from publicly available sources including the World Bank World Development Indicators and Eurostat.

| Session | Dataset | Business question |
|---------|---------|-------------------|
| 2 | hotels-vienna | What features drive hotel prices in Vienna? |
| 3 | bisnode-firms | Which firms are likely to exit the market? |
| 4 | hotels-europe | Do regression assumptions hold across 46 cities? |
| 5 | cps-earnings | What explains the gender wage gap? |
| 6 | wms-management-survey | Does management quality predict firm performance? |
| 7 | working-from-home | Does WFH improve employee performance? |
| 8 | worldbank-lifeexpectancy + worldbank-immunization | How do health outcomes relate to income globally? |
| Task 1 | cps-earnings | What determines weekly earnings for market research analysts? |
| Task 2 | hotels-europe | What drives hotel prices across European cities, and does distance to the city centre matter more in capitals? |

Take-home tasks

Two mandatory tasks are assigned for the periods when the instructor is unavailable. Each is submitted as a rendered Quarto HTML report and graded pass/fail. The tasks use datasets separate from those of the in-person sessions.

| Task | Assigned | Due |
|------|----------|-----|
| Task 1: Earnings, age, and hours worked | End of Session 1 (12 Mar) | 02 Apr, 23:59 |
| Task 2: Hotel prices across European cities | End of Session 2 (09 Apr) | 29 Apr, 23:59 |