Schedule & Overview
General reference
The main textbook for this course is: [1]
Békés, G., & Kézdi, G. (2021). Data Analysis for Business, Economics, and Policy. Cambridge University Press. https://gabors-data-analysis.com/
Complementary references for R and the tidyverse:
Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science (2nd ed.). O’Reilly. https://r4ds.hadley.nz/
Ismay, C., & Kim, A. Y.-S. (2020). Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. CRC Press/Taylor & Francis Group. https://moderndive.com/index.html
For more advanced material on programming in R and on statistical learning, I recommend the following:
Wickham, H. (2019). Advanced R (2nd ed.). CRC Press/Taylor & Francis Group. https://adv-r.hadley.nz/
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: With Applications in R (2nd ed.). Springer. https://www.statlearning.com/
Full schedule
| # | Date | Title | Communication skill |
|---|---|---|---|
| 1 | Thu, 12 Mar | Welcome, recap & tooling upgrade | Document structure, YAML, BibTeX bibliography |
| — | Thu, 19 Mar | Take-home task 1 (part 1) | First full Quarto report submitted independently |
| — | Thu, 26 Mar | Take-home task 1 (part 2) | |
| — | Thu, 02 Apr | Easter break | |
| 2 | Thu, 09 Apr | Multiple regression: going beyond the basics | Regression tables with modelsummary |
| — | Thu, 16 Apr | Take-home task 2 (part 1) | Regression table formatting, model comparison |
| — | Thu, 23 Apr | Take-home task 2 (part 2) | |
| - | Thu, 30 Apr | Canceled; replacement date TBD | |
| 3 | Thu, 07 May | Modelling binary and categorical outcomes | Inline R code for automatic result reporting |
| — | Thu, 14 May | No lecture | |
| 4 | Thu, 21 May | What can go wrong: biases and diagnostics | Diagnostic plots, figure captions, cross-references |
| 5 | Thu, 28 May | Causation vs. correlation | TBD |
| 6 | Thu, 04 Jun | Panel data and fixed effects | Panel model output with fixest |
| 7 | Thu, 11 Jun | Using AI tools: advanced visualizations as an example | Publication-ready figures; AI-assisted Quarto: debugging, improving prose |
| 8 | Thu, 18 Jun | Recap and looking ahead | |
The exam takes place in room MAD 131. It lasts 120 minutes; please be in the room by 15:50 at the latest. It is an open-book exam written via Moodle. Please make sure you register for the exam on time!
The exam takes place in room HEL 165. It lasts 120 minutes; please be in the room by 15:50 at the latest. It is an open-book exam written via Moodle. Please make sure you register for the exam on time!
Datasets
Most datasets used in this course come from the open-access repository of Békés & Kézdi (2021), available at gabors-data-analysis.com. Additional data are drawn from publicly available sources, including the World Bank World Development Indicators and Eurostat.
| Session | Dataset | Business Question |
|---|---|---|
| 2 | hotels-vienna | What features drive hotel prices in Vienna? |
| 3 | bisnode-firms | Which firms are likely to exit the market? |
| 4 | hotels-europe | Do regression assumptions hold across 46 cities? |
| 5 | cps-earnings | What explains the gender wage gap? |
| 6 | wms-management-survey | Does management quality predict firm performance? |
| 7 | working-from-home | Does WFH improve employee performance? |
| 8 | worldbank-lifeexpectancy + world-bank-immunization | How do health outcomes relate to income globally? |
| Task 1 | cps-earnings | What determines weekly earnings for market research analysts? |
| Task 2 | hotels-europe | What drives hotel prices across European cities, and does distance to the city centre matter more in capitals? |
Take-home tasks
Two mandatory tasks are assigned for the weeks when the instructor is unavailable and are submitted as rendered Quarto HTML reports. They are graded pass/fail and use datasets separate from those of the in-person sessions.
| Task | Assigned | Due |
|---|---|---|
| Task 1 — Earnings, age, and hours worked | End of Session 1 (12 Mar) | 02 Apr, 23:59 |
| Task 2 — Hotel prices across European cities | End of Session 2 (09 Apr) | 29 Apr, 23:59 |
Footnotes
[1] There is no need to buy the book; the materials provided here for each session are self-contained. Still, it's a good book that I recommend reading.