Session 1 — Exercise
Setting Up a Reproducible Workflow
Your task today
Scenario: You are a junior analyst at an international consultancy. Your team has been asked to prepare a short briefing on global economic development — specifically, how income levels relate to health outcomes across countries. Before writing any analysis, your team requires all reports to follow a standard document structure: a proper YAML header, numbered sections, and a bibliography. Your task today is to build that structure and populate it with a first piece of real analysis.
Work through the six tasks below. Each task has a check — a short description of what you should see when you have done it correctly.
Using GitHub Classroom
This in-class exercise is available as a GitHub Classroom assignment — not the usual format for in-class work, but a deliberate chance to practise the submission workflow before it counts. Accept the assignment and complete it as you would a take-home task.
Click here: Accept Assignment on GitHub Classroom.
Check out the notes in the README of the task.
Task 1: Project setup
Set up a clean R project with the standard folder structure.
Steps:
- In RStudio, go to File → New Project → New Directory → New Project
- Give it a meaningful name (e.g. my-analysis) and choose a sensible location
- Once the project opens, create three folders inside it:
  - data/: for raw data files
  - figures/: for exported plots
  - output/: for rendered reports
- By default, Git ignores empty directories. If you want to commit them, add an empty placeholder file called .gitkeep inside each folder (see the README for details).
- Create a new Quarto document: File → New File → Quarto Document. Save it as report.qmd in your project root.
If you work in a codespace: No need to create a new directory — the project is already set up. Skip steps 1–2 and start from step 3.
Check: Your Files pane should show:
my-analysis/
├── my-analysis.Rproj
├── report.qmd
├── data/
├── figures/
└── output/
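If you prefer to create the folders from the R console rather than the Files pane, the folder steps above can be sketched as follows. This is a minimal sketch that assumes your working directory is the project root (the default when the project is open in RStudio). The variable name folders is only illustrative.

```r
# Create the three standard folders and a .gitkeep placeholder in each.
# Assumption: the working directory is the project root.
folders <- c("data", "figures", "output")

for (folder in folders) {
  # showWarnings = FALSE keeps this quiet if the folder already exists
  dir.create(folder, showWarnings = FALSE)
  # An empty placeholder file so Git will track the otherwise empty folder
  file.create(file.path(folder, ".gitkeep"))
}

# List the top-level folders to confirm they exist
list.dirs(".", recursive = FALSE)
```

Run this once in the Console. It does not belong in report.qmd.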
Task 2: YAML front matter
Replace the default YAML header in report.qmd with a complete one.
Steps:
Delete whatever Quarto put in the YAML by default and replace it with this — filling in your own title and name:
---
title: "Your Report Title"
author: "Your Name"
date: today
format:
  html:
    toc: true
    number-sections: true
  docx:
    toc: true
    number-sections: true
bibliography: references.bib
---
Check: The document has a YAML block between --- markers at the top, with the options under format indented exactly as shown (YAML uses indentation to define nesting). If you render now, Quarto will warn about a missing references.bib; that is expected and fixed in Task 5.
date: today tells Quarto to use today’s date automatically every time the document renders — no manual updating needed.
Task 3: Document structure
Add meaningful content sections to your report.
Steps:
Below the YAML, create the following four sections using Markdown headings. Fill in the placeholder sentences with your own words — a sentence or two per section is enough for now.
## Introduction
This report examines how economic development relates to health outcomes
across countries. [Add one sentence on why this matters for international
business or policy.]
## Data
The analysis uses data from the `gapminder` package, which provides
information on GDP per capita, life expectancy, and population for
142 countries from 1952 to 2007. [Add one sentence describing what
you will focus on.]
## Analysis
[Your figure will go here — you will add it in Task 4.]
@fig-gapminder shows that ... [describe the pattern in one or two sentences].
## References
Check: When you render to HTML (Ctrl/Cmd + Shift + K), you should see:
- A numbered table of contents beside the document body (Quarto places it on the right by default)
- Four numbered sections in the document body
Task 4: R code chunk
Add a working R code chunk that loads the data and produces a figure.
Steps:
- Insert a code chunk in your Analysis section: Ctrl/Cmd + Alt + I (or use the Insert menu)
- Copy the code below into the chunk — it is ready to run as-is
```{r}
#| label: fig-gapminder
#| fig-cap: "GDP per capita and life expectancy across countries (2007)."
#| warning: false
#| message: false

# Install the gapminder package if you haven't already:
# install.packages("gapminder")
library(tidyverse)
library(gapminder)

gapminder |>
  filter(year == 2007) |>
  ggplot(aes(x = gdpPercap, y = lifeExp, colour = continent, size = pop)) +
  geom_point(alpha = 0.7) +
  scale_x_log10(labels = scales::label_comma()) +
  labs(
    x = "GDP per capita (USD, log scale)",
    y = "Life expectancy (years)",
    colour = "Continent",
    size = "Population"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")
```
- Update the placeholder sentence in your Analysis section to describe what the figure shows. Use a cross-reference so Quarto links the text to the figure automatically:
@fig-gapminder shows that countries with higher GDP per capita tend to
have longer life expectancy. [Add one sentence on any pattern or outlier
you notice — e.g. which continent stands out, or whether the relationship
looks linear.]
@fig-gapminder matches the label: fig-gapminder in your code chunk. Quarto replaces it with “Figure 1” (or the appropriate number) in the rendered output and turns it into a clickable link in HTML.
scale_x_log10() puts the x-axis on a log scale. This is important here because GDP per capita varies enormously across countries — the log scale makes the pattern much clearer. You will learn more about this choice in Session 2.
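To see what the log scale does, here is a small illustration. The three values are hypothetical round numbers chosen for the example, not figures taken from the gapminder data. On the raw scale the gaps between them are very unequal, while on the log10 scale each tenfold increase becomes one equal step.

```r
# Hypothetical GDP per capita values (illustrative only, not from gapminder)
gdp <- c(500, 5000, 50000)

# Gaps between neighbouring values on the raw scale
diff(gdp)

# Gaps on the log10 scale: each tenfold increase is a step of 1
diff(log10(gdp))
```

This is why low-income countries, which would otherwise be squeezed together at the left edge of the plot, spread out on a log axis.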
If you work in a codespace: The gapminder package may already be installed. Try running the chunk first; only run install.packages() if you get an error.
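One way to avoid reinstalling a package that is already present is to check for it first. The sketch below uses base R's requireNamespace(). Run it in the Console rather than inside report.qmd, since installing packages during a render is generally best avoided.

```r
# Install gapminder only if it is not already available
if (!requireNamespace("gapminder", quietly = TRUE)) {
  install.packages("gapminder")
}

# Load the package and confirm the dataset is there
library(gapminder)
head(gapminder)
```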
Check: When you render, a scatter plot appears in the Analysis section with a caption. The plot shows countries as points, coloured by continent.
Task 5: Bibliography
Add citations to your report using BibTeX.
Steps:
- The file references.bib is already in your repository root, so there is no need to copy anything. If you set up a project folder outside the Classroom repo, copy it in now.
- In your Introduction, add a citation to Preston (1975), who first documented the income–life expectancy relationship systematically:
The relationship between income and life expectancy was first
systematically documented by @Preston1975.
- In your Data section, add a citation to the course textbook for context on working with cross-country data:
For a discussion of cross-country data analysis, see @BekezKezdi2021.
- Make sure the ## References heading is at the end of your document. Quarto fills in the reference list automatically.
- Render to HTML and confirm the reference list appears
Check: The rendered HTML shows in-text citations (e.g. “Preston (1975)” with the default citation style) and a formatted reference list under the References heading.
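If a citation renders as a bare key or with question marks, the key in your text probably does not match an entry in references.bib. The sketch below is one rough way to check. The helper bib_has_keys() is hypothetical (written for this example, not part of any package), and it assumes each entry starts in the usual BibTeX form, such as @article{Preston1975, with the key directly followed by a comma.

```r
# Hypothetical helper: does each citation key have an entry in the .bib file?
# Assumption: entries look like "@type{Key," with no space before the comma.
bib_has_keys <- function(path, keys) {
  bib <- readLines(path, warn = FALSE)
  vapply(
    keys,
    function(key) any(grepl(paste0("{", key, ","), bib, fixed = TRUE)),
    logical(1)
  )
}

# Check the two keys used in this task, if the file is in the working directory
if (file.exists("references.bib")) {
  print(bib_has_keys("references.bib", c("Preston1975", "BekezKezdi2021")))
}
```

Each key that has a matching entry shows TRUE. Keys are case-sensitive, so a FALSE often points to a capitalisation or spelling difference.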
Zotero integration (optional): Zotero can generate .bib entries automatically and sync with Quarto. This is covered in an optional tutorial after the session — not required today.
Task 6: Render to multiple formats
Confirm your document renders cleanly to HTML, Word, and PDF.
Steps:
- Render to HTML: Ctrl/Cmd + Shift + K (or click the Render button). The output appears in the Viewer pane or opens in your browser.
- Render to Word: in the Render button dropdown, select docx (or run quarto render report.qmd --to docx in the Terminal).
- Render to PDF: in the Render button dropdown, select pdf (or run quarto render report.qmd --to pdf in the Terminal).
Rendering to PDF requires LaTeX. If you get an error, run tinytex::install_tinytex() in the R console and try again. If that is not an option today, skip PDF — HTML and Word are sufficient.
Check: Your project contains:
- report.html: opens in a browser; shows TOC, numbered sections, figure, references
- report.docx: opens in Word with the same structure
- report.pdf: opens in a PDF viewer with the same structure
The .docx output may look slightly different from the HTML — that is normal. Word styling is polished in Session 8.
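The rendering steps can also be run from the R console. This sketch assumes the quarto R package is installed (install.packages("quarto")) and that report.qmd is in the working directory. It is an alternative to the Render button and the Terminal commands, not a replacement for them.

```r
# Assumption: the quarto R package is installed and report.qmd is in the
# working directory (the project root).
library(quarto)

# Render to HTML and Word
quarto_render("report.qmd", output_format = "html")
quarto_render("report.qmd", output_format = "docx")

# PDF needs LaTeX; uncomment once TinyTeX is installed
# quarto_render("report.qmd", output_format = "pdf")

# Confirm the output files were written next to report.qmd
file.exists(c("report.html", "report.docx"))
```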
Finished early?
- Add a second chunk that filters to a different year (e.g. 1952) and compare the pattern to 2007: has the relationship changed?
- Try facet_wrap(~ continent) to split the plot by continent
- Explore YAML options: what does code-fold: true do? What about theme: cosmo?
- Read the Quarto documentation on HTML documents