R Example

DataFest! Spring 2025

Learn how to create data science reports using R with Quarto!

Author

Pete Benbow

Published

April 4, 2025

Setup

You know the drill! When working with R, we need to start by loading in whatever packages are required for the work we’re doing.

In this case, we need ggplot2 for data visualization (Wickham 2016), dplyr for data transformation (Wickham et al. 2023), and WDI for retrieving our data (Arel-Bundock 2022).

# Load the required packages
library(ggplot2)
library(dplyr)
library(WDI)
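
If library() fails with a "there is no package called" error, the package has not been installed yet. As a minimal sketch (the variable names here are our own, not part of the exercise), you could check which of the three packages are available before loading them:

```r
# Packages used in this document
required <- c("ggplot2", "dplyr", "WDI")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
    # Report what still needs install.packages(); nothing is installed automatically
    message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
} else {
    message("All required packages are available.")
}
```

Anything reported as missing can then be installed once with install.packages().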

Exercises

Now that we’ve loaded our packages, let’s run through a few progressive exercises:

  1. Load the data
  2. Transform
  3. Visualize as chart
  4. Visualize as table

Ex 1: Get data

Let’s start by using the WDI package to call the World Bank’s public World Development Indicators API. We’ll load in data for all countries for the year 2023, and we’ll use the indicator NY.GDP.PCAP.KD, which represents GDP per capita in constant 2015 US dollars. (The current-dollar version of this measure is a separate indicator, NY.GDP.PCAP.CD.)

Store the output as a new data frame named wb.

wb <- WDI(
    country   = "all",
    indicator = "NY.GDP.PCAP.KD",
    start     = 2023,
    end       = 2023,
    extra     = TRUE
) 
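
Before relying on an indicator code, it can help to look up its name. The sketch below uses WDIsearch() from the WDI package to search the series list that ships with the package. It assumes that your installed version's cached list contains these two series, and the regular expression is our own illustration:

```r
library(WDI)

# Search the indicator codes (not the names) in the package's cached series list
hits <- WDIsearch(
    string = "^NY\\.GDP\\.PCAP\\.(KD|CD)$",
    field  = "indicator"
)

# Each match lists the indicator code alongside its descriptive name
hits
```

In World Bank indicator codes, the KD suffix marks constant-price series and the CD suffix marks current-price series, so the two are easy to mix up.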

Ex 2: Transform

In this step, we will:

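  1. Rename our GDP per capita measure to a friendlier name.
  2. Filter out any aggregate records, so that regional and income-group totals are not ranked alongside countries.
  3. Rank the remaining countries by GDP per capita in descending order.
  4. Filter to only the top 10 countries by their rank.

Store the output as a new data frame named wb_top10.

wb_top10 <- wb |>
    # Give the indicator column a friendlier name
    rename(gdp_percap = NY.GDP.PCAP.KD) |>
    # Drop regional and income-group aggregates before ranking
    filter(region != "Aggregates") |>
    # Rank countries from highest to lowest GDP per capita
    mutate(rank = rank(desc(gdp_percap))) |>
    # Keep only the ten highest-ranked countries
    filter(rank <= 10)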
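
The order of the ranking and filtering steps matters, because rank() numbers every row it is given, aggregates included. Here is a small sketch with invented values (the toy data frame is hypothetical, not World Bank data) that compares the two orderings:

```r
library(dplyr)

# Hypothetical data: one aggregate row sits between real countries
toy <- data.frame(
    country    = c("Agg", "A", "B", "C"),
    region     = c("Aggregates", "X", "X", "Y"),
    gdp_percap = c(90, 100, 80, 70)
)

# Rank first, filter second: the aggregate uses up a rank before it is dropped
rank_then_filter <- toy |>
    mutate(rank = rank(desc(gdp_percap))) |>
    filter(region != "Aggregates")

# Filter first, rank second: only real countries are ranked
filter_then_rank <- toy |>
    filter(region != "Aggregates") |>
    mutate(rank = rank(desc(gdp_percap)))

rank_then_filter$rank  # 1 3 4
filter_then_rank$rank  # 1 2 3
```

With ranking done first, a later filter on rank <= 10 can return fewer than ten countries whenever an aggregate lands inside the top ten.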

Ex 3: Visualize as chart

Now let’s plot the data with ggplot2! Don’t forget to add descriptive labels, and reorder your X axis so the countries are presented in descending order by their GDP per capita.

wb_top10 |>
    ggplot() +
    geom_col(
        aes(
            # Order the bars by GDP per capita rather than alphabetically
            x = reorder(country, gdp_percap),
            y = gdp_percap
        ),
        fill = "#d42121"
    ) +
    labs(
        title = "Top 10 countries by GDP per capita",
        x     = "Country",
        y     = "GDP per capita (constant 2015 US dollars)"
    ) +
    # Flip the axes so the country names read horizontally
    coord_flip() +
    theme_minimal()
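
To see what reorder() does on its own, here is a minimal base R sketch with made-up values (the vectors are hypothetical). It turns the labels into a factor whose levels are sorted by the second argument:

```r
# Hypothetical labels and values
countries  <- c("A", "B", "C")
gdp_percap <- c(20, 30, 10)

# reorder() returns a factor; its levels are sorted by gdp_percap, ascending
by_gdp <- reorder(countries, gdp_percap)

levels(by_gdp)  # "C" "A" "B"
```

ggplot2 draws a discrete axis in level order. After coord_flip() the first level sits at the bottom, so the country with the highest GDP per capita ends up at the top of the chart.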

Ex 4: Visualize as table

wb_top10 |>
    select(
        rank,
        country,
        gdp_percap,
        region,
        income,
        lending
    ) |> 
    arrange(rank) |> 
    knitr::kable()

| rank | country         | gdp_percap | region                    | income      | lending        |
|-----:|:----------------|-----------:|:--------------------------|:------------|:---------------|
|    1 | Monaco          |  224582.45 | Europe & Central Asia     | High income | Not classified |
|    2 | Bermuda         |  110409.81 | North America             | High income | Not classified |
|    3 | Luxembourg      |  106342.76 | Europe & Central Asia     | High income | Not classified |
|    4 | Ireland         |   91647.77 | Europe & Central Asia     | High income | Not classified |
|    5 | Switzerland     |   89555.56 | Europe & Central Asia     | High income | Not classified |
|    6 | Cayman Islands  |   81411.65 | Latin America & Caribbean | High income | Not classified |
|    7 | Norway          |   78912.33 | Europe & Central Asia     | High income | Not classified |
|    8 | Channel Islands |   69822.95 | Europe & Central Asia     | High income | Not classified |
|    9 | United States   |   65875.18 | North America             | High income | Not classified |
|   10 | Singapore       |   65422.46 | East Asia & Pacific       | High income | Not classified |
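
The table is easier to read with rounded values and descriptive column headers. As a sketch of one way to do this, the example below uses kable()'s digits, col.names, and format.args arguments on two rows copied from the table above. The header text and the choice to round to whole dollars are our own assumptions:

```r
library(knitr)

# Two rows copied from the table above, for illustration only
top2 <- data.frame(
    rank       = c(1, 2),
    country    = c("Monaco", "Bermuda"),
    gdp_percap = c(224582.45, 110409.81)
)

tbl <- kable(
    top2,
    format      = "pipe",                # plain-text pipe table
    digits      = 0,                     # round to whole dollars
    col.names   = c("Rank", "Country", "GDP per capita (constant 2015 US$)"),
    format.args = list(big.mark = ",")   # thousands separator
)

tbl
```

In a Quarto document you would normally leave out format and let knitr choose the output format for the rendered report.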

References

Arel-Bundock, Vincent. 2022. WDI: World Development Indicators and Other World Bank Data. https://CRAN.R-project.org/package=WDI.
Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.