IAFF 6501 – Merging and Summarizing Data

Choropleth Maps

Choropleth maps are shaded maps that show how a variable varies across geographic space.
Now that you have a handle on how to merge data, you should be able to make one!

Choropleth Map

The `rnaturalearth` package

`rnaturalearth` is a package that provides access to shapefiles for countries, states, and provinces
Uses the Natural Earth dataset, which shares its name with the “natural earth” projection
Contrasts with the Mercator projection used by Google Maps, etc.
Also uses simple features (sf) data frames
- A new way of storing spatial data in R
- Allows for easy storage, manipulation and plotting
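
To make the idea of an sf data frame concrete, here is a minimal sketch that builds one by hand with the `sf` package. The two point locations and their labels are made up for illustration and are not part of the course data.

```r
library(sf)

# Two hypothetical point locations (longitude, latitude), made up for illustration
places_sf <- st_sf(
  place = c("Place A", "Place B"),
  geometry = st_sfc(
    st_point(c(-77.0, 38.9)),
    st_point(c(2.35, 48.85)),
    crs = 4326 # WGS 84 longitude/latitude
  )
)

class(places_sf)       # an sf object is also a regular data.frame
st_geometry(places_sf) # the geometry column holds the shapes themselves
```

The point is that the spatial information lives in an ordinary column, so the usual dplyr verbs and joins still apply to the rest of the table.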

Mercator Projection

Source: Wikipedia

Natural Earth Projection

Source: Wikipedia
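
The two projections can be compared directly in R. The sketch below assumes that `ne_countries()` returns shapes in longitude/latitude coordinates and that the projection is chosen at plot time with `coord_sf()`; the PROJ strings `"+proj=merc"` and `"+proj=natearth"` and the object names are choices made for this illustration, not something taken from the slides.

```r
library(rnaturalearth)
library(dplyr)
library(ggplot2)

# Country shapes, dropping Antarctica (Mercator stretches polar areas badly)
world_sf <- ne_countries(scale = "medium", returnclass = "sf") |>
  filter(name != "Antarctica")

# Same data, two projections chosen at plot time via PROJ strings
mercator_map <- ggplot(world_sf) +
  geom_sf() +
  coord_sf(crs = "+proj=merc") +
  labs(title = "Mercator")

natural_earth_map <- ggplot(world_sf) +
  geom_sf() +
  coord_sf(crs = "+proj=natearth") +
  labs(title = "Natural Earth")

mercator_map
natural_earth_map
```

Comparing the size of Greenland in the two plots is a quick way to see how much Mercator inflates areas at high latitudes.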

Simple Features

Map Code

Grab country shapes with ne_countries()

library(rnaturalearth)
library(dplyr)

world_map_df <- ne_countries(scale = "medium", returnclass = "sf") |>
    filter(name != "Antarctica") # remove Antarctica

#world_map_df |>
#glimpse()

# view contents of geometry column
world_map_df |>
  select(geometry)

Basic Choropleth Map

Make a map using geom_sf() from ggplot2.

library(ggplot2)

ggplot(data = world_map_df) +
  geom_sf(aes(fill = income_grp)) + 
  labs(title = "World Bank country income categories")

That gives us…

Beautiful Map

Change the legend label with fill = in labs(), add the viridis color scheme with scale_fill_viridis_d(), and change the theme with theme_map() from ggthemes.

library(ggthemes)

ggplot(data = world_map_df) +
  geom_sf(aes(fill = income_grp)) + 
  labs(
    title = "World Bank country income categories",
    fill = "Category"
    ) +
    scale_fill_viridis_d() +
    theme_map()

And now we have…

Your Turn!

Make a map of WB income categories
Grab country shapes and store data in an object
Use geom_sf() to make the map
Style the map with labs() and scale_fill_viridis_d()
Try mapping a different variable

05:00

Map Other Data

Grab data from the WB, join with country shapes…

# Load wbstats
library(wbstats)

# Grab oil rents data
oil_rents_df <- wb_data(c(oil_rents_gdp = "NY.GDP.PETR.RT.ZS"), mrnev = 1) 

# Join with country shapes
rents_map_df <- left_join(world_map_df, oil_rents_df, join_by(iso_a3 == iso3c))

# Have a look at the simple features (geometry) column
rents_map_df |>
  select(last_col(5):last_col()) |> # select last 6 columns of df (last_col(5) is 6th from the end)
  glimpse()
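
After a left join, rows in the shapes table with no matching country code get `NA` for the joined variable and are drawn in the default grey on the map. A quick way to find them is `anti_join()`. The sketch below uses two small made-up tables (the country names, codes, and values are placeholders) so that it runs without downloading anything; the same pattern should carry over to `world_map_df` and `oil_rents_df`.

```r
library(dplyr)

# Hypothetical stand-ins for the shapes table and the World Bank table
shapes_df <- tibble(
  name   = c("Country A", "Country B", "Country C"),
  iso_a3 = c("AAA", "BBB", "-99") # "-99" stands in for a missing code
)

values_df <- tibble(
  iso3c = c("AAA", "BBB", "CCC"),
  value = c(1, 2, 3) # placeholder numbers, not real data
)

# Rows of shapes_df that find no partner in values_df
unmatched_df <- anti_join(shapes_df, values_df, join_by(iso_a3 == iso3c))
unmatched_df
```

If countries you expected to see show up here, the code column on one side probably needs cleaning before the join.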

Map Other Data

ggplot(data = rents_map_df) +
  geom_sf(aes(fill = oil_rents_gdp)) + # shade based on oil rents
  labs(
    title = "Oil rents (% of GDP)",
    subtitle = "(Most recent available data)", # add subtitle
    fill = "Percent", 
    caption = "Source: World Bank Development Indicators"
    ) +
  theme_map() +
  theme(
    legend.position = "right", # move legend
    plot.title = element_text(face = "bold") # make title bold
    ) +
  scale_fill_viridis_c( # change from discrete (_d) to continuous (_c)
      option = "magma", # change to magma palette
      labels = scales::label_percent(scale = 1) # add % label for legend
      )

Your Turn!

Try mapping a favorite variable from the World Bank
First, download the relevant data using wbstats
Then merge it with your country shapes
Map using geom_sf()
Beautify your map!

05:00

Map Some V-Dem Data

Now try mapping some V-Dem data
Remind yourself of how to download data from V-Dem
You will have to convert country codes to iso3c
Then merge with country shapes
Then map your V-Dem indicator!
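
As a reminder of the conversion step, here is a minimal sketch using `countrycode()`. It assumes your V-Dem extract identifies countries by Correlates of War numeric codes in a column named `COWcode`; the three-row table is a hand-made stand-in, not downloaded V-Dem data.

```r
library(dplyr)
library(countrycode)

# Hypothetical stand-in for a V-Dem extract (only the identifier columns)
vdem_codes_df <- tibble(
  country_name = c("United States of America", "United Kingdom", "France"),
  COWcode      = c(2, 200, 220)
)

# Convert Correlates of War numeric codes to ISO 3-letter codes
vdem_codes_df <- vdem_codes_df |>
  mutate(iso3c = countrycode(COWcode, origin = "cown", destination = "iso3c"))

vdem_codes_df
```

With an `iso3c` column in place, the join with the country shapes follows the same `left_join()` pattern used for the oil rents data.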

05:00
