From Kieran Healy’s blog post. My R Markdown file for this example is here.
Mapping data (and, more importantly, converting map data into a format that we can tidily use) requires a bit more heavy lifting behind the scenes than the core tidyverse libraries provide. We start by (if necessary) loading the sf library, which will also bring a number of dependencies with it. We’ll also load a few packages that provide some functions we might use in the conversion and mapping process.
library(geojsonio)
library(rmapshaper)
library(rgdal)
library(tidyverse)
library(spdplyr)
library(sf)
We make a function, theme_map(), that will be our ggplot theme. It consists mostly in turning off pieces of the plot (like axis text and so on) that we don’t need.
theme_map <- function(base_size=9, base_family="") {
require(grid)
theme_bw(base_size=base_size, base_family=base_family) %+replace%
theme(axis.line=element_blank(),
axis.text=element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank(),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid=element_blank(),
panel.spacing=unit(0, "lines"),
plot.background=element_blank(),
legend.justification = c(0,0),
legend.position = c(0,0)
)
}
theme_set(theme_map())
Next we need to get the actual map data. This is often the hardest bit. In the case of Canada, their central statistics agency provides map shape files that we can use. The Shape File format (.shp) is the most widely-used standard for maps. We’ll need to grab the files we want and then convert them to a format the tidyverse can use. You won’t be able to run the code in the next few chunks unless the files are downloaded where the code expects them to be.
So, get the data from Statistics Canada. We’re going to use this link: http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/bound-limit-2011-eng.cfm. From the linked page, choose as your options ArcGIS .shp file, then—for example—Census divisions and cartographic boundaries. You’ll then download a zip file. Expand this zip file into the directory in your working folder named ‘data’. Then we can import the shapefile using one of rgdal’s workhorse functions, readOGR.
canada_raw <- readOGR(dsn = "data/gcd_000b11a_e",
layer = "gcd_000b11a_e",
use_iconv=TRUE,
encoding="CP1250")
## OGR data source with driver: ESRI Shapefile
## Source: "/Users/judgelord/811/data/gcd_000b11a_e", layer: "gcd_000b11a_e"
## with 293 features
## It has 5 fields
Note the options here. This information is contained in the documentation for the shape files. Now we’re going to convert this object to GeoJSON format and simplify the polygons. These steps will take a little while to run:
canada_raw_json <- geojson_json(canada_raw)
canada_raw_sim <- ms_simplify(canada_raw_json)
Then we save the resulting GeoJSON file, which we can now work with directly from here on out:
geojson_write(canada_raw_sim, file = here("data/canada_cd_sim.geojson"))
## <geojson-file>
## Path: /Users/judgelord/811/data/canada_cd_sim.geojson
## From class: json
We only need to do the importing, converting, and thinning once. If you already have a GeoJSON file, you can just start here.
Read the GeoJSON file back in as an sf object. The .geojson file is included in the repository, so you can load the libraries listed above and then start from here if you want.
canada_cd <- st_read(here("data/canada_cd_sim.geojson"), quiet = TRUE)
canada_cd %>%
ggplot() +
geom_sf(mapping = aes(fill = PRNAME)) +
guides(fill = FALSE)
head(canada_cd)
## Simple feature collection with 6 features and 6 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -124.855 ymin: 48.5212 xmax: -97.3315 ymax: 60
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## CDUID CDNAME CDTYPE PRUID
## 1 4609 Division No. 9 CDR 46
## 2 5901 East Kootenay RD 59
## 3 5933 Thompson-Nicola RD 59
## 4 4816 Division No. 16 CDR 48
## 5 5919 Cowichan Valley RD 59
## 6 4621 Division No. 21 CDR 46
## PRNAME rmapshaperid
## 1 Manitoba 0
## 2 British Columbia / Colombie-Britannique 1
## 3 British Columbia / Colombie-Britannique 2
## 4 Alberta 3
## 5 British Columbia / Colombie-Britannique 4
## 6 Manitoba 5
## geometry
## 1 MULTIPOLYGON (((-97.9474 50...
## 2 MULTIPOLYGON (((-114.573 49...
## 3 MULTIPOLYGON (((-120.1425 5...
## 4 MULTIPOLYGON (((-110 60, -1...
## 5 MULTIPOLYGON (((-123.658 48...
## 6 MULTIPOLYGON (((-99.0172 55...
Notice the proj4string there in the metadata. Let us change that to Canada’s favorite projection, the Lambert Conformal Conic projection.
canada_cd %<>% st_transform(crs = "+proj=lcc +lat_1=49 +lat_2=77 +lon_0=-91.52 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs")
head(canada_cd)
## Simple feature collection with 6 features and 6 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -2347179 ymin: 6675921 xmax: -384292.8 ymax: 8032994
## epsg (SRID): NA
## proj4string: +proj=lcc +lat_1=49 +lat_2=77 +lat_0=0 +lon_0=-91.52 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
## CDUID CDNAME CDTYPE PRUID
## 1 4609 Division No. 9 CDR 46
## 2 5901 East Kootenay RD 59
## 3 5933 Thompson-Nicola RD 59
## 4 4816 Division No. 16 CDR 48
## 5 5919 Cowichan Valley RD 59
## 6 4621 Division No. 21 CDR 46
## PRNAME rmapshaperid
## 1 Manitoba 0
## 2 British Columbia / Colombie-Britannique 1
## 3 British Columbia / Colombie-Britannique 2
## 4 Alberta 3
## 5 British Columbia / Colombie-Britannique 4
## 6 Manitoba 5
## geometry
## 1 MULTIPOLYGON (((-457449.9 6...
## 2 MULTIPOLYGON (((-1628202 69...
## 3 MULTIPOLYGON (((-1838098 74...
## 4 MULTIPOLYGON (((-988280.5 7...
## 5 MULTIPOLYGON (((-2253700 71...
## 6 MULTIPOLYGON (((-461578.4 7...
Finally, and just because I don’t have any census-district-level data at the moment, we make a vector of repeated colors to fill in the map, for decoration only, if you want to color all the census divisions.
map_colors <- RColorBrewer::brewer.pal(8, "Pastel1")
map_colors <- rep(map_colors, 37)
Instead, we’ll just map the fill to PRUID, i.e. the province level. But try mapping fill to CDUID instead (the census district ID), and see what happens.
canada_cd %>%
ggplot(mapping = aes(fill = PRUID)) + # provence-level unit ID
geom_sf( size = 0.1) +
scale_fill_manual(values = map_colors) +
guides(fill = FALSE) +
theme_map() +
theme(panel.grid.major = element_line(color = "white"),
legend.key = element_rect(color = "gray40", size = 0.1))