Configuring Jekyll, R Markdown and GitLab Pages

I’ve been experimenting with pushing R Markdown documents to my Jekyll GitLab Pages site. Steven Miller has written a great blog post about this, from which I’ve grabbed some YAML and R code.

The key is to write your post as an .Rmd saved in a folder other than _posts (I’ve named mine _source). You can then knit an md_document into the Jekyll _posts directory via the following snippet in the YAML header. Just remember to set preserve_yaml: TRUE, so that the front matter Jekyll needs is carried over into the Markdown output:

output:
  md_document:
    preserve_yaml: TRUE
knit: (function(inputFile, encoding) {
  rmarkdown::render(inputFile, encoding = encoding, output_dir = "../_posts") })

If you also want the R Markdown figure assets saved in a different folder (and you should!), you can do that in the setup chunk, like so. My fig_path is an assets folder that I’ll need to restructure at some point. I typically set echo = TRUE in R Markdown, but like to suppress messages and warnings once I’ve run each chunk to verify that it works correctly.

# fig_path must already be defined earlier in the setup chunk
knitr::opts_chunk$set(fig.path = fig_path,
                      cache.path = '../cache/',
                      echo = TRUE, message = FALSE, warning = FALSE, cache = TRUE)
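
Since fig_path isn’t defined in the snippet above, here is one possible arrangement. It is only a sketch: the folder name assets/figures/, the base URL, and the assumption that the .Rmd is knit from inside _source are all illustrative rather than my actual layout. base.dir and base.url are standard knitr package options; fig.path is then interpreted relative to base.dir.

```r
# Sketch only: every path below is an assumption, not the author's real layout.
base_dir <- normalizePath("..")  # site root, assuming the .Rmd is knit from _source/
base_url <- "/"                  # assumed: site served from the domain root
fig_path <- "assets/figures/"    # hypothetical folder; keep the trailing slash,
                                 # because knitr treats fig.path as a filename prefix

# Write figures under base_dir and prefix image links with base_url
knitr::opts_knit$set(base.dir = base_dir, base.url = base_url)
```

A GitLab Pages project site is usually served from a sub-path rather than the domain root, so base_url would likely need adjusting in that case.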

I haven’t yet automated the production of a Markdown file from the source .Rmd file with GitLab CI/CD, which means I have to knit the .Rmd myself on my local machine. That doesn’t take long, though.
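
Short of full CI/CD, the manual step could be made less error-prone with a small helper that finds posts whose Markdown output is missing or out of date. This is a rough sketch rather than something I actually run: stale_sources() is a hypothetical function, and it assumes the _source and _posts folder names used above and that it is called from the site root.

```r
# Hypothetical helper: return the .Rmd files whose .md output is missing or older.
stale_sources <- function(source_dir = "_source", posts_dir = "_posts") {
  rmds <- list.files(source_dir, pattern = "\\.Rmd$", full.names = TRUE)
  # Expected output file for each source file, e.g. _posts/2020-08-01-post.md
  mds <- file.path(posts_dir, sub("\\.Rmd$", ".md", basename(rmds)))
  # A missing output has an NA modification time, so check existence first
  stale <- !file.exists(mds) | file.mtime(rmds) > file.mtime(mds)
  rmds[stale]
}

# Re-knit only what changed (needs rmarkdown and pandoc installed)
for (rmd in stale_sources()) {
  rmarkdown::render(rmd, output_dir = normalizePath("_posts"))
}
```

The output format and preserve_yaml setting are still read from each post’s YAML header, so the loop only has to say where the result goes.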

To demonstrate the approach, here’s a graph depicting Apple Maps mobility data, which I’ve sourced from Kieran Healy’s excellent covdata package.

# remotes::install_github("kjhealy/covdata", force = TRUE)
library(covdata)
library(tidyverse)

# Most recent score per transportation type, used as breaks on the right-hand axis
apple_mobility_ends <- apple_mobility %>% 
  filter(region == "Ottawa") %>% 
  group_by(transportation_type) %>% 
  filter(date == max(date)) %>% 
  pull(score) %>% 
  round(1)

apple_mobility %>% 
  filter(region == "Ottawa") %>% 
  ggplot() +
  geom_line(aes(date, score,
                group = transportation_type, colour = transportation_type),
            linewidth = 1) +  # use size = 1 on ggplot2 versions before 3.4.0
  scale_x_date(date_breaks = "1 month", date_labels = "%B %e",
               expand = c(0, 0)) +
  scale_y_continuous(sec.axis = sec_axis(~., breaks = apple_mobility_ends)) +
  scale_colour_brewer(name = "Transportation type", palette = "Set2") +
  cowplot::theme_minimal_grid(14) +
  theme(legend.position = "bottom") +
  labs(x = "Date",
       y = "Relative index",
       title = "How have Ottawa's transportation levels changed in 2020?",
       caption = "Source: Apple Maps mobility data")

Of the three transportation modes captured in Apple’s data, transit usage has declined the most and has been the slowest to recover. More interesting to me is that the recovery plateaued in early July at less than 40 percent of typical pre-pandemic ridership (or rather, Ottawa ridership by Apple users). This is especially clear when we look at a simple STL decomposition of the transit time series. There were two days in May with missing data; originally I carried the previous non-null index forward with tidyr::fill, which would have messed with the seasonality a bit. A more sophisticated method uses Rob Hyndman’s na.interp function from the forecast package:

library(forecast)
library(fable)
library(feasts)
library(tsibble)

apple_mobility %>%
  filter(region == "Ottawa") %>%
  filter(transportation_type == "transit") %>%
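  mutate(score = forecast::na.interp(score)) %>%  # fill the missing days by interpolation
  as_tsibble(index = date) %>%                    # declare the time index explicitly
  model(STL(score ~ season(period = 7))) %>%      # weekly seasonal component
  components() %>% 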
  autoplot() +
  scale_x_date(date_breaks = "1 month", date_labels = "%B %e") +
  cowplot::theme_minimal_grid(14) +
  labs(x = "Date",
       y = "Relative index",
       caption = "Source: Apple Maps mobility data")
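
To make the difference between the two gap-filling approaches concrete, here is a toy comparison on made-up numbers (not the Apple data). As far as I can tell from the forecast source, na.interp falls back to linear interpolation when it is given a plain numeric vector with no seasonal frequency, which is the situation in the pipeline above.

```r
library(tidyr)
library(forecast)

# Made-up series with a two-day gap, purely for illustration
toy <- data.frame(day = 1:6, score = c(100, 90, NA, NA, 60, 50))

# Last observation carried forward: the gap becomes a flat step
filled <- fill(toy, score)$score
filled
# 100 90 90 90 60 50

# Linear interpolation: the gap follows the trend between its neighbours
interp <- as.numeric(na.interp(toy$score))
interp
# 100 90 80 70 60 50
```

The flat step from fill() distorts the day-to-day pattern that the weekly seasonal term has to fit, which is why interpolation is the gentler choice here.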

This seems like an ideal jumping-off point for a future blog post. Stay tuned.