@kevinschaul
Last active August 18, 2022 20:46
R stats cheat sheet

R stats/tidyverse cheat sheet

Because I always forget how to do the same things, over and over again

See also: Pandas cheat sheet

Helpful links:

Working with JSON

library(jsonlite)

Read a JSON file

data <- fromJSON('path/to/file.json')

Nested JSON data

Sometimes you want to flatten:

data <- fromJSON('path/to/file.json', flatten=TRUE)

Sometimes you want to unnest:

data <- fromJSON('path/to/file.json') %>%
  unnest(colName)

If you want to unnest but the column to unnest by is sometimes null:

data <- fromJSON('path/to/file.json') %>%
  filter(!map_lgl(colName, is.null)) %>%
  unnest(colName)

Source
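To see what the null filter does, here is a minimal sketch on a made-up tibble (the `id` column and the values are assumptions for illustration, standing in for parsed JSON). Recent tidyr versions appear to drop empty elements in `unnest()` by default (`keep_empty = FALSE`), so the filter may be redundant there, but it is harmless and explicit.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical data: a list-column where one entry is NULL
data <- tibble(
  id = 1:3,
  colName = list(c("a", "b"), NULL, "c")
)

result <- data %>%
  filter(!map_lgl(colName, is.null)) %>%  # drop rows whose list entry is NULL
  unnest(colName)                         # one output row per remaining element

result
```

The row with `id` 2 disappears, and `id` 1 is repeated once per element of its list entry.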

Write a JSON file

write(toJSON(data, pretty=TRUE), 'path/to/file.json')

Data transformation

Add rows to a dataframe

all <- bind_rows(some_data, more_data)

Extract part of a string with regex

library(stringr)
str_match("MY TEXT to match", "([A-Z]+) to match")[, 2]
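`str_match()` returns a character matrix: column 1 is the full match and each later column is a capture group. A small sketch with a made-up second string shows what happens when nothing matches:

```r
library(stringr)

# Two inputs: one that matches the pattern and one that does not
m <- str_match(
  c("MY TEXT to match", "no capture here"),
  "([A-Z]+) to match"
)

full_match  <- m[, 1]  # whole matched substring, NA where there is no match
first_group <- m[, 2]  # first capture group, NA where there is no match
first_group
```

Indexing by column (`[, 2]`) keeps working when the input has more than one string.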

Extract parts of a string into new columns with regex (tidyr's extract)

data %>%
  extract(col, c('area_code', 'rest_of_number'), '([0-9]{3})-([0-9]{3}-[0-9]{4})')

Mutate using a custom, non-vectorized function

data %>%
  mutate(
    newCol = mapply(function(x) customFunction(x), oldCol)
  )
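As a concrete sketch of the pattern above: `label_size` below is a hypothetical stand-in for `customFunction`. It uses a scalar `if`, so it cannot be called on a whole column directly. `vapply()` is shown as an alternative that also checks the return type.

```r
library(dplyr)

# Hypothetical non-vectorized function: `if` only handles a single value
label_size <- function(x) {
  if (x > 10) "big" else "small"
}

data <- tibble(oldCol = c(5, 20))

result <- data %>%
  mutate(
    newCol  = mapply(label_size, oldCol),                 # call once per element
    newCol2 = vapply(oldCol, label_size, character(1))    # same, with a type check
  )

result
```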

Reference a variable with a space inside dplyr

data %>%
  mutate(
    newCol = `old col`
  )

Select a single column, turning it into a vector

data %>%
  pull(col)

Working with dates

Calculate age based on date of birth

data %>%
  mutate(age=time_length(interval(ymd(birthday), today()), 'years'))
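As a check on age calculations, this sketch compares dividing a day count by 365 with lubridate's `time_length()` on an interval. The dates are made up: the person is one day short of their 30th birthday.

```r
library(lubridate)

birthday <- ymd("1990-06-15")
as_of    <- ymd("2020-06-14")  # one day before the 30th birthday

# Day count divided by 365 ignores leap days, so it drifts upward over decades
naive <- as.numeric(as_of - birthday, units = "days") / 365

# Interval-based length counts whole calendar years plus a fraction
exact <- time_length(interval(birthday, as_of), "years")

floor(naive)
floor(exact)
```

The day-count version already rounds down to 30 here, while the interval version still gives 29.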

Create a range of dates

seq(as.Date('1950-01-20'), as.Date('2019-01-20'), 'years')

Loop through a range of dates

days <- seq(as.Date('2019-03-01'), as.Date('2019-04-01'), 'days')
for (i in seq_along(days)) {
  print(days[i])
}
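The loop indexes with `seq_along()` for a reason: iterating over a Date vector directly strips the Date class and yields plain numbers (days since 1970-01-01). A short base-R sketch with a made-up three-day range:

```r
days <- seq(as.Date("2019-03-01"), as.Date("2019-03-03"), by = "day")

# Iterating directly: each `d` is a bare number, not a Date
direct <- list()
for (d in days) {
  direct[[length(direct) + 1]] <- d
}

# Iterating by index: each `days[i]` is still a Date
indexed <- list()
for (i in seq_along(days)) {
  indexed[[i]] <- days[i]
}

inherits(direct[[1]], "Date")
inherits(indexed[[1]], "Date")
```

If a direct loop is unavoidable, wrapping the value in `as.Date(d, origin = "1970-01-01")` restores the class.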

ggplot tricks

Use Post colors for parties

... + scale_fill_manual(values=c('D'='#3579a8', 'R'='#dc5147'))

Manually adjust x scale

scale_x_continuous(limits = c(-1, 1))

Parse Cook ratings as factor

data %>%
  mutate(
    cookRating = parse_factor(cookRating, c('SOLID D', 'LIKELY D', 'LEAN D', 'TOSSUP D', 'TOSSUP R', 'LEAN R', 'LIKELY R', 'SOLID R'))
  )

Add Cook political rating colors

scale_fill_manual(values=c('SOLID D'='#0082c6', 'LIKELY D'='#67b8e6', 'LEAN D'='#b5d7f2', 'TOSSUP D'='#c9a9c7', 'TOSSUP R'='#c9a9c7', 'LEAN R'='#fabfbf', 'LIKELY R'='#f48587', 'SOLID R'='#ed1c24'))

U.S. cartograms with geofacet

library(geofacet)
facet_geo(~state, grid="us_state_grid2")

Maps

U.S. map using maps and mapdata packages

library(maps)
library(mapdata)

states_geo <- map_data('state')
ggplot(states_geo) +
  geom_polygon(aes(x=long, y=lat, fill=region, group=group), color="white") +
  coord_fixed(1.3)