Hi, all! This week’s Tidy Tuesday post is about nuclear explosions. Our data come from the Stockholm International Peace Research Institute.
One of my goals for this week is to explore some of ggplot’s advanced features, most of which are discussed in “Graphics for Communication” from R for Data Science.
So, let’s try to create a visually appealing and informative plot!
Data Wrangling
Let’s read in the data and get started.
library(readr)
library(dplyr)
library(ggplot2)
library(forcats)
library(purrr)
library(stringr)
nuclear_explosions <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-08-20/nuclear_explosions.csv")
First, let’s make some of the country names look better.
# Recode the all-caps country codes to nicer display names
nuclear_explosions <-
  mutate(nuclear_explosions,
         country = fct_recode(country,
                              France = "FRANCE",
                              China = "CHINA",
                              India = "INDIA",
                              Pakistan = "PAKIST"
         ))
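For readers new to forcats, here is a minimal sketch of how fct_recode() behaves on a toy factor. The vector `toy` is made up for illustration and is not part of the Tidy Tuesday data. New names go on the left, old levels on the right, and any level you don't mention is left unchanged.

```r
library(forcats)

# Hypothetical toy data standing in for the `country` column
toy <- factor(c("FRANCE", "USA", "PAKIST", "USA"))

# Syntax is new_name = "old_level"; unmentioned levels (here "USA") are kept
recoded <- fct_recode(toy, France = "FRANCE", Pakistan = "PAKIST")

levels(recoded)
as.character(recoded)
```

This is why the call above only lists four countries: the remaining levels in the data pass through untouched.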
Now, to put nice labels on our plot, we need a way to determine an appropriate height. Labels should sit high enough that they don’t obstruct the bars. However, to save space, I want to put the legend inside the plot, so we can’t place the labels too high either. To solve this problem, we will count the explosions in each year and write a function that, for any input year, finds the largest yearly count in the window from ten years earlier through that year, then adds a little padding.
(Thanks to Twitter user @msubbaiah1 for providing a short explanation for the gap in testing in 1958, which saved us some Googling.)
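# Count the number of explosions in each year
yearlyExplosions <-
  nuclear_explosions %>%
  group_by(year) %>%
  summarise(n = n())

# Label height for a given year: the tallest bar in the window
# [year - 10, year], plus 5 units of padding
getHeight <- function(year){
  checkYears <- year + seq(-10, 0)
  # Inside filter(), `year` is the column of yearlyExplosions;
  # checkYears was computed above from the function argument
  explosions <- filter(yearlyExplosions, year %in% checkYears) %>% pull(n)
  height <- max(explosions, na.rm = TRUE) + 5
  return(height)
}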
dates <- tibble(year = c(1959, 1996),
                text = map_chr(c("US, UK, and USSR form moratorium on nuclear testing from Nov '58 to Aug '61",
                                 "Comprehensive Nuclear-Test-Ban Treaty signed in Sept '96"),
                               str_wrap, width = 20)) %>%
  mutate(height = map_dbl(year, getHeight))
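To see what getHeight() computes without downloading the data, here is a small self-contained sketch of the same idea on made-up yearly counts. The names `toy_counts` and `window_max_height`, the `lookback` and `pad` parameters, and the counts themselves are all invented for illustration.

```r
library(dplyr)

# Invented yearly counts, NOT the real explosion data
toy_counts <- tibble(
  year = 1950:1962,
  n    = c(0, 2, 5, 3, 8, 1, 4, 12, 30, 0, 0, 6, 20)
)

# Tallest bar in the window [target_year - lookback, target_year], plus padding
window_max_height <- function(target_year, counts, lookback = 10, pad = 5) {
  in_window <- counts %>%
    filter(year %in% (target_year + seq(-lookback, 0))) %>%
    pull(n)
  max(in_window, na.rm = TRUE) + pad
}

window_max_height(1957, toy_counts)  # window 1947-1957, tallest bar is 12
window_max_height(1959, toy_counts)  # window 1949-1959, tallest bar is 30
```

One difference from the version above is that this sketch takes the counts as an argument rather than reading a global object, which makes it a bit easier to try on other data.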
Data Visualization
# Stacked bars by year, with the most frequent testers at the bottom of each stack
ggplot(nuclear_explosions, aes(x = year, fill = fct_rev(fct_infreq(country)))) +
  geom_bar(width = 1, color = "black") +
  # Vertical line from the x-axis up to each annotation
  geom_segment(aes(yend = height, x = year, xend = year),
               y = 0, data = dates, inherit.aes = FALSE) +
  geom_label(aes(x = year, y = height, label = text),
             data = dates, inherit.aes = FALSE, size = 2.5,
             label.r = unit(0, "lines"), vjust = "bottom", hjust = "right") +
  labs(title = "A History of Nuclear Explosions",
       x = "Year", y = "Number of Nuclear Explosions",
       caption = "Visualization: jackmwolf.rbind.io\nData: Stockholm International Peace Research Institute") +
  theme_bw() +
  # Place the legend inside the plot, in the top-right corner
  theme(legend.justification = c(1, 1), legend.position = c(1, 1),
        legend.background = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank()) +
  guides(fill = guide_legend(nrow = 2, reverse = TRUE)) +
  scale_fill_brewer(palette = "Accent", name = "") +
  # expansion() replaces the deprecated expand_scale() (ggplot2 >= 3.3.0)
  scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
  scale_x_continuous(breaks = seq(1945, 2000, by = 5))
Reflections
I used a lot of new functions this week (str_wrap(), geom_segment(), geom_label(), and several others)!
This was also my first time using purrr instead of the apply family of functions. I love how easy it is to use and how intuitive it feels, and I will definitely start to use it more in my work.
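As a rough illustration of the difference, here is the same toy computation written with sapply(), vapply(), and map_dbl(). The input vector and the helper `decade_start` are made up for this sketch. The main practical point is that map_dbl() (like vapply() with a numeric template) either returns a double vector or throws an error, whereas sapply() chooses its return type based on the results.

```r
library(purrr)

# Hypothetical inputs and helper, for illustration only
years <- c(1959, 1996)
decade_start <- function(y) y - y %% 10

from_sapply <- sapply(years, decade_start)              # return type is guessed
from_vapply <- vapply(years, decade_start, numeric(1))  # return type is declared
from_map    <- map_dbl(years, decade_start)             # return type is in the name

from_map
```

All three give the same result here; the purrr version just makes the expected output type obvious at a glance.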
It was fun to put so much effort into one plot, and I enjoyed exploring the wide range of options that ggplot offers.
Thanks for reading! I’ll see you all next week.