Packages we will need:
library(tidyverse)
library(forcats)
library(ggthemes)
library(ggflags) # provides geom_flag(), used for the country flags below
We are going to look at a few questions from the 2019 US Pew survey on relations with foreign countries.
Data can be found by following this link:
We are going to make bar charts to plot responses to a question put to American participants: Should the US cooperate more or less with some key countries? The countries asked about were China, Russia, Germany, France, Japan and the UK.
Before we dive in, we can find some nice hex colors for the bar chart. There are four possible responses that the participants could give: cooperate more, cooperate less, cooperate the same as before and refuse to answer / don’t know.
pal <- c("Cooperate more" = "#0a9396",
         "Same as before" = "#ee9b00",
         "Don't know" = "#005f73",
         "Cooperate less" = "#ae2012")
We first select the questions we want from the full survey and pivot the dataframe to long form with
pivot_longer(). This way we have a single column with all the different survey responses, which is easier to manipulate.
Then we summarise the data to count the survey responses for each of the six countries, and calculate the frequency of each response as a share of all answers to that country's question.
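The counting and frequency steps can be tried out on a small made-up data frame before running them on the real survey. This is only a sketch: the `toy` tibble and its values are invented for illustration and are not taken from the Pew file, and `.groups = "drop_last"` is written out explicitly here to show why the frequencies are calculated within each question.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the survey: four respondents, two questions
toy <- tibble(
  id  = 1:4,
  Q2a = c(1, 1, 2, 3),
  Q2b = c(2, 2, 2, 9)
)

toy_freq <- toy %>%
  # one row per respondent and question
  pivot_longer(!id, names_to = "survey_question", values_to = "response") %>%
  group_by(survey_question, response) %>%
  # after this step the data is still grouped by survey_question
  summarise(n = n(), .groups = "drop_last") %>%
  # so sum(n) is the number of answers to that question
  mutate(freq = n / sum(n)) %>%
  ungroup()

toy_freq
```

Within each question the `freq` values add up to 1, which is what lets the stacked bars later fill the same length for every country.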
Then we mutate the variables so that we can add flags. The
geom_flag() function from the ggflags package only recognises ISO2 country codes in lower case.
After that we change the factor levels for the four responses so that they run from positive to negative views of cooperation.
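As a quick illustration of the releveling step on its own, here is a minimal sketch using a made-up factor (the `responses` vector is hypothetical). By default R sorts factor levels alphabetically, and `fct_relevel()` moves the levels we name to the front in the order we give them.

```r
library(forcats)

# Hypothetical vector holding the four response labels
responses <- factor(c("Don't know", "Cooperate more",
                      "Cooperate less", "Same as before"))

# Put the levels in the order used for the chart
releveled <- fct_relevel(responses,
                         "Cooperate less", "Same as before",
                         "Cooperate more", "Don't know")

levels(releveled)
```

The level order is what ggplot2 uses to decide the order of the stacked segments and of the legend entries.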
pew %>%
  # keep the respondent id and the six cooperation questions
  select(id = case_id, Q2a:Q2f) %>%
  # one row per respondent and question
  pivot_longer(!id, names_to = "survey_question", values_to = "response") %>%
  group_by(survey_question, response) %>%
  summarise(n = n()) %>%
  # summarise() drops the last grouping level, so sum(n) is the total per question
  mutate(freq = n / sum(n)) %>%
  ungroup() %>%
  mutate(response_factor = as.factor(response)) %>%
  # lower-case ISO2 codes for geom_flag()
  mutate(country_question = ifelse(survey_question == "Q2a", "fr",
    ifelse(survey_question == "Q2b", "gb",
    ifelse(survey_question == "Q2c", "ru",
    ifelse(survey_question == "Q2d", "cn",
    ifelse(survey_question == "Q2e", "de",
    ifelse(survey_question == "Q2f", "jp", survey_question))))))) %>%
  # readable labels for the numeric response codes
  mutate(response_string = ifelse(response_factor == 1, "Cooperate more",
    ifelse(response_factor == 2, "Cooperate less",
    ifelse(response_factor == 3, "Same as before",
    ifelse(response_factor == 9, "Don't know", response_factor))))) %>%
  mutate(response_string = fct_relevel(response_string,
    c("Cooperate less", "Same as before", "Cooperate more", "Don't know"))) -> pew_clean
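The nested ifelse() calls work, but they can be hard to read. One possible alternative, which is not what the code above does, is to look the values up in a named vector. The sketch below reuses the same question-to-country and code-to-label mappings; the `toy` tibble and the names `country_lookup` and `response_lookup` are made up for the example.

```r
library(dplyr)

# Same mappings as in the pipeline above, written as named vectors
country_lookup <- c(Q2a = "fr", Q2b = "gb", Q2c = "ru",
                    Q2d = "cn", Q2e = "de", Q2f = "jp")
response_lookup <- c("1" = "Cooperate more", "2" = "Cooperate less",
                     "3" = "Same as before", "9" = "Don't know")

# Hypothetical rows in the long format produced by pivot_longer()
toy <- tibble(
  survey_question = c("Q2a", "Q2d", "Q2f"),
  response        = c(1, 9, 2)
)

toy_recoded <- toy %>%
  mutate(
    # index the lookup vector by name; unname() drops the leftover names
    country_question = unname(country_lookup[survey_question]),
    response_string  = unname(response_lookup[as.character(response)])
  )

toy_recoded
```

Any code that is missing from the lookup vector comes back as NA, which makes unexpected values easy to spot.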
We next use ggplot to plot out the responses.
We use
position = "stack" to make all the responses “stack” onto each other for each country. We use
stat = "identity" because we are not asking ggplot to count the responses. Rather, we are plotting the
freq variable we calculated above.
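To see what these two arguments do, the sketch below draws the same small made-up data set twice: once with geom_bar() and stat = "identity", and once with geom_col(), which plots the y values as given and stacks by default. The `toy_freq` data frame is invented for the example.

```r
library(ggplot2)

# Hypothetical pre-computed frequencies for two countries
toy_freq <- data.frame(
  country  = c("fr", "fr", "de", "de"),
  response = c("Cooperate more", "Cooperate less",
               "Cooperate more", "Cooperate less"),
  freq     = c(0.7, 0.3, 0.6, 0.4)
)

# geom_bar() counts rows by default, so we tell it to use freq as is
p_bar <- ggplot(toy_freq) +
  geom_bar(aes(x = country, y = freq, fill = response),
           stat = "identity", position = "stack")

# geom_col() is the shorthand for the same thing
p_col <- ggplot(toy_freq) +
  geom_col(aes(x = country, y = freq, fill = response))
```

Both plots should show the same stacked bars, so which one to use is mostly a matter of taste.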
pew_clean %>%
  ggplot() +
  geom_bar(aes(x = forcats::fct_reorder(country_question, freq),
               y = freq,
               fill = response_string),
           color = "#e5e5e5", size = 3,
           position = "stack", stat = "identity") +
  geom_flag(aes(x = country_question, y = -0.05, country = country_question),
            color = "black", size = 20) -> pew_graph
And last, we flip the coordinates, apply our colour palette and change the appearance of the plot with the theme() function.
pew_graph +
  coord_flip() +
  scale_fill_manual(values = pal) +
  ggthemes::theme_fivethirtyeight() +
  ggtitle("Should the US cooperate more or less with the following country?") +
  theme(legend.title = element_blank(),
        legend.position = "top",
        legend.key.size = unit(2, "cm"),
        text = element_text(size = 25),
        legend.text = element_text(size = 20),
        axis.text = element_blank())