Running tidy t-tests with the infer package in R

Packages we will need:

library(tidyverse)
library(tidyr)
library(infer)
library(bbplot)
library(ggthemes)

For this t-test, we will compare US millenials and non-millenials and their views of the UK’s influence in the world.

The data will come from Chicago Council Survey of American Public Opinion on U.S. Foreign Policy

Click here to download 2017 policy survey data

The survey investigates American public opinion on foreign policy. It focuses on respondents’ opinions of the United States’ leadership role in the world and the challenges the country faces domestically and internationally.

The question on the UK’s influence asks how much influence you think the UK has in the world. Please answer on a 0 to 10 scale; with 0 meaning they are not at all influential and 10 meaning they are extremely influential.

First we select and recreate the variables

fp %>%
  select(
    milennial = XMILLENIALSSAMPLEFLAG,
    uk_influence = Q50_10) %>%
  separate(
    col = milennial,
    into = c("milennial_num", "milennial_char"),
    sep = '[)]',
    remove = TRUE) %>% 
  mutate(
     uk_influence = as.character(uk_influence),
     uk_influence = parse_number(uk_influence)) %>% 
  filter(uk_influence != -1) %>% 
  tidyr::drop_na(milennial_char) -> mil_fp

With the infer package, we can run a t-test:

mil_fp %>% 
  t_test(formula = uk_influence ~ milennial_char,
         alternative = "less")%>% 
  kable(format = "html")
statistic t_df p_value alternative estimate lower_ci upper_ci
-3.048249 1329.469 0.0011736 less -0.3274509 -Inf -0.1506332

There is a statistically significant difference between milennials and non-milennials.

We can graph a box plot.

mil_fp %>% 
  ggplot(mapping = aes(x = milennial_char,
                       y = uk_influence,
                       fill = milennial_char)) +
  geom_jitter(aes(color = milennial_char),
              size = 2, alpha = 0.5, width = 0.3) +
  geom_boxplot(alpha = 0.4) +
  coord_flip() + bbplot::bbc_style() +
  scale_fill_manual(values = my_palette) + 
  scale_color_manual(values = my_palette)

And a quick graph to compare UK with other countries: Germany and South Korea

mil_fp %>% 
  select(milennial_char, uk_influence, sk_influence, ger_influence) %>% 
  pivot_longer(!milennial_char, names_to = "survey_question", values_to = "response")  %>% 
  group_by(survey_question, response) %>% 
  summarise(n = n()) %>%
  mutate(freq = n / sum(n)) %>% 
  ungroup() %>% 
  filter(!is.na(response)) %>% 
  mutate(survey_question = case_when(survey_question == "uk_influence" ~ "UK",
survey_question == "ger_influence" ~ "Germany",
survey_question == "sk_influence" ~ "South Korea",
TRUE ~ as.character(survey_question))) %>% 
  ggplot() +
  geom_bar(aes(x = forcats::fct_reorder(survey_question, freq), 
               y = freq, fill = as.factor(response)), 
           color = "#e5e5e5", 
           size = 2, 
           position = "stack",
           stat = "identity") + 
  coord_flip() + 
  scale_fill_brewer(palette = "RdBu") + 
  ggthemes::theme_fivethirtyeight() + 
  ggtitle("View of Influence in the world?") +
  theme(legend.title = element_blank(),
        legend.position = "top",
        legend.key.size = unit(0.78, "cm"),
        text = element_text(size = 25),
        legend.text = element_text(size = 20))