The skimr package offers a nice way to summarise all the variables in a dataset.
install.packages("skimr")
library(skimr)
The data we’ll look at is from the Correlates of War project. It provides dyadic records of militarized interstate disputes (MIDs) over the period 1816-2010.
skim(mid)
n_missing:
the number of missing values in each variable
complete_rate:
the proportion of values in each variable that are not missing (1 means the variable has no missing values)
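To make these two columns concrete, here is a minimal base R sketch that computes the same quantities by hand. The data frame `toy` and its column names are made up for illustration and are not part of the MID data.

```r
# Hypothetical toy data, not taken from the MID dataset
toy <- data.frame(
  fatalities = c(0, 3, NA, 1, NA),
  duration   = c(10, 25, 7, NA, 40)
)

# Number of missing values in each column (what skim() reports as n_missing)
n_missing <- colSums(is.na(toy))

# Share of non-missing values in each column (what skim() reports as complete_rate)
complete_rate <- colMeans(!is.na(toy))

print(n_missing)
print(complete_rate)
```

A complete_rate close to 1 means the variable is almost fully observed, while a low value flags a variable that may need imputation or should be dropped.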
The next columns give the mean, standard deviation, minimum, 25th percentile, median, 75th percentile and maximum values. In the skim output they are labelled mean, sd, p0, p25, p50, p75 and p100.
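As a rough cross-check, the summary statistics for a single numeric variable can be reproduced in base R. This is a sketch only: the vector `x` is hypothetical, and dropping missing values with `na.rm = TRUE` is assumed here to mirror how skimr summarises numeric columns.

```r
# Hypothetical numeric variable with one missing value
x <- c(2, 4, 4, 6, 9, NA)

# Minimum, quartiles and maximum, ignoring missing values
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE, names = FALSE)
names(q) <- c("p0", "p25", "p50", "p75", "p100")

# Combine mean, standard deviation and quantiles into one named vector
stats <- c(
  mean = mean(x, na.rm = TRUE),
  sd   = sd(x, na.rm = TRUE),
  q
)

print(round(stats, 2))
```

Comparing the mean with the median (p50) is a quick way to spot skew before looking at the histogram.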
The last column is a small histogram of each numeric variable, so you can quickly scan and see whether a variable looks roughly normally distributed, skewed or binary.
