How to automate panel data modelling with dynamic formulas in R
Packages we will need: First we create our panel regression function. A plm() call takes several arguments, including the formula, the data and the model type. Here we can change the formula of independent variables that we plug into the model. We type in the specific model formulas using paste. We feed…
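As a rough illustration of building formulas with paste, here is a minimal base R sketch. The variable names (y, x1, x2, x3) and the rhs_options list are hypothetical placeholders, not taken from the post, and the step of passing each formula to plm() is left out.

```r
# Hypothetical variable names, for illustration only
dependent <- "y"
rhs_options <- list(
  base     = c("x1", "x2"),
  extended = c("x1", "x2", "x3")
)

# Build one formula per specification by pasting the pieces together
formulas <- lapply(rhs_options, function(vars) {
  as.formula(paste(dependent, "~", paste(vars, collapse = " + ")))
})

# Each element is an ordinary formula object that a model function can accept
print(formulas$extended)
```

Keeping the right-hand sides in a named list means a new specification only needs one extra list entry.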
How to improve graphs with themes and palettes: Top packages in R
In this blog, we can look at ways to make our plots and graphs more appealing to the eye. Before we go about working on the aesthetics, let’s build and save a typical political science graph. We will examine the inverted U shape between democracy and level of mass mobilization across six different regions. The…
How to download and animate the Varieties of Democracy (V-DEM) dataset in R
In this blog post, we will download the V-DEM datasets with their vdemdata package. It is still in development, so we will use the install_github() function from the devtools package. We can then download the dataset with one line of code. We can use the find_var function to get information on variables based…
Tips and code snippets to improve ggplot graphs and plots in R
Some code snippets to improve graph appearance and readability! Compare the first basic graph with the second more informative graph. Dealing with the x and y axes can be a pain. In this code: The breaks argument of scale_y_continuous() is set using a custom function that takes limits as input (which represents the range of…
How to only label the outliers in a ggplot graph with R
Another blog post that I am writing so I have easy access to code snippets for my own records. We will use an example with data from V-DEM. Click here to read more about downloading the V-DEM dataset. v2x_jucon: To what extent does the executive respect the constitution and comply with court rulings, and to what extent is…
How to rowwise sum the variables that contain the same variable string pattern in R
This is another blog post so that I can keep a snippet of code for myself! And if you find it helpful too, all the better. We will be completing rowwise computations, which is not the default in R. Therefore, we need to explicitly state that this is what we are hoping to do. In this…
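For a rough idea of a rowwise sum over columns that share a name pattern, here is a base R sketch. It uses rowSums() with grepl() rather than an explicit rowwise step, so it is an alternative to whatever the post itself uses, and the data frame and the "aid_" prefix are made up for illustration.

```r
# Made-up data: two columns share the "aid_" prefix, the others do not
df <- data.frame(
  id            = 1:3,
  aid_health    = c(1, 2, NA),
  aid_education = c(4, 5, 6),
  population    = c(10, 20, 30)
)

# Logical vector marking the columns whose names start with the pattern
aid_cols <- grepl("^aid_", names(df))

# Sum across the matching columns within each row, ignoring missing values
df$aid_total <- rowSums(df[, aid_cols, drop = FALSE], na.rm = TRUE)

print(df)
```

Note that na.rm = TRUE treats a missing value as zero in the total, which may or may not be what you want.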
Removing variables from V-DEM according to string suffixes
In this blog, I just want to keep the code that removes the Varieties of Democracy variables that are not continuous and then runs exploratory correlation analysis. Click here to read more about downloading the V-DEM dataset directly into R via the vdemdata package. Click here to download the V-DEM dataset…
How to run multiple t-tests in a function with the broom package in R
Packages we will need: We will use the Varieties of Democracy dataset again. We will use a t-test comparing democracies (boix == 1) and non-democracies (boix == 0) in the years 2000 to 2020. We need to remove the instances where boix is NA. I choose three t-tests to run simultaneously. Comparing democracies and non-democracies…
How to use the assign() function in R
We can use the assign() function to create new variables. Most often I want to assign variables that I create to the Global Environment. assign() is particularly useful in loops, simulations, and scenarios involving conditional variable naming or creation. The basic syntax of the assign function is assign(x, value, pos = -1, envir = as.environment(pos), inherits…
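To make the loop use case concrete, here is a small sketch that creates one data frame per year in the global environment. The object names (df_2018, df_2019, df_2020) and their contents are invented for illustration.

```r
# Create one object per year, naming each object from the loop variable
for (yr in 2018:2020) {
  assign(
    paste0("df_", yr),                           # name built as a string
    data.frame(year = yr, value = yr - 2000),    # toy contents
    envir = globalenv()                          # put it in the Global Environment
  )
}

# The objects now exist under their constructed names
print(exists("df_2019"))
print(df_2020)
```

Storing the results in a named list is often easier to work with afterwards, but assign() is handy when separate top-level objects are really what you need.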
How to run cross-validation of decision-tree models with xgboost in R (PART 4 Tidymodels series)
In this blog post, we will cross-validate different boosted tree models and find the one with the best root mean square error (RMSE). Specifically, part 2 goes into more detail about RMSE as a way to choose the best model. Click here to read part 1, part 2 or part 3 of this series on tidymodel…
How to run decision tree analysis with xgboost in R (Tidymodels Series PART 3)
Packages we will need: In this blog post, we are going to run boosted decision trees with xgboost in tidymodels. Boosted decision trees are a type of ensemble learning technique. Ensemble learning methods combine the predictions from multiple models to create a final prediction that is often more accurate than any single model’s prediction. The…
How to run linear regression analysis with tidymodels in R for temporal prediction (Tidymodels Series PART 2)
Packages we will need: We will look at the Varieties of Democracy dataset. We will create two datasets: one for all years EXCEPT 2020 and one for only 2020. First we build the model. We will look at whether the level of public sector theft can predict judicial corruption levels. The model will have three parts…
How to run regressions with the tidymodels package in R: PART 1
The tidymodels framework in R is a collection of packages for modeling. Within tidymodels, the parsnip package is primarily responsible for specifying models in a way that is independent of the underlying modeling engines. The set_engine() function in parsnip allows users to specify which computational engine to use for modeling, enabling the same model specification…
How to use the mget() function in R
The mget() function is a multiple get() function. We use mget() to retrieve multiple objects by their names. I have found this helpful when I want to perform operations on many data.frames (with similar names) without having to type out each name. For example, I can create four data.frames. They all have similar name patterns.…
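A minimal sketch of that idea, assuming four toy data.frames named df_1 to df_4 (the names and contents are invented for illustration):

```r
# Four toy data.frames with a shared naming pattern
df_1 <- data.frame(id = 1:2, group = "a")
df_2 <- data.frame(id = 3:4, group = "b")
df_3 <- data.frame(id = 5:6, group = "c")
df_4 <- data.frame(id = 7:8, group = "d")

# Build the object names as strings, then fetch them all at once
df_names <- paste0("df_", 1:4)
df_list  <- mget(df_names)   # a named list, one element per object

# Apply one operation to all of them, here stacking the rows together
combined <- do.call(rbind, df_list)
print(nrow(combined))
```

Because mget() returns a named list, the result can go straight into lapply() or similar functions.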
Random coding tips I always forget: 50+ tips for tidyverse, purrr, stringr, lubridate, janitor and other packages
Packages we will need: I use this post to keep code bits all in one place so I can check back here when I inevitably forget them. For most of the snippets, we can use a map data.frame that we can download from the rnaturalearth package. So the code below downloads a map of the…
How to graph model variables with the tidy package in R
Packages we will need: We will make a linear regression model and graph the coefficients to show which variables are statistically significant in the regression with ggplot. First we will download some variables from the World Bank Indicators package. Click here to read more about the WDI package. We will use Women Business and the…
How to graph proportions with the waffle and treemapify packages in R
Packages we will need: In this blog, we will look at visualising proportions in a few lines. I have some aid data and I want to see what proportion of the aid does not have a theme category. This can be useful to visualise incomplete data across years or across categories. First, we can make…
How to graph bubble charts and treemap charts in R
Packages we will need: In this blog, we will look at different types of charts that we can run in R. Both the bubble and treemap charts are simple to run. Before we begin, we will choose some hex colors for the palette. I always use the coolors palettes website to find nice colours. First,…
How to analyse Afrobarometer survey data with R. PART 3: Cronbach’s Alpha, Exploratory Factor Analysis and Correlation Matrices
Packages we will need: How do people view the state of the economy in the past, present and future for the country and how they view their own economic situation? Are they highly related concepts? In fact, are all these questions essentially asking about one thing: how optimistic or pessimistic a person is about the…
How to calculate a linguistic Herfindahl-Hirschman Index (HHI) with Afrobarometer survey data in R PART 2
Packages we will need: In this blog, we will look at calculating a variation of the Herfindahl-Hirschman Index (HHI) for languages. This will give us a figure that tells us how diverse / how concentrated the languages are in a given country. We will continue using the Afrobarometer survey in this post! Click here to…
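For orientation, the standard HHI is the sum of squared shares. The sketch below applies that textbook form to a made-up vector of survey responses for one country; the variation used in the post may differ, and the language labels are placeholders.

```r
# Made-up responses: the language reported by six respondents in one country
languages <- c("A", "A", "A", "B", "B", "C")

# Share of respondents speaking each language
shares <- prop.table(table(languages))

# Standard HHI: sum of squared shares
# (1 means a single language; values near 0 mean many evenly spread languages)
hhi <- sum(shares^2)
print(hhi)
```

A higher value means the languages are more concentrated, and a lower value means they are more diverse.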