Include country labels to a regression plot with ggplot2 package in R

Sometimes the best way to examine the relationship between our variables of interest is to plot it out and give it a good looking over. For me, it’s most helpful to see where different countries are in relation to each other and to see any interesting outliers.

For this, I can use the geom_text() function from the ggplot2 package.

I will look at the relationship between economic globalization and social globalization in OECD countries in the year 2000.

The KOF Globalisation Index, introduced by Dreher (2006) measures globalization along the economicsocial and political dimension for most countries in the world

First, as always, we install and load the necessary package. This time, it is the ggplot2 package

install.packages("ggplot2")
library(ggplot2)

Next add the following code:

fin <- ggplot(oecd2000, aes(economic_globalization, social_globalization)) 
        + ggtitle("Relationship between Globalization Index Scores among OECD countries in 2000")
        + scale_x_continuous("Economic Globalization Index")
        + scale_y_continuous("Social Globalization Index") 
        + geom_smooth(method = "lm") 
        + geom_point(aes(colour = polity_score), size = 2) + labs(color = "Polity Score")
        + geom_text(hjust = 0, nudge_x = 0.5, size = 4, aes(label = country)) 

fin 

In the aes() function, we enter the two variables we want to plot.

Then I use the next three lines to add titles to axes and graph

I use the geom_smooth() function with the “lm” method to add a best fitting regression line through the points on the plot. Click here to learn more about adding a regression line to a plot.

I add a legend to examine where countries with different democracy scores (taken from the Polity Index) are located on the globalization plane. Click here to learn about adding legends.

The last line is the geom_text() function that I use to specify that I want to label each observation (i.e. each OECD country) with its name, rather than the default dataset number.

Some geom_text() commands to use:

  • nudge_x (or nudge_y) slightly “nudge” the labels from their corresponding points to help minimise messy overlapping.
  • hjust and vjust move the text label “left”, “center”, “right”, “bottom”, “middle” or “top” of the point.

Yes, yes! There is a package that uses the color palettes of Wes Anderson movies to make graphs look just beautiful. Click here to use different Wes Anderson aesthetic themed graphs!

zissou_colors <- wes_palette("Zissou1", 100, type = "continuous")

fin + scale_color_gradientn(colours = zissou_colors)

Which outputs:

Interestingly, it seems that at the very bottom left hand corner of the plot (which shows the countries that are both low in economic globalization and low in social globalization), we have two OECD countries that score high on democracy – Japan and South Korea- right next to two countries that score the lowest in the OECD on democracy, Turkey and Mexico.

So it could be interesting to further examine why these countries from opposite ends of the democracy spectrum have similar pattern of low globalization. It puts a spanner in the proverbial works with my working theory that countries higher in democracy are more likely to be more globalized! What is special about these two high democracy countries that gives them such low scores on globalization.

Create facetted scatterplots with the ggplot2 package in R

If I want to graphically display the relationship between two variables, the ggplot2 package is a very handy way to produce graphs.

For example, I can use the ggplot2 package to graphically examine the relationship between civil society strength and freedom of citizens from torture. Also I can see whether this relationship is the same across regime types.

I choose one year from my dataframe to examine.

data2000 <- myPanel[which(myPanel$year == "2000"),]

Next, I install the ggplot2 package

install.packages("ggplot2")
library(ggplot2)

The grammar of ggplot2 includes:

  • aes() indicates how variables are mapped to visual properties or aesthetics. The first variable goes on the x-axis and the second variable goes on the y-axis.
  • geom_point() creates a scatterplot style graph. Alternatives to this are geom_line(), which creates a line plot and geom_histogram() which creates a histogram plot.

ggplot(data2000, aes(v2xcs_ccsi, v2cltort)) + geom_point() +
xlab("Civil society robustness") +
ylab("Freedom from torture")

Next we can add information on regime types, a categorical variable with four levels.

0 = closed autocracy

1 = electoral autocracy

2 = electoral democracy

3 = liberal democracy

In the aes() function, add colour = regime to differentiate the four categories on the graph

ggplot(data2000, aes(v2xcs_ccsi, v2x_clphy, colour = regime)) +
geom_point()

Alternatively we can use the facet_wrap( ~ regime) function to create four separate scatterplots and examine the relationship separately.

ggplot(data2000, aes(v2xcs_ccsi, v2x_clphy, colour = regime)) +
geom_point() +
facet_wrap(~regime) +
xlab("Civil society robustness") +
ylab("Freedom from torture")

Lastly, we can add a linear model line (method = "lm") with a grey standard error bar (se = TRUE) in the geom_smooth() function.

ggplot(data2000, aes(v2xcs_ccsi, v2x_clphy, colour = regime)) +
geom_point() +
facet_wrap(~regime) +
geom_smooth(method = "lm", se = TRUE) +
xlab("Civil society robustness") +
ylab("Freedom from torture")

In these graphs, we can see that as civil society robustness score increases, the likelihood of a life free from torture increases! Pretty intuitive result and we could argue that there is a third variable – namely strong democratic institutions – that drives this positive relationship.

The graphs break down this relationship across four different regime types, ranging from the most autocratic in the top left hand side to the most democratic in the bottom right. There is more variety in this relationship with closed autocracies (i.e. the red points), with some points deviating far from the line.

The purple graph – liberal democracies – shows a tiny amount of variance. In liberal democracies, it appears that all countries score highly in both civil society robustness and freedom from torture!