We can use the assign function to create new variables.
Most often, I want to assign the variables that I create to the global environment.
assign is particularly useful in loops, simulations, and other situations where variable names are built or chosen conditionally.
The basic syntax of the assign function is:
assign(x, value, pos = -1, envir = as.environment(pos), inherits = FALSE)
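envir: The environment in which to place the new variable. If not specified, it defaults to the environment from which assign is called: the global environment at the top level, or the function's own environment inside a function. .GlobalEnv is often used to assign variables in the global environment explicitly.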
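To illustrate how envir changes where a variable ends up, here is a minimal sketch. The names target, make_vars, local_only, and shared are hypothetical, and a fresh environment from new.env() stands in for .GlobalEnv so the example leaves the workspace untouched.

```r
# Hypothetical target environment, used here instead of .GlobalEnv
target <- new.env()

make_vars <- function() {
  # No envir given: the variable is created in the function's own frame
  assign("local_only", 1)
  # envir given: the variable is created in `target`
  assign("shared", 2, envir = target)
  # Report whether the local variable is visible inside the function
  exists("local_only", inherits = FALSE)
}

seen_inside <- make_vars()

# Check which of the two names actually landed in `target`
in_target_local <- exists("local_only", envir = target, inherits = FALSE)
in_target_shared <- exists("shared", envir = target, inherits = FALSE)
```

Under these assumptions, only shared should be found in target, while local_only disappears when the function returns.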
To generate variables with dynamic names in a loop, we can build each name with paste():
for (i in 1:3) {
  assign(paste("var", i, sep = "_"), i^2)
}
var_3
[1] 9
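Variables created this way usually need to be read back by name as well. The sketch below pairs assign with get() and mget() for that purpose. The environment scratch is a hypothetical stand-in, used so the example does not depend on where it is run.

```r
# Hypothetical scratch environment that holds the generated variables
scratch <- new.env()

for (i in 1:3) {
  assign(paste("var", i, sep = "_"), i^2, envir = scratch)
}

# get() retrieves a single variable from its name given as a string
third <- get("var_3", envir = scratch)

# mget() retrieves several variables at once as a named list
all_vars <- mget(paste("var", 1:3, sep = "_"), envir = scratch)

# Flatten the list into a named numeric vector
squares <- unlist(all_vars)
```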
Next, we will make a for loop that iterates over each element in the years vector.
The paste0() function concatenates its arguments into a single string without any separator.
Here, it is used to dynamically create variable names by combining the string "sales_" with the current year. For example, if year is 2020, the result would be "sales_2020".
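As a quick check of the naming step on its own, the sketch below builds the same name with paste0() and with paste(). The variable names name_a, name_b, and all_names are only for illustration.

```r
year <- 2020

# paste0() joins its arguments with no separator
name_a <- paste0("sales_", year)

# paste() with an explicit separator builds the same string
name_b <- paste("sales", year, sep = "_")

# Both functions are vectorised, so all names can be built in one call
all_names <- paste0("sales_", 2018:2022)
```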
data.frame(month = 1:12, sales = sample(100:200, 12, replace = TRUE)) creates a new data frame in each iteration of the loop. The data frame has two columns: month, which runs from 1 to 12, and sales, a random sample of 12 integers drawn with replacement from 100 to 200. This simulates monthly sales data.
The assign() function binds a value to a name in an environment. The first argument is the name of the variable (as a string), and the second argument is the value to assign. In this snippet, assign() creates a new variable with the name generated by paste0() and stores the newly created data frame in it. Because the loop runs at the top level and no envir is given, each iteration creates a new variable (e.g., sales_2020) in the global environment, containing the corresponding data frame.
years <- 2018:2022
for (year in years) {
  assign(
    paste0("sales_", year),
    data.frame(month = 1:12, sales = sample(100:200, 12, replace = TRUE))
  )
}
sales_2022
month sales
1 1 118
2 2 157
3 3 163
4 4 177
5 5 185
6 6 171
7 7 151
8 8 142
9 9 141
10 10 157
11 11 137
12 12 152
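For comparison, the same simulated data can be kept in a single named list instead of separate variables. This is only a sketch of an alternative, not a replacement for the loop above. The name sales_list is hypothetical.

```r
years <- 2018:2022

# Build one simulated data frame per year and collect them in a list
sales_list <- lapply(years, function(year) {
  data.frame(month = 1:12, sales = sample(100:200, 12, replace = TRUE))
})

# Name the list elements the same way the variables were named
names(sales_list) <- paste0("sales_", years)

# Access one year by name
sales_2022 <- sales_list[["sales_2022"]]
```

A list keeps related objects together, which makes it easier to loop over them later, whereas assign puts each one directly into the environment.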
Next, we create a data frame with one row per country-year pair:
set.seed(1111)
years <- 2018:2022
countries <- c("Austria", "Bahamas", "Canada", "Denmark")
data <- expand.grid(year = years, country = countries)
data$value <- runif(n = nrow(data), min = 100, max = 200)
data
year country value
1 2018 Austria 146.5503
2 2019 Austria 141.2925
3 2020 Austria 190.7003
4 2021 Austria 113.7105
5 2022 Austria 173.8817
6 2018 Bahamas 197.6327
7 2019 Bahamas 187.9960
8 2020 Bahamas 111.6784
9 2021 Bahamas 154.6289
10 2022 Bahamas 114.0116
11 2018 Canada 100.1690
12 2019 Canada 174.8958
13 2020 Canada 175.0958
14 2021 Canada 163.3406
15 2022 Canada 186.8168
16 2018 Denmark 115.9363
17 2019 Denmark 191.6828
18 2020 Denmark 155.7007
19 2021 Denmark 190.0419
20 2022 Denmark 176.5887
We can split this data frame into a list of data frames, one per year:
data_list <- split(data, data$year)
data_list
$`2018`
year country value
1 2018 Austria 146.5503
6 2018 Bahamas 197.6327
11 2018 Canada 100.1690
16 2018 Denmark 115.9363
$`2019`
year country value
2 2019 Austria 141.2925
7 2019 Bahamas 187.9960
12 2019 Canada 174.8958
17 2019 Denmark 191.6828
$`2020`
year country value
3 2020 Austria 190.7003
8 2020 Bahamas 111.6784
13 2020 Canada 175.0958
18 2020 Denmark 155.7007
$`2021`
year country value
4 2021 Austria 113.7105
9 2021 Bahamas 154.6289
14 2021 Canada 163.3406
19 2021 Denmark 190.0419
$`2022`
year country value
5 2022 Austria 173.8817
10 2022 Bahamas 114.0116
15 2022 Canada 186.8168
20 2022 Denmark 176.5887
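A list produced by split() can also be turned into variables in one step with list2env(), which assigns every named element of a list into an environment. The sketch below uses a small made-up data frame (toy) and a hypothetical environment (year_env). The year_ prefix is an assumption, added so the resulting names are syntactic.

```r
# Small made-up data set: two years, two countries
toy <- expand.grid(year = 2018:2019, country = c("Austria", "Canada"))
toy$value <- c(10, 20, 30, 40)

# One data frame per year, named "2018" and "2019"
toy_list <- split(toy, toy$year)

# Prefix the names so they are valid R variable names
names(toy_list) <- paste0("year_", names(toy_list))

# Assign every list element into a dedicated environment
year_env <- new.env()
invisible(list2env(toy_list, envir = year_env))

# List the variables that were created
created <- sort(ls(year_env))
```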
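Next, we store the target environment, here the global environment, in a variable:
env <- .GlobalEnv
Now we can write a function that dynamically creates variables within that environment: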
assign_year_country_dataframes <- function(data, year_col, country_col, env) {
  # Get unique combinations of year and country
  combinations <- unique(data[, c(year_col, country_col)])
  # Iterate over each combination (seq_len() also handles zero rows safely)
  for (i in seq_len(nrow(combinations))) {
    combination <- combinations[i, ]
    year <- combination[[year_col]]
    country <- combination[[country_col]]
    # Subset the data for the current combination
    data_subset <- data[data[[year_col]] == year & data[[country_col]] == country, ]
    # Create a dynamic variable name from country and year, e.g. "Austria2018"
    variable_name <- paste0(gsub(" ", "_", country), year)
    # Assign the subset data to a dynamically named variable in the specified environment
    assign(x = variable_name, value = data_subset, envir = env)
  }
}
Now we can run the function and put all the country-year pairs into the global environment:
assign_year_country_dataframes(data = data, year_col = "year", country_col = "country", env = env)
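After creating many variables this way, it helps to be able to list, check, and remove them by name. The sketch below shows one way to do that with ls(), exists(), and rm(). It uses a hypothetical scratch environment and three made-up one-row data frames in place of the real country-year subsets.

```r
# Hypothetical scratch environment with a few country-year variables
scratch_env <- new.env()
assign("Austria2018", data.frame(value = 1), envir = scratch_env)
assign("Austria2019", data.frame(value = 2), envir = scratch_env)
assign("Canada2018", data.frame(value = 3), envir = scratch_env)

# List only the variables whose names start with "Austria"
austria_names <- ls(envir = scratch_env, pattern = "^Austria")

# Check whether a specific variable was created
has_canada <- exists("Canada2018", envir = scratch_env, inherits = FALSE)

# Remove the Austria variables again, given their names as strings
rm(list = austria_names, envir = scratch_env)
remaining <- ls(envir = scratch_env)
```

The same calls work with envir = .GlobalEnv, in which case rm() removes the variables from the workspace.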




