Gleb Satyukov
Senior Research Engineer | Data Science Instructor
R Basics 1: https://environ-175.com/basics/1
R Basics 2: https://environ-175.com/basics/2
R Basics 3: https://environ-175.com/basics/3
R Basics 4: https://environ-175.com/basics/4
R Basics 5: https://environ-175.com/basics/5
R Advanced 1: https://environ-175.com/advanced/1
R Advanced 2: https://environ-175.com/advanced/2
R Advanced 3: https://environ-175.com/advanced/3
R Advanced 4: https://environ-175.com/advanced/4
R Advanced 5: https://environ-175.com/advanced/5
R Spatial 1: https://environ-175.com/spatial/1
R Spatial 2: https://environ-175.com/spatial/2
R Spatial 3: https://environ-175.com/spatial/3
R Spatial 4: https://environ-175.com/spatial/4
R Spatial 5: https://environ-175.com/spatial/5
Updated additional resources list!
🔥 File Paths
🔥 Saving in the Cloud
🔥 Creating New Variables
🔥 Function Signature Decomposition
🔥 File Formats: TXT vs CSV vs TSV
🔥 Mutate Function and more on ggplot
🔥 R Basics 3: Assignment Stuff
Every OS uses the same basic file system (FS) approach: a tree of folders and files
UNIX-based systems use forward slashes ('/')
Windows uses backslashes ('\')
Absolute vs Relative paths
/Users/gleb/Documents/Environ-175/Basics-2/assignment-2.R
/Users/gleb/Documents/Environ-175/Basics-1/assignment-1.R
/Users/gleb/Documents/Environ-175/Basics-2/assignment-2.R
/Users/gleb/Documents/Environ-175/Basics-3/assignment-3.R
/Users/gleb/Documents/Environ-175/Basics-4/assignment-4.R
/Users/gleb/Documents/Environ-175/Basics-5/assignment-5.R
C:/Users/gleb/Documents/Environ-175/Basics-1/assignment-1.R
C:/Users/gleb/Documents/Environ-175/Basics-2/assignment-2.R
C:/Users/gleb/Documents/Environ-175/Basics-3/assignment-3.R
C:/Users/gleb/Documents/Environ-175/Basics-4/assignment-4.R
C:/Users/gleb/Documents/Environ-175/Basics-5/assignment-5.R
Preferred way in R on Windows:
read_csv("C:/Users/johndoe/Documents/Environ-175/Basics-1/air_quality.csv")
Or double backslashes (less common):
read_csv("C:\\Users\\johndoe\\Documents\\Environ-175\\Basics-1\\air_quality.csv")
Example file path on a UNIX-based system (macOS shown; on Ubuntu/CentOS the home folders live under /home):
read_csv("/Users/janedoe/Documents/Environ-175/Basics-1/air_quality.csv")
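A minimal sketch of the two Windows spellings above, plus base R's file.path() as one way to build a path without typing the separators by hand. The folder and file names are the ones from the examples; nothing is read from disk here.

```r
# Both spellings from the examples, stored as plain strings (no file is opened)
forward_path <- "C:/Users/johndoe/Documents/Environ-175/Basics-1/air_quality.csv"
backslash_path <- "C:\\Users\\johndoe\\Documents\\Environ-175\\Basics-1\\air_quality.csv"

# "\\" in R source code is ONE backslash character in the string itself
nchar("\\")

# Swapping each backslash for a forward slash gives the preferred spelling
converted_path <- gsub("\\", "/", backslash_path, fixed = TRUE)
identical(converted_path, forward_path)

# file.path() joins the pieces with "/" for you
built_path <- file.path("C:", "Users", "johndoe", "Documents",
                        "Environ-175", "Basics-1", "air_quality.csv")
identical(built_path, forward_path)
```

Building paths with file.path() avoids typos such as a missing or doubled slash.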
Start at the root of the filesystem
The root is also known as /
🔥 Start at the "current" folder
🔥 In R, the current folder is the working directory: check it with getwd()
🔥 It is often, but not always, the folder of the file being executed right now
. is the current directory
.. is one level up ↑
./ is the current directory
../ is one level up ↑
a/b
a//b
a///////b
a/./b
a/../a/b
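The five spellings above can be checked against each other. A rough sketch, assuming a throwaway folder created under tempdir(); the folder name paths-demo is made up for this example.

```r
# Build a throwaway folder structure: <tempdir>/paths-demo/a/b
demo_root <- file.path(tempdir(), "paths-demo")
dir.create(file.path(demo_root, "a", "b"), recursive = TRUE, showWarnings = FALSE)

# Relative paths are resolved against the working directory,
# so move there for the duration of the demo
old_wd <- setwd(demo_root)

variants <- c("a/b", "a//b", "a///////b", "a/./b", "a/../a/b")

# normalizePath() turns each relative path into its absolute form
resolved <- normalizePath(variants, winslash = "/", mustWork = TRUE)
print(unique(resolved))

# Go back to where we started
setwd(old_wd)
```

If all five variants point at the same folder, unique() prints a single absolute path.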
🔥 Absolute paths
🔥 Easier to understand
🔥 Explicit is better than implicit
Microsoft OneDrive
Google Drive
Dropbox
iCloud
...
NordLocker
BackBlaze
Proton
It's better than nothing?
It will affect your file paths / file structure
You can't back up the entire world
So which folders are you going to save?
first_name <- "Alice"
temperature_2025 <- 76.5
No abbreviations, please
Spell out whatever you are trying to say
Make sure your variable names are specific
This way you avoid accidentally overwriting them
What does this code do?
(if you had to guess)
a1581f2a5 <- 20
bcc1f489b <- 35
cee4d0fe2 <- a1581f2a5 * bcc1f489b
print(cee4d0fe2)
What does this code do?
a <- 20
b <- 35
c <- a * b
print(c)
What does this code do?
hourly_wages <- 20
hours_worked <- 35
total_wages <- hourly_wages * hours_worked
print(total_wages)
line 25:
data <- read_csv("/Users/Jim/Docs/ENV175/Basics-2/nlsy.csv")
line 26:
data <- collap(data, child_gpa ~ mom_hsgrad, FUN=fmean)
❌ REALLY NOT OK
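One way to avoid the pattern above is to give each step its own specific name. A minimal sketch, assuming a tiny made-up data frame in place of nlsy.csv and base R's aggregate() swapped in for collap(); nlsy_raw and gpa_by_mom_hsgrad are hypothetical names.

```r
# Made-up stand-in for the NLSY file (values are for illustration only)
nlsy_raw <- data.frame(
  mom_hsgrad = c(0, 0, 1, 1),
  child_gpa  = c(2.0, 3.0, 3.0, 4.0)
)

# The summary gets its OWN name, so the raw data is still available afterwards
gpa_by_mom_hsgrad <- aggregate(child_gpa ~ mom_hsgrad, data = nlsy_raw, FUN = mean)

print(gpa_by_mom_hsgrad)
nrow(nlsy_raw) # the original rows are untouched
```

With separate names, re-running the summary line never destroys the data it depends on.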
How do you clear the environment?
rm(list=ls())
How do you clear the console?
cat("\014")
How do you clear any plots?
dev.off()
# ────────────────────────────────────────────────
# Clears Out Everything
# ────────────────────────────────────────────────
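clear_all <- function() {
  # ls() inside a function only sees the function's own variables,
  # so point both calls at the global environment
  rm(list = ls(envir = globalenv()), envir = globalenv()) # clears the environment
  cat('\014') # clears the console
  if (!is.null(dev.list())) dev.off() # clears the plots, if any are open
}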
# calling the function
clear_all()
A function is a reusable block of code that does something specific.
The key words here are reusable and specific
It's like a recipe that you can use again and again! 👨‍🍳
my_function <- function(arg1, arg2 = "default") {
# Do something with arg1 and arg2
}
my_function: the function name
arg1: a required input
arg2 = "default": optional input with default value
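The same three parts, in a function that actually does something. This is only a sketch: greet(), name, and greeting are made-up names for illustration.

```r
# `name` is required, `greeting` is optional because it has a default value
greet <- function(name, greeting = "Hello") {
  paste0(greeting, ", ", name, "!")
}

greet("Alice")                        # falls back to the default greeting
greet("Alice", greeting = "Welcome")  # overrides the default

# formals() lists the arguments in the order they were declared
names(formals(greet))
```

Calling greet() with no arguments at all fails, because name has no default to fall back on.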
It is the part of the function that tells you:
mean(x, trim = 0, na.rm = FALSE)
For example, this is the signature for the built-in mean() function.
mean(x, trim = 0, na.rm = FALSE) -> Numeric
This is the signature for the built-in mean() function.
It tells you:
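For mean(), each piece of the signature can be tried out directly. A short sketch; the scores vector is made up for illustration.

```r
scores <- c(1, 2, 3, 100, NA)

# x is required; with the default na.rm = FALSE a single NA makes the result NA
mean(scores)

# na.rm = TRUE drops missing values before averaging
mean_without_na <- mean(scores, na.rm = TRUE)

# trim = 0.25 also drops the lowest and highest 25% of the values
trimmed_mean <- mean(scores, trim = 0.25, na.rm = TRUE)

# The result is a double (numeric), even when the input is whole numbers
typeof(mean(1:10))
```

The defaults in the signature are what you get when you leave an argument out.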
A pure function is a function that:
add_numbers <- function(x, y) {
return(x + y)
}
This is a pure function: no surprises
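For contrast, a sketch of a function that is not pure, assuming the usual definition: the same inputs always give the same output, and nothing outside the function is changed. call_count and add_and_count() are hypothetical names.

```r
# Pure: the result depends only on x and y
add_numbers <- function(x, y) {
  return(x + y)
}

# Not pure: reads AND changes a variable that lives outside the function
call_count <- 0
add_and_count <- function(x, y) {
  call_count <<- call_count + 1 # side effect on the outside variable
  return(x + y + call_count)
}

add_numbers(2, 3) # same answer every time
first_result <- add_and_count(2, 3)
second_result <- add_and_count(2, 3) # same inputs, different answer
print(c(first_result, second_result))
```

The second function is the kind of surprise a pure function rules out.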
mutate() adds new columns to a data frame
Run ?mutate to read its help page
Origin: it comes from the dplyr package.
library(dplyr)
new_data <- mutate(mtcars, km_per_liter = mpg * 0.425144)
Here, we're converting miles per gallon (mpg) to kilometers per liter (km/L) 🚗
Let's add a new column that estimates fuel cost per 100 miles (the 4.50 is an assumed price per gallon):
mutate(mtcars,
cost_per_100mi = (100 / mpg) * 4.50)
mutate(mtcars,
km_per_liter = mpg * 0.425144,
cost_per_100mi = (100 / mpg) * 4.50)
Each column can build on others you just created in the same mutate() call
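A small sketch of that idea, assuming dplyr is installed. gallons_per_100mi and fuel_data are hypothetical names, and 4.50 is the same price figure used above.

```r
library(dplyr)

fuel_data <- mutate(mtcars,
  gallons_per_100mi = 100 / mpg,                 # created first
  cost_per_100mi = gallons_per_100mi * 4.50)     # reuses the column from the line above

head(fuel_data[, c("mpg", "gallons_per_100mi", "cost_per_100mi")])
```

Order matters here: a column can only be reused after the line that creates it.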
A bushel is an old-school unit of measurement used mainly for dry stuff like grain, fruit, or veggies.
📏 It's equal to about 9.3 gallons (around 35 liters).
That's like 2 full backpacks worth of apples! 🍎🍎
So when we say "U.S. corn yield was 170 bushels per acre"… now you know! 🌽
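The conversion can be written as two tiny functions. This is a sketch that treats the bushel purely as a volume, as above; the two constants are standard US conversion factors supplied here, and the function names are made up.

```r
liters_per_bushel <- 35.239 # US bushel, by volume
liters_per_gallon <- 3.785  # US liquid gallon

bushels_to_liters <- function(bushels) {
  bushels * liters_per_bushel
}

bushels_to_gallons <- function(bushels) {
  bushels_to_liters(bushels) / liters_per_gallon
}

bushels_to_gallons(1)  # one bushel in gallons
bushels_to_liters(170) # the corn yield example, in liters per acre
```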
Published already
The assignment is due tonight
Tuesday, April 15, 2025 at 11:59 pm PT