Gleb Satyukov
Senior Research Engineer | Data Science Instructor
Population: ~87,486
Land Area: ~180 sq miles
Wikipedia: https://en.wikipedia.org/wiki/Andorra
R Basics 1: https://environ-175.com/basics/1
R Basics 2: https://environ-175.com/basics/2
R Basics 3: https://environ-175.com/basics/3
R Basics 4: https://environ-175.com/basics/4
R Basics 5: https://environ-175.com/basics/5
R Advanced 1: https://environ-175.com/advanced/1
R Advanced 2: https://environ-175.com/advanced/2
R Advanced 3: https://environ-175.com/advanced/3
R Advanced 4: https://environ-175.com/advanced/4
R Advanced 5: https://environ-175.com/advanced/5
R Spatial 1: https://environ-175.com/spatial/1
R Spatial 2: https://environ-175.com/spatial/2
R Spatial 3: https://environ-175.com/spatial/3
R Spatial 4: https://environ-175.com/spatial/4
R Spatial 5: https://environ-175.com/spatial/5
Clean your environment
Use proper file paths, use data folder
Use proper code spacing, use even more spacing!
Use inline and block comments!!
Use correct variable names (lowercase and underscores)
Save charts programmatically with ggsave
Save final data programmatically with write_csv
Using Global Variables
Set a directory using path_main
Keep your data in a dedicated data folder
Inspect the data after loading using head()
Be consistent with your use of quotes (' vs ")
Make sure to export both graphs and final data (using write_csv(data, path))
Follow instructions in the assignments exactly
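The best practices above can be combined into one short script. This is a minimal sketch, not the assignment template: tempdir() stands in for your own project folder, the data and output subfolders are an assumed layout, and the built-in mtcars table stands in for real course data.

```r
library(readr)
library(ggplot2)

# Clean the environment before doing anything else
rm(list = ls())

# Global variables: tempdir() is a stand-in for your own project folder
path_main   <- tempdir()
path_data   <- file.path(path_main, "data")
path_output <- file.path(path_main, "output")

# Create the folders if they do not exist yet
dir.create(path_data, showWarnings = FALSE)
dir.create(path_output, showWarnings = FALSE)

# Put a stand-in data file into the data folder (mtcars is built into R)
write_csv(mtcars, file.path(path_data, "cars.csv"))

# Load the data from the data folder and inspect it right away
cars_data <- read_csv(file.path(path_data, "cars.csv"), show_col_types = FALSE)
head(cars_data)

# Build a simple chart
cars_plot <- ggplot(cars_data, aes(x = wt, y = mpg)) +
  geom_point()

# Save the chart and the final data programmatically
ggsave(file.path(path_output, "cars_plot.png"), plot = cars_plot, width = 6, height = 4)
write_csv(cars_data, file.path(path_output, "cars_final.csv"))
```

Because every path is built from path_main with file.path(), moving the project only requires changing one line.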
Import raster data
Import vector data
Coordinate Reference Systems
Plot maps with rasters and vectors
New libraries!
library(terra)
library(tidyterra)
library(sf)
We will learn how to combine and manipulate different spatial data objects
As well as combining spatial objects with tables and other rectangular type data
Pros:
- Drawing elements, e.g. roads
- Viewing different layers
- Clicking on cells or shapes
Cons:
- ArcGIS is proprietary software
- GUI-Based
Free and Open Source
- Easier for statistical analysis
- Code-driven, familiar tools
- Transparency, reproducibility, and automation
- Great for building data science workflows
There are 2 distinct classes of spatial data:
Raster data
Vector data
Each type of data will need to be treated differently!
We will learn a new set of operations for spatial data
Vectors are a series of points
One important difference between rasters and vectors is that rasters give a value for every pixel on the map
Conversely, vector points, lines, and polygons don't usually indicate a value
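The difference can be seen with two tiny objects built in memory. This is an illustrative sketch using terra; the grid size, values, and coordinates are made up for the example.

```r
library(terra)

# Raster: a 3 x 4 grid in which every cell (pixel) carries a value
grid <- rast(nrows = 3, ncols = 4, xmin = 0, xmax = 4, ymin = 0, ymax = 3)
values(grid) <- 1:12

ncell(grid)          # number of cells in the grid
values(grid)[, 1]    # one value per cell

# Vector: three points defined only by their coordinates
coords <- cbind(x = c(0.5, 2.5, 3.5), y = c(0.5, 1.5, 2.5))
points_only <- vect(coords, type = "points")

nrow(points_only)    # number of features
ncol(points_only)    # number of attribute columns (none here)
```

The raster stores a value for each of its cells, while the points exist as pure geometry until attributes are attached to them.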
a type of vector data
contains lat and long
can be stored as a csv
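As a rough sketch of how such a table becomes vector data, the sf function st_as_sf() can turn lat/long columns into point geometry. The station names and coordinates below are hypothetical, and the table is typed in by hand instead of being read from a csv file.

```r
library(sf)

# Hypothetical table of locations, as it might be read from a csv file
stations <- data.frame(
  name = c("station_a", "station_b", "station_c"),
  long = c(-157.8, -156.3, -155.1),
  lat  = c(21.3, 20.8, 19.7)
)

# Convert the table to vector point data; EPSG:4326 is WGS84 lat/long
stations_sf <- st_as_sf(stations, coords = c("long", "lat"), crs = 4326)

head(stations_sf)
```

Note that coords lists longitude first (x) and latitude second (y).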
Shape files are a common format for storing vector data
Shape files come with a set of different files
Typically they come in a single zipped bundle
borders.zip
- borders.cpg
- borders.dbf
- borders.prj
- borders.shp
- borders.shx
File | Extension | Description |
---|---|---|
borders.shp | .shp | Main file: stores the actual shapes (geometry) |
borders.shx | .shx | Index file: helps software locate features quickly |
borders.dbf | .dbf | Attribute table: stores data about each shape (like country names, population, etc.) |
borders.prj | .prj | Projection info: defines the coordinate reference system (CRS: for example WGS84 for latitude/longitude) |
borders.cpg | .cpg | (Optional) Character encoding for text data |
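To see this bundle of files appear, one option is to write a tiny shapefile yourself and list the folder. This is only a sketch: the square polygon, the name demo.shp, and the temporary folder are all made up for the example.

```r
library(sf)

# One square polygon with a name attribute (made-up demo shape)
ring <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))
demo_shape <- st_sf(
  name = "demo",
  geometry = st_sfc(st_polygon(list(ring)), crs = 4326)
)

# Start from a fresh, empty temporary folder
demo_folder <- file.path(tempdir(), "demo_shapefile")
unlink(demo_folder, recursive = TRUE)
dir.create(demo_folder)

# Write the shape as a shapefile
st_write(demo_shape, file.path(demo_folder, "demo.shp"), quiet = TRUE)

# List the files that make up the shapefile
list.files(demo_folder)
```

A single st_write() call produces several files with the same base name, which is why shapefiles are usually shared as a zipped bundle.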
Some of the more common CRS are the World Geodetic System (WGS84), the North American Datum 1983 (NAD83), and Universal Transverse Mercator (UTM)
You can load the shapefile using the sf package:
library(sf)
hawaii_borders <- st_read("path/to/folder/borders.shp")
borders.*
Note that instead of using our traditional View() function, we are now using the built-in (Base R) plot() function to inspect our shapefile data
#####################
# Inspecting the data
#####################
head(hawaii_borders)
plot(hawaii_borders)
Our new best practice is to check which CRS is used
CRS has to match across different spatial objects
#######################
# STEP 4. CHECK / FIX CRS
#######################
crs(plastics, describe=TRUE)
crs(hawaii_borders, describe=TRUE)
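If the two CRS turn out to differ, one common fix is to reproject the vector data to the raster's CRS. The sketch below uses made-up stand-ins for both objects (demo_raster and demo_borders are not the course data), and it assumes the borders arrive in NAD83 while the raster is in WGS84.

```r
library(terra)
library(sf)

# Stand-in raster in WGS84 (EPSG:4326)
demo_raster <- rast(nrows = 10, ncols = 10, xmin = -162, xmax = -152,
                    ymin = 16, ymax = 26, crs = "EPSG:4326")
values(demo_raster) <- 1:100

# Stand-in border polygon in NAD83 (EPSG:4269)
ring <- rbind(c(-160, 18), c(-154, 18), c(-154, 23), c(-160, 23), c(-160, 18))
demo_borders <- st_sf(
  name = "demo",
  geometry = st_sfc(st_polygon(list(ring)), crs = 4269)
)

# Check: do the two objects use the same CRS?
crs_match_before <- st_crs(demo_borders) == st_crs(crs(demo_raster))

# Fix: reproject the vector data to the raster's CRS if they differ
if (!crs_match_before) {
  demo_borders <- st_transform(demo_borders, crs(demo_raster))
}

# Check again after the fix
crs_match_after <- st_crs(demo_borders) == st_crs(crs(demo_raster))
```

Reprojecting the vector layer is usually the gentler choice, since reprojecting a raster means resampling its cell values.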
Cartographic Boundary Files
We'll need to use Hawaii borders data from the Census:
https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html
One of the most common systems is based on latitude and longitude:
- Latitude tells you how far north or south a place is from the Equator — the imaginary line that circles the Earth halfway between the poles
- Longitude tells you how far east or west a place is from the prime meridian, which runs from pole to pole through Greenwich, England
The Mercator projection preserves direction and is useful for navigation, but distances and areas are distorted, especially near the polar regions: https://en.wikipedia.org/wiki/Mercator_projection
Gall-Peters projections: https://en.wikipedia.org/wiki/Gall%E2%80%93Peters_projection
Other projections: https://en.wikipedia.org/wiki/List_of_map_projections
There is no perfect map projection because we are projecting the 3D surface of our roughly spherical Earth onto a 2D surface, which will always introduce distortion
Each projection must sacrifice accuracy in at least one of these areas:
- Shape
- Area
- Distance
- Direction
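The area trade-off can be measured directly. The sketch below is an illustration under stated assumptions: it builds two made-up cells of 1 degree by 1 degree, one at the Equator and one at 60 degrees north, and compares their area on the globe with their area after projecting to Web Mercator (EPSG:3857), a Mercator variant chosen here only for convenience.

```r
library(sf)

# Helper (hypothetical): a 1 x 1 degree cell whose lower edge sits at `lat`
make_cell <- function(lat) {
  ring <- rbind(c(0, lat), c(1, lat), c(1, lat + 1), c(0, lat + 1), c(0, lat))
  st_sfc(st_polygon(list(ring)), crs = 4326)
}

cells <- c(make_cell(0), make_cell(60))

# Area measured on the globe (square meters)
true_area <- as.numeric(st_area(cells))

# Area measured on the flat Web Mercator map (square meters)
mercator_area <- as.numeric(st_area(st_transform(cells, 3857)))

# How many times larger each cell looks on the Mercator map
area_ratio <- mercator_area / true_area
area_ratio
```

The ratio stays close to 1 at the Equator and grows to roughly 4 at 60 degrees north, which is the price Mercator pays for preserving direction.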
Mandatory in the European Union starting July 2024
https://en.wikipedia.org/wiki/Bottle_cap#Tethered_Caps
The aim of this project is to locate the Great Pacific Garbage Patch
It is an accumulation of marine debris, primarily consisting of plastics
This garbage patch is located somewhere in the North Pacific Ocean
The Great Pacific Garbage Patch poses significant environmental threats
Animals can ingest the plastics, which harms marine life
The plastics introduce harmful chemicals into the marine food chain
Animals can get tangled up in the plastics and die
The patch is estimated to be larger than the size of Texas, though its exact size and boundaries can vary due to factors such as wind and ocean currents
We are going to see if we can identify its size and location on June 1, 2017, using NASA estimates derived from satellite data
The estimates of microplastic concentrations come from NASA's CYGNSS project
Scientists Use NASA Satellite Data to Track Ocean Microplastics From Space: https://www.nasa.gov/centers-and-facilities/goddard/scientists-use-nasa-satellite-data-to-track-ocean-microplastics-from-space/
Emergence of a neopelagic community through the establishment of coastal species on the high seas
The varying shades of red illustrate the concentration of plastics/microplastics
There are about 4 million microplastic particles (about 1 mm in size) per square kilometer in the worst spots
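To put that number on a human scale, a quick unit conversion helps. This is plain arithmetic on the figure quoted above, with no additional data.

```r
# Concentration quoted for the worst spots
particles_per_km2 <- 4e6

# 1 square kilometer = 1,000 m x 1,000 m = 1,000,000 square meters
m2_per_km2 <- 1000^2

# Particles per square meter
particles_per_m2 <- particles_per_km2 / m2_per_km2
particles_per_m2   # 4
```

So even in the worst spots this works out to about 4 tiny particles per square meter of ocean surface.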
Note: this data is not an aerial photograph or a satellite image as you might see from space
1. Clean up environment
2. Load required libraries
3. Import raster plastics data
4. Import vector border data
5. Check the Coordinate Reference System
6. Plot our map with ggplot()
7. Export plot with ggsave()
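The seven steps can be sketched end to end as follows. This is not the assignment solution: the raster and the border polygon are made-up stand-ins built in memory (no NASA or Census files are read), the object names are chosen for the example, and the plot is written to a temporary folder.

```r
# 1. Clean up environment
rm(list = ls())

# 2. Load required libraries
library(terra)
library(tidyterra)
library(sf)
library(ggplot2)

# 3. Raster stand-in for the plastics data (made-up values, not NASA data)
plastics <- rast(nrows = 20, ncols = 20, xmin = -170, xmax = -130,
                 ymin = 10, ymax = 50, crs = "EPSG:4326")
values(plastics) <- seq_len(ncell(plastics))

# 4. Vector stand-in for the border data (a made-up rectangle)
ring <- rbind(c(-160, 18), c(-154, 18), c(-154, 23), c(-160, 23), c(-160, 18))
borders <- st_sf(
  name = "demo",
  geometry = st_sfc(st_polygon(list(ring)), crs = 4326)
)

# 5. Check the CRS and reproject the vector data if it does not match
crs_match <- st_crs(borders) == st_crs(crs(plastics))
if (!crs_match) {
  borders <- st_transform(borders, crs(plastics))
}

# 6. Plot the raster first, then draw the borders on top
plastics_map <- ggplot() +
  geom_spatraster(data = plastics) +
  geom_sf(data = borders, fill = NA, color = "black") +
  scale_fill_gradient(low = "white", high = "red", na.value = "transparent")

# 7. Export the plot
map_file <- file.path(tempdir(), "plastics_map.png")
ggsave(map_file, plot = plastics_map, width = 7, height = 5)
```

In the real assignment, steps 3 and 4 would load the files with rast() and st_read() instead of building objects by hand; the remaining steps keep the same shape.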
Will be published on Canvas today
Assignment is going to be due this Friday
Due Date: Friday May 23, 2025 at 11:59 pm PT