3  R-spatial ecosystem

In this session you should learn:

A small survey 📝

To tailor the following sessions to your interests, please fill in this survey. Thank you! 💐

3.1 R-spatial ecosystem

R-Spatial can be loosely defined as the ecosystem of code, projects and people using R for working with and adding value to spatial data.1

The R-spatial organization attempts to harmonise packages for spatial analysis in R, with a large number of R packages depending on {sf}, the core package for spatial data handling.

A good overview of the available spatial libraries can be found in the CRAN Task View: Analysis of Spatial Data

R-spatial ecosystem from E. Pebesma & Bivand (2023)

R-spatial ecosystem from E. Pebesma & Bivand (2023)

3.1.1 Rspatial counterpart

Not to be confused with R-spatial, the Rspatial resources include a different way to handle geospatial data in R. The main package {terra} provides classes and methods to represent raster and vector objects. It works well on large data sets and is a fast implementation mainly because it is written directly in C++. This means that, for example, a Python module could also be built on top of this library2. {terra} is based on C++ modules like gdal, geos and ncdf and builds it’s own spat module on C++ that interacts with R via the Rcpp module3.

{terra} set-up from Hijmans (2020)

{terra} set-up from Hijmans (2020)

3.2 Geometries (R2)

As the name of the package says, {sf} is meant to deal with simple features. Polygon geometries in particular also have to follow certain rules in order to be valid simple features.

library(sf)
Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
(ls <- st_linestring(rbind(c(0,0), c(1,1), c(2,2), c(0,2), c(1,1), c(2,0))))
LINESTRING (0 0, 1 1, 2 2, 0 2, 1 1, 2 0)
plot(ls)

st_is_simple(ls)
[1] FALSE

Is the ls object valid?

st_is_valid(ls)
[1] TRUE

Read more here

{sf} also supports a number of spatial concepts:

  • Z and M coordinates (Z referes to elevation and M referes to any other measure)
  • Empty geometries
  • Text and binary encodings (WKT and WKB)
ls # wkt
LINESTRING (0 0, 1 1, 2 2, 0 2, 1 1, 2 0)
ls |> st_as_binary() # wkb
  [1] 01 02 00 00 00 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 [26] 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 00 40 00
 [51] 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00
 [76] 00 00 00 00 f0 3f 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 00 40 00 00 00
[101] 00 00 00 00 00

3.2.1 Operations on geometries

Operations can be applied to a single geometry (unary), to a pair of geometries (binary) and to a set of geometries (n-ary operation).

Results of this operations include:

  • logical (predicates)
  • a quantity (measures) area, length, distance
  • a new geometry (transformations) e.g. centroid, buffer, bounding box, intersection, union, etc.

Binary topological relations are represented by the binary predicates. They are implementations of the Dimensionally Extended Nine-Intersection Model (DE-9IM).

Examples of topological relations from Lovelace et al. (2019)

Examples of topological relations from Lovelace et al. (2019)

Read more about binary predicates here.

Understanding the DE-9IM relations allow you to establish your own relations. See how in section 4.2.4 of the Spatial Data Operations chapter of Lovelace et al. (2019).

3.3 Spherical geometries (S2)

The R2 geometries explained above are defined on the plane. When we consider geometries on the sphere, we are talking about spherical geometries.

Most of the geospatial libraries that apply spatial operations (GDAL, GEOS, etc.) do not really consider the ellipsoidal shape of the Earth. This is why the s2geometry4 library was built, to apply operations on the sphere, a better representation of the ellipsoid than the plane.

Straight lines don’t really exist. The shortest line connecting two points is a great circle segment.

We can define a full polygon which comprises the entire surface of the Earth. Source: E. Pebesma & Bivand (2023)

Bounding boxes and rectangles crossing the antimeridean and validity on the sphere. Source: E. Pebesma & Bivand (2023)
Figure 3.1: Particularities of spherical geometries

3.3.1 Switching between S2 and R2

Although S2 is a more accurate way of performing spatial operations, not every operation is implemented fully yet. To switch between using GEOS (R2) or s2 you can do:

sf_use_s2(FALSE)
Spherical geometry (s2) switched off

3.3.2 Operations on geodetic (unprojected) data

When sf_use_s2() is FALSE, operations like st_area(), st_distance() and st_length() will compute on the geoid using GEOS instead of on the sphere, which is the default. The former is a slower computation.

For unary operations that receive a distance argument, such as st_buffer() the distance argument is taken in meters when sf_use_s2() is TRUE and in degrees when FALSE, which also gives a warning.

In case x has geodetic coordinates (lon/lat) and sf_use_s2() is TRUE, a numeric dist is taken as distance in meters and a units object in dist is converted to meters. In case x has geodetic coordinates (lon/lat) and sf_use_s2() is FALSE, a numeric dist is taken as degrees, and a units object in dist is converted to arc_degree (and warnings are issued). In case x does not have geodetic coordinates (projected) then numeric dist is assumed to have the units of the coordinates, and a units dist is converted to those if st_crs(x) is not NA.

3.4 Raster, arrays and data cubes

Raster types. Source: E. Pebesma & Bivand (2023)

The packages {stars} (E. Pebesma, 2024) and {terra} (Hijmans, 2024b) are the engines to handle raster datasets in R.

Due to its performance, {terra} took a significant role in the spatial handling of raster data.

library(terra)
terra 1.8.10
tif = system.file("tif/L7_ETMs.tif", package = "stars")
(rast1 = rast(tif))
class       : SpatRaster 
dimensions  : 352, 349, 6  (nrow, ncol, nlyr)
resolution  : 28.5, 28.5  (x, y)
extent      : 288776.3, 298722.8, 9110729, 9120761  (xmin, xmax, ymin, ymax)
coord. ref. : SIRGAS 2000 / UTM zone 25S 
source      : L7_ETMs.tif 
names       : L7_ETMs_1, L7_ETMs_2, L7_ETMs_3, L7_ETMs_4, L7_ETMs_5, L7_ETMs_6 

Nevertheless, the {stars} package is not only design to handle raster data, but mainly array representations (list of matrices), which makes it powerful for data cube representations, both as vector and raster representations.

library(stars)
Loading required package: abind
(rast2 = read_stars(tif))
stars object with 3 dimensions and 1 attribute
attribute(s):
             Min. 1st Qu. Median     Mean 3rd Qu. Max.
L7_ETMs.tif     1      54     69 68.91242      86  255
dimension(s):
     from  to  offset delta                     refsys point x/y
x       1 349  288776  28.5 SIRGAS 2000 / UTM zone 25S FALSE [x]
y       1 352 9120761 -28.5 SIRGAS 2000 / UTM zone 25S FALSE [y]
band    1   6      NA    NA                         NA    NA    
(rast3 = read_stars(c(tif,tif), along = "time"))
stars object with 4 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
             Min. 1st Qu. Median    Mean 3rd Qu. Max.
L7_ETMs.tif    47      65     76 77.3419      87  255
dimension(s):
     from  to  offset delta                     refsys point x/y
x       1 349  288776  28.5 SIRGAS 2000 / UTM zone 25S FALSE [x]
y       1 352 9120761 -28.5 SIRGAS 2000 / UTM zone 25S FALSE [y]
band    1   6      NA    NA                         NA    NA    
time    1   2      NA    NA                         NA    NA    

Raster-vector operations work well with both packages, where {terra} and {stars} can both interact with sf objects.

3.5 Further reading:


  1. https://www.osgeo.org/projects/r-spatial/↩︎

  2. In theory, but has not been implemented yet↩︎

  3. Even though a C++ implementation is in many cases faster, it is harder to understand, debug or modify.↩︎

  4. The C++ s2geometry library was written by Google and is used in many of its products (e.g. Google Maps, Google Earth Engine, Bigquery GIS).↩︎