Exploring data cubes for vector geometries in R
BEGIN Seminar Series
Online, University of St. Andrews, Oct. 15, 2024
Exploring DATA CUBES
for vector geometries in R
“Data cubes arise naturally when we observe properties of a set of geometries repeatedly over time.”
Chapter 6 - Data Cubes | Pebesma and Bivand (2023)
Photo from European Environment Agency
Photo by Instytut IMGW on Unsplash
Exploring data cubes for
VECTOR geometries in R
Simple feature collection with 1050 features and 5 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 12.18844 ymin: 47.62055 xmax: 13.12862 ymax: 47.8726
Geodetic CRS: WGS 84
# A tibble: 1,050 × 6
boxName time_start phenomenon unit arithmeticMean_1h
<chr> <dttm> <chr> <chr> <dbl>
1 1188BRD16 2024-10-01 00:00:00 PM10 µg/m³ 1.35
2 1188BRD16 2024-10-01 00:00:00 PM2.5 µg/m³ 0.754
3 1188BRD16 2024-10-01 00:00:00 Temperatur °C NA
4 13330108 2024-10-01 00:00:00 PM10 µg/m³ 0.162
5 13330108 2024-10-01 00:00:00 PM2.5 µg/m³ 0.160
6 13330108 2024-10-01 00:00:00 Temperatur °C 11.5
7 BayernLab Traunstein 2024-10-01 00:00:00 PM10 µg/m³ NA
8 BayernLab Traunstein 2024-10-01 00:00:00 PM2.5 µg/m³ NA
9 BayernLab Traunstein 2024-10-01 00:00:00 Temperatur °C 20.9
10 iDEAS:lab 2024-10-01 00:00:00 PM10 µg/m³ 0.787
# ℹ 1,040 more rows
# ℹ 1 more variable: geom <POINT [°]>
Data from the openSenseMap | © senseBox 2014 - 2020
{stars}
stars object with 2 dimensions and 3 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
temperature [°C] 6.65000000 12.7511667 15.514167 17.01426 21.291500 24.56083
PM2.5 [µg/m³] 0.04090909 0.6726201 1.324079 12.93712 4.611875 66.39458
PM10 [µg/m³] 0.04227273 1.1006522 2.612205 24.76510 8.095625 149.11708
NA's
temperature [°C] 101
PM2.5 [µg/m³] 100
PM10 [µg/m³] 100
dimension(s):
from to offset delta refsys point
geom 1 7 NA NA WGS 84 TRUE
time 1 50 2024-10-01 UTC 1 hours POSIXct FALSE
values
geom POINT (12.91207 47.71819),...,POINT (12.45901 47.73125)
time NULL
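A minimal sketch of how a vector data cube like the one above could be built with {stars}, assuming the measurements are already arranged as one geometry × time array per attribute. The stations, times and values below are hypothetical toy data, not the openSenseMap records.

```r
library(sf)
library(stars)

# Hypothetical toy data: 3 stations and 4 hourly time steps
stations <- st_sfc(
  st_point(c(12.9, 47.7)),
  st_point(c(12.2, 47.6)),
  st_point(c(12.6, 47.9)),
  crs = 4326
)
times <- as.POSIXct("2024-10-01 00:00:00", tz = "UTC") + 3600 * 0:3

# The cube dimensions: point geometries and time
dims <- st_dimensions(geom = stations, time = times)

# One array per attribute, shaped geometry x time
set.seed(1)
temperature <- array(runif(12, 5, 25), dim = c(3, 4))
pm25 <- array(runif(12, 0, 10), dim = c(3, 4))

cube <- st_as_stars(
  list(temperature = temperature, pm25 = pm25),
  dimensions = dims
)
cube
```

Each attribute shares the same two dimensions, so a single station or hour can be selected with `cube[, i, j]`.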
Attributes to dimensions
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
temperature.PM2.5.PM10 0.04090909 1.213 6.16875 18.24046 20.937 149.1171
NA's
temperature.PM2.5.PM10 301
dimension(s):
from to offset delta refsys point
geom 1 7 NA NA WGS 84 TRUE
time 1 50 2024-10-01 UTC 1 hours POSIXct FALSE
parameter 1 3 NA NA NA NA
values
geom POINT (12.91207 47.71819),...,POINT (12.45901 47.73125)
time NULL
parameter temperature, PM2.5 , PM10
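The attributes-to-dimensions step can be sketched with the `merge()` method for stars objects, which collapses all attributes into one array with an extra dimension; `split()` reverses it. The toy cube and the attribute names `temperature` and `pm10` are assumptions for illustration.

```r
library(sf)
library(stars)

# Hypothetical toy cube: 2 stations x 3 hourly time steps, 2 attributes
stations <- st_sfc(st_point(c(12.9, 47.7)), st_point(c(12.2, 47.6)), crs = 4326)
times <- as.POSIXct("2024-10-01 00:00:00", tz = "UTC") + 3600 * 0:2
dims <- st_dimensions(geom = stations, time = times)
cube <- st_as_stars(
  list(
    temperature = array(11:16, dim = c(2, 3)),
    pm10 = array(1:6, dim = c(2, 3))
  ),
  dimensions = dims
)

# Attributes become the values of a third dimension called "parameter"
merged <- merge(cube, name = "parameter")

# split() turns that dimension back into separate attributes
restored <- split(merged, "parameter")
```

Note that merging only makes sense when the attributes can share one array, which is why the merged cube on the slide no longer carries units.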
Aggregations
stars object with 2 dimensions and 3 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
temperature 6.92500000 12.0009994 14.807767 16.06928 21.175293 24.31104 6
PM2.5 0.06363636 0.8447386 1.359077 12.65861 2.894707 58.88761 6
PM10 0.29340909 1.4850795 3.433376 25.80386 6.734938 128.71625 6
dimension(s):
from to offset delta refsys point
time 1 3 2024-10-01 UTC 1 days POSIXct NA
geom 1 7 NA NA WGS 84 TRUE
values
time NULL
geom POINT (12.91207 47.71819),...,POINT (12.45901 47.73125)
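A hedged sketch of a temporal aggregation like the one above, using `aggregate()` on a stars object with a time interval string. The cube is hypothetical toy data with values held constant within each day so the daily means are easy to check.

```r
library(sf)
library(stars)

# Hypothetical toy cube: 3 stations x 48 hourly time steps (2 full days)
stations <- st_sfc(
  st_point(c(12.9, 47.7)),
  st_point(c(12.2, 47.6)),
  st_point(c(12.6, 47.9)),
  crs = 4326
)
times <- as.POSIXct("2024-10-01 00:00:00", tz = "UTC") + 3600 * 0:47
dims <- st_dimensions(geom = stations, time = times)

# 10 degrees during the first day, 20 degrees during the second
temperature <- array(rep(c(10, 20), each = 3 * 24), dim = c(3, 48))
cube <- st_as_stars(list(temperature = temperature), dimensions = dims)

# Hourly values to daily means; time becomes the first dimension
daily <- aggregate(cube, by = "1 day", FUN = mean, na.rm = TRUE)
daily
```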
{cubble}
# cubble: key: id [7], index: time, nested form, [sf]
# spatial: [12.19, 47.62, 13.13, 47.87], WGS 84
# temporal: time [dttm], temperature [[°C]], PM2.5 [[µg/m³]], PM10 [[µg/m³]]
id long lat geom ts
* <int> <dbl> <dbl> <POINT [°]> <list>
1 1 12.9 47.7 (12.91207 47.71819) <tibble [50 × 4]>
2 2 12.2 47.6 (12.18844 47.62055) <tibble [50 × 4]>
3 3 12.6 47.9 (12.64558 47.86673) <tibble [50 × 4]>
4 4 13.0 47.8 (13.03966 47.82361) <tibble [50 × 4]>
5 5 13.1 47.7 (13.12862 47.65276) <tibble [50 × 4]>
6 6 13.0 47.9 (12.97138 47.8726) <tibble [50 × 4]>
7 7 12.5 47.7 (12.45901 47.73125) <tibble [50 × 4]>
# cubble: key: id [7], index: time, long form
# temporal: 2024-10-01 -- 2024-10-03 01:00:00 [1h], no gaps
# spatial: long [dbl], lat [dbl], geom [POINT [°]]
id time temperature PM2.5 PM10
<int> <dttm> [°C] [µg/m³] [µg/m³]
1 1 2024-10-01 00:00:00 NA 0.754 1.35
2 1 2024-10-01 01:00:00 NA 0.707 1.60
3 1 2024-10-01 02:00:00 NA 0.885 2.05
4 1 2024-10-01 03:00:00 NA 0.841 1.70
5 1 2024-10-01 04:00:00 NA 0.878 1.57
6 1 2024-10-01 05:00:00 NA 0.8 1.61
7 1 2024-10-01 06:00:00 NA 1.79 3.87
8 1 2024-10-01 07:00:00 NA 1.15 3.41
9 1 2024-10-01 08:00:00 NA 1.09 2.44
10 1 2024-10-01 09:00:00 NA 0.865 2.06
# ℹ 340 more rows
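The nested and long forms above can be sketched with `as_cubble()` and `face_temporal()` from {cubble}. This is a minimal example on a hypothetical flat table with plain longitude and latitude columns, not the sf-based cube shown on the slides.

```r
library(cubble)
library(tibble)

# Hypothetical flat table: 3 stations x 4 hourly time steps
times <- as.POSIXct("2024-10-01 00:00:00", tz = "UTC") + 3600 * 0:3
flat <- tibble(
  id = rep(1:3, each = 4),
  long = rep(c(12.9, 12.2, 12.6), each = 4),
  lat = rep(c(47.7, 47.6, 47.9), each = 4),
  time = rep(times, times = 3),
  temperature = seq(10, 21, length.out = 12)
)

# Nested form: one row per station, time series stored in the `ts` column
nested <- as_cubble(flat, key = id, index = time, coords = c(long, lat))

# Long form: one row per station and time step
long_form <- face_temporal(nested)

# And back to the nested form
back <- face_spatial(long_form)
```

The nested form suits spatial operations on stations, while the long form suits time series operations; pivoting between them keeps both views in one object.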
Exploring data cubes for
vector geometries
DYNAMIC SPATIAL DATA in R
Simple feature collection with 25 features and 2 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -0.2974337 ymin: -0.00297557 xmax: 0.9730806 ymax: 1.153558
Geodetic CRS: WGS 84
First 10 features:
gid datetime geometry
1 a 2020-10-01 POLYGON ((0.5474949 0.80889...
2 b 2020-10-01 POLYGON ((0.2791708 0.83373...
3 c 2020-10-01 POLYGON ((0.2807462 0.62779...
4 d 2020-10-01 POLYGON ((0.7650701 0.47444...
5 e 2020-10-01 POLYGON ((0.3825692 0.35378...
6 a 2020-10-02 POLYGON ((0.4961102 0.87283...
7 b 2020-10-02 POLYGON ((0.3298312 0.76120...
8 c 2020-10-02 POLYGON ((0.328914 0.568743...
9 d 2020-10-02 POLYGON ((0.7217233 0.52617...
10 e 2020-10-02 POLYGON ((0.3101455 0.31689...
Simple feature collection with 4261 features and 26 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 407025.2 ymin: 8502331 xmax: 845635.3 ymax: 8968083
Projected CRS: WGS 84 / UTM zone 33N
# A tibble: 4,261 × 27
NAME Comment IDENT YEAR_ SOURCE ANALYST YR_UNCERT Shape_Area FLAG LENGTH
<chr> <chr> <chr> <int> <chr> <chr> <chr> <dbl> <int> <dbl>
1 Svitjod… <NA> 1581… 1966 Verti… Max Kö… <NA> 42844621. 0 11747.
2 Biskaye… <NA> 16118 1966 Verti… Max Kö… <NA> 13012090. 0 3778.
3 Balberg… <NA> 16601 1966 Verti… Max Kö… <NA> 5004077. 0 3382.
4 Paradis… <NA> 16201 1966 Verti… Max Kö… <NA> 7632368. 0 5293.
5 Kvittop… <NA> 1611… 1966 Verti… Max Kö… <NA> 2778791. 0 2613.
6 Arlabre… <NA> 16119 1966 Verti… Max Kö… <NA> 6158767. 0 5094.
7 Landbre… <NA> 16603 1966 Verti… Max Kö… <NA> 4330581. 0 4064.
8 Bybreen <NA> 16512 1966 Verti… Max Kö… <NA> 2342140. 0 2973.
9 Evabreen <NA> 16202 1966 Verti… Max Kö… <NA> 6295764. 0 4597.
10 Tindebr… <NA> 16107 1966 Verti… Max Kö… <NA> 1186398. 0 2355.
# ℹ 4,251 more rows
# ℹ 17 more variables: FWIDTH <dbl>, Shape_Peri <dbl>, SOURCE2 <chr>,
# SatelliteI <chr>, TIDEWATER <int>, NumLines <int>, GLIMSID <chr>,
# debroxglac <int>, medZ <dbl>, minZ <dbl>, maxZ <dbl>, stdZ <dbl>,
# skew <dbl>, meanSLP <dbl>, meanASP <dbl>, DEM <chr>,
# geom <MULTIPOLYGON [m]>
Data from König et al. (2014)
{post}
post_array
stars object with 2 dimensions and 25 attributes
attribute(s):
geom NAME Comment SOURCE
MULTIPOLYGON :33440 Length:33440 Length:33440 Length:33440
epsg:32633 : 0 Class :character Class :character Class :character
+proj=utm ...: 0 Mode :character Mode :character Mode :character
ANALYST YR_UNCERT Shape_Area FLAG
Length:33440 Length:33440 Min. :4.640e+04 Min. :0.000
Class :character Class :character 1st Qu.:7.367e+05 1st Qu.:0.000
Mode :character Mode :character Median :2.053e+06 Median :0.000
Mean :1.897e+07 Mean :0.008
3rd Qu.:7.608e+06 3rd Qu.:0.000
Max. :1.243e+09 Max. :2.000
NA's :29179 NA's :29179
LENGTH FWIDTH Shape_Peri SOURCE2
Min. : 0 Min. : 0.0 Min. : 936.8 Length:33440
1st Qu.: 1277 1st Qu.: 193.0 1st Qu.: 4678.2 Class :character
Median : 2276 Median : 332.8 Median : 8861.5 Mode :character
Mean : 4283 Mean : 720.5 Mean : 22962.0
3rd Qu.: 4607 3rd Qu.: 680.0 3rd Qu.: 20465.8
Max. :64859 Max. :65609.9 Max. :643020.8
NA's :29179 NA's :29179 NA's :29179
SatelliteI TIDEWATER NumLines GLIMSID
Length:33440 Min. :0.00 Min. :0.00 Length:33440
Class :character 1st Qu.:0.00 1st Qu.:1.00 Class :character
Mode :character Median :0.00 Median :1.00 Mode :character
Mean :0.12 Mean :1.27
3rd Qu.:0.00 3rd Qu.:1.00
Max. :1.00 Max. :9.00
NA's :31772 NA's :31772
debroxglac medZ minZ maxZ
Min. :0.00 Min. : 0.0 Min. : 0.00 Min. : 0.0
1st Qu.:0.00 1st Qu.: 349.0 1st Qu.: 70.75 1st Qu.: 538.8
Median :0.00 Median : 454.0 Median : 214.00 Median : 688.0
Mean :0.05 Mean : 484.6 Mean : 234.13 Mean : 722.7
3rd Qu.:0.00 3rd Qu.: 595.0 3rd Qu.: 354.25 3rd Qu.: 879.0
Max. :1.00 Max. :1215.0 Max. :1177.00 Max. :1691.0
NA's :31772 NA's :31772 NA's :31772 NA's :31772
stdZ skew meanSLP meanASP
Min. : 0.00 Min. :-2.01 Min. : 0.00 Min. : 0.00
1st Qu.: 73.19 1st Qu.:-0.40 1st Qu.: 7.92 1st Qu.: 55.44
Median :103.44 Median :-0.10 Median :12.90 Median :149.46
Mean :109.82 Mean :-0.11 Mean :12.71 Mean :172.08
3rd Qu.:137.90 3rd Qu.: 0.20 3rd Qu.:16.61 3rd Qu.:296.63
Max. :341.63 Max. : 1.91 Max. :42.83 Max. :359.74
NA's :31772 NA's :31772 NA's :31772 NA's :31772
DEM
Length:33440
Class :character
Mode :character
dimension(s):
from to refsys point
geom_sum 1 1672 WGS 84 / UTM zone 33N TRUE
YEAR_ 1 20 NA FALSE
values
geom_sum POINT (630823.1 8746977),...,POINT (827599.3 8951776)
YEAR_ [1936,1960),...,[2010,2011)
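The `geom_sum` dimension above holds one point per feature, while the changing polygons sit in the `geom` attribute. As a rough illustration only, and assuming the summary geometry is the centroid of the union of a feature's geometries over time (the {post} package may compute it differently), it could be derived with {sf} and {dplyr}. The helper `make_square()` and the toy polygons are hypothetical.

```r
library(sf)
library(dplyr)

# Hypothetical helper: an axis-aligned square polygon
make_square <- function(x, y, size = 1) {
  st_polygon(list(rbind(
    c(x, y), c(x + size, y), c(x + size, y + size), c(x, y + size), c(x, y)
  )))
}

# Two features ("a", "b") whose shapes shift between two dates
feats <- st_sf(
  gid = c("a", "b", "a", "b"),
  datetime = as.Date(c("2020-10-01", "2020-10-01", "2020-10-02", "2020-10-02")),
  geometry = st_sfc(
    make_square(0, 0), make_square(2, 2),
    make_square(0.2, 0), make_square(2, 2.2)
  )
)

# Union the geometries of each feature over time, then take the centroid
unioned <- feats |>
  group_by(gid) |>
  summarise()
geom_sum <- st_centroid(st_geometry(unioned))
geom_sum
```

A fixed summary geometry gives the cube a stable spatial dimension even though each feature's actual shape changes between time steps.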
post_table
# cubble: key: IDENT [1672], index: YEAR_, nested form, [sf]
# spatial: [409177.49, 8504927.97, 827599.3, 8951776.14], WGS 84 / UTM zone
# 33N
# temporal: Comment [chr], YEAR_ [int], SOURCE [chr], ANALYST [chr], YR_UNCERT
# [chr], Shape_Area [dbl], FLAG [int], LENGTH [dbl], FWIDTH [dbl], Shape_Peri
# [dbl], SOURCE2 [chr], SatelliteI [chr], TIDEWATER [int], NumLines [int],
# GLIMSID [chr], debroxglac [int], medZ [dbl], minZ [dbl], maxZ [dbl], stdZ
# [dbl], skew [dbl], meanSLP [dbl], meanASP [dbl], DEM [chr], geom
# [MULTIPOLYGON [m]]
IDENT NAME x y geom_sum ts
* <chr> <chr> <dbl> <dbl> <POINT [m]> <list>
1 11101 Pedasjenkobreen 630823. 8746977. (630823.1 8746977) <sf [2 × 25]>
2 11102.1 Ganskijbreen 625278. 8747791. (625278.3 8747791) <sf [2 × 25]>
3 11102.2 Ganskijbreen 623226. 8748129. (623226 8748129) <sf [2 × 25]>
4 11103 Sonklarbreen 611755. 8745898. (611755.3 8745898) <sf [2 × 25]>
5 11104 Helge Backlundbreen 607633. 8735580. (607633.4 8735580) <sf [2 × 25]>
6 11105.1 Negribreen 578058. 8734897. (578058 8734897) <sf [2 × 25]>
7 11105.2 Gardebreen 598368. 8738104. (598368.2 8738104) <sf [2 × 25]>
8 11106.1 Johansenbreen 580848. 8720961. (580847.7 8720961) <sf [3 × 25]>
9 11106.2 Petermannbreen 578589. 8715092. (578588.8 8715092) <sf [3 × 25]>
10 11107.1 <NA> 586945. 8713111. (586945.2 8713111) <sf [3 × 25]>
# ℹ 1,662 more rows
post_table
# cubble: key: IDENT [1672], index: YEAR_, long form
# temporal: 1936 -- 2010 [1Y], has gaps!
# spatial: NAME [chr], x [dbl], y [dbl], geom_sum [POINT [m]]
IDENT Comment YEAR_ SOURCE ANALYST YR_UNCERT Shape_Area FLAG LENGTH FWIDTH
* <chr> <chr> <int> <chr> <chr> <chr> <dbl> <int> <dbl> <dbl>
1 11101 <NA> 1966 Verti… Christ… Can be 1… 55988042. 0 11309. 2129.
2 11101 <NA> 2008 SPOT5… Christ… <NA> 50864568. 0 9903. 2876.
3 11102.1 <NA> 1966 Verti… Christ… Can be 1… 12162520. 0 8770. 1218.
4 11102.1 <NA> 2008 SPOT5… Christ… <NA> 10951911. 0 8373. 758.
5 11102.2 <NA> 1966 Verti… Christ… Can be 1… 9146020. 0 6927. 482.
6 11102.2 <NA> 2008 SPOT5… Christ… <NA> 8899379. 0 6236. 351.
7 11103 <NA> 1966 Verti… Christ… Can be 1… 247724101. 0 17649. 7660.
8 11103 <NA> 2008 SPOT5… Christ… <NA> 222499321. 0 13799. 7357.
9 11104 <NA> 1966 Verti… Christ… Can be 1… 23432664. 0 6340. 2576.
10 11104 <NA> 2008 SPOT5… Christ… <NA> 20067760. 0 5659. 1101.
# ℹ 4,251 more rows
# ℹ 16 more variables: Shape_Peri <dbl>, SOURCE2 <chr>, SatelliteI <chr>,
# TIDEWATER <int>, NumLines <int>, GLIMSID <chr>, debroxglac <int>,
# medZ <dbl>, minZ <dbl>, maxZ <dbl>, stdZ <dbl>, skew <dbl>, meanSLP <dbl>,
# meanASP <dbl>, DEM <chr>, geom <MULTIPOLYGON [m]>
📝 A post covering the contents of this talk, with a workflow to fetch station data from openSenseMap and arrange it in a vector data cube (VDC)
📕 Spatial Data Science with Applications in R book, especially Chapter 6 - Data Cubes
📦 Documentation on the {stars} package
📦 Documentation on the {cubble} package
📦 Documentation on the {post} package
⚠️ The package is a work in progress! Any feedback is warmly appreciated.
Cover photo by Eren Namlı on Unsplash