Exploring data cubes for vector geometries in R
BEGIN Seminar Series
 Online, University of St. Andrews, Oct. 15, 2024
Exploring DATA CUBES
for vector geometries in R
“Data cubes arise naturally when we observe properties of a set of geometries repeatedly over time.”
Chapter 6 - Data Cubes | Pebesma and Bivand (2023)
Photo from European Environment Agency
Photo by Instytut IMGW on Unsplash
Exploring data cubes for
VECTOR geometries in R
Simple feature collection with 1050 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12.18844 ymin: 47.62055 xmax: 13.12862 ymax: 47.8726
Geodetic CRS:  WGS 84
# A tibble: 1,050 × 6
   boxName              time_start          phenomenon unit  arithmeticMean_1h
   <chr>                <dttm>              <chr>      <chr>             <dbl>
 1 1188BRD16            2024-10-01 00:00:00 PM10       µg/m³             1.35 
 2 1188BRD16            2024-10-01 00:00:00 PM2.5      µg/m³             0.754
 3 1188BRD16            2024-10-01 00:00:00 Temperatur °C               NA    
 4 13330108             2024-10-01 00:00:00 PM10       µg/m³             0.162
 5 13330108             2024-10-01 00:00:00 PM2.5      µg/m³             0.160
 6 13330108             2024-10-01 00:00:00 Temperatur °C               11.5  
 7 BayernLab Traunstein 2024-10-01 00:00:00 PM10       µg/m³            NA    
 8 BayernLab Traunstein 2024-10-01 00:00:00 PM2.5      µg/m³            NA    
 9 BayernLab Traunstein 2024-10-01 00:00:00 Temperatur °C               20.9  
10 iDEAS:lab            2024-10-01 00:00:00 PM10       µg/m³             0.787
# ℹ 1,040 more rows
# ℹ 1 more variable: geom <POINT [°]>
Data from the openSenseMap | © senseBox 2014 - 2020
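A long table of station observations like the one above can be arranged as a vector data cube with one geometry dimension and one time dimension. The following is a minimal sketch with {stars}, assuming two made-up stations and three hourly time steps; the coordinates and temperature values are illustrative and are not taken from the openSenseMap data.

```r
library(sf)
library(stars)

# Two hypothetical station locations (illustrative coordinates, WGS 84)
stations <- st_sfc(
  st_point(c(12.91, 47.72)),
  st_point(c(12.19, 47.62)),
  crs = 4326
)

# Three hourly time steps
times <- as.POSIXct("2024-10-01 00:00:00", tz = "UTC") + 3600 * 0:2

# Made-up measurements arranged as a geometry x time array
temperature <- array(
  c(11.5, 20.9, 11.2, 20.1, 10.8, 19.7),
  dim = c(geom = 2, time = 3)
)

# Combine the dimensions and the array into a vector data cube
dims <- st_dimensions(geom = stations, time = times)
cube <- st_as_stars(list(temperature = temperature), dimensions = dims)
cube
```

Each cell of the array belongs to exactly one station and one time step, which is why missing observations appear as NA values in the cube rather than as missing rows.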
{stars}
stars object with 2 dimensions and 3 attributes
attribute(s):
                       Min.    1st Qu.    Median     Mean   3rd Qu.      Max.
temperature [°C] 6.65000000 12.7511667 15.514167 17.01426 21.291500  24.56083
PM2.5 [µg/m³]    0.04090909  0.6726201  1.324079 12.93712  4.611875  66.39458
PM10 [µg/m³]     0.04227273  1.1006522  2.612205 24.76510  8.095625 149.11708
                 NA's
temperature [°C]  101
PM2.5 [µg/m³]     100
PM10 [µg/m³]      100
dimension(s):
     from to         offset   delta  refsys point
geom    1  7             NA      NA  WGS 84  TRUE
time    1 50 2024-10-01 UTC 1 hours POSIXct FALSE
                                                      values
geom POINT (12.91207 47.71819),...,POINT (12.45901 47.73125)
time                                                    NULL
{stars}
Attributes to dimensions
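As a minimal sketch of this step, assuming a small cube with made-up values, the attributes of a {stars} object can be collapsed into an extra dimension with `merge()`, and `split()` reverses the operation. The dimension name "parameter" matches the output on this slide; the stations and measurements are hypothetical.

```r
library(sf)
library(stars)

# Hypothetical stations and hourly time steps (illustrative values only)
stations <- st_sfc(st_point(c(12.91, 47.72)), st_point(c(12.19, 47.62)), crs = 4326)
times <- as.POSIXct("2024-10-01 00:00:00", tz = "UTC") + 3600 * 0:2
dims <- st_dimensions(geom = stations, time = times)

# A cube with two attributes over the same geometry x time dimensions
cube <- st_as_stars(
  list(
    temperature = array(c(11.5, 20.9, 11.2, 20.1, 10.8, 19.7), dim = c(geom = 2, time = 3)),
    PM10 = array(c(1.4, 0.2, 1.6, 0.3, 2.1, 0.2), dim = c(geom = 2, time = 3))
  ),
  dimensions = dims
)

# Collapse the attributes into a third dimension called "parameter"
merged <- merge(cube, name = "parameter")
dim(merged)

# split() restores one attribute per value of the "parameter" dimension
restored <- split(merged, "parameter")
names(restored)
```

After merging, the single remaining attribute mixes quantities with different units, so summary statistics over it (as in the output that follows) are not physically meaningful on their own.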
stars object with 3 dimensions and 1 attribute
attribute(s):
                              Min. 1st Qu.  Median     Mean 3rd Qu.     Max.
temperature.PM2.5.PM10  0.04090909   1.213 6.16875 18.24046  20.937 149.1171
                        NA's
temperature.PM2.5.PM10   301
dimension(s):
          from to         offset   delta  refsys point
geom         1  7             NA      NA  WGS 84  TRUE
time         1 50 2024-10-01 UTC 1 hours POSIXct FALSE
parameter    1  3             NA      NA      NA    NA
                                                           values
geom      POINT (12.91207 47.71819),...,POINT (12.45901 47.73125)
time                                                         NULL
parameter                   temperature, PM2.5      , PM10
{stars}
Aggregations
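A temporal aggregation such as the daily summary shown on this slide can be sketched with `aggregate()` from {stars}. This is an illustration under assumptions, not the exact call used in the talk: the stations and values are made up, and the daily mean is chosen as the summary function.

```r
library(sf)
library(stars)

# Hypothetical stations (illustrative coordinates, WGS 84)
stations <- st_sfc(st_point(c(12.91, 47.72)), st_point(c(12.19, 47.62)), crs = 4326)

# 48 hourly time steps covering two full days
times <- as.POSIXct("2024-10-01 00:00:00", tz = "UTC") + 3600 * 0:47

# Made-up values: constant per station and day, so the daily means are easy to check
values <- rbind(
  rep(c(10, 20), each = 24),  # station 1
  rep(c(0, 4), each = 24)     # station 2
)
temperature <- array(values, dim = c(geom = 2, time = 48))

cube <- st_as_stars(
  list(temperature = temperature),
  dimensions = st_dimensions(geom = stations, time = times)
)

# Aggregate the hourly series to daily means per station
daily <- aggregate(cube, by = "1 day", FUN = mean, na.rm = TRUE)
dim(daily)
```

In the aggregated cube the time dimension comes first and the geometry dimension second, as in the output that follows.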
stars object with 2 dimensions and 3 attributes
attribute(s):
                   Min.    1st Qu.    Median     Mean   3rd Qu.      Max. NA's
temperature  6.92500000 12.0009994 14.807767 16.06928 21.175293  24.31104    6
PM2.5        0.06363636  0.8447386  1.359077 12.65861  2.894707  58.88761    6
PM10         0.29340909  1.4850795  3.433376 25.80386  6.734938 128.71625    6
dimension(s):
     from to         offset  delta  refsys point
time    1  3 2024-10-01 UTC 1 days POSIXct    NA
geom    1  7             NA     NA  WGS 84  TRUE
                                                      values
time                                                    NULL
geom POINT (12.91207 47.71819),...,POINT (12.45901 47.73125)
{cubble}
# cubble:   key: id [7], index: time, nested form, [sf]
# spatial:  [12.19, 47.62, 13.13, 47.87], WGS 84
# temporal: time [dttm], temperature [[°C]], PM2.5 [[µg/m³]], PM10 [[µg/m³]]
     id  long   lat                geom ts               
* <int> <dbl> <dbl>         <POINT [°]> <list>           
1     1  12.9  47.7 (12.91207 47.71819) <tibble [50 × 4]>
2     2  12.2  47.6 (12.18844 47.62055) <tibble [50 × 4]>
3     3  12.6  47.9 (12.64558 47.86673) <tibble [50 × 4]>
4     4  13.0  47.8 (13.03966 47.82361) <tibble [50 × 4]>
5     5  13.1  47.7 (13.12862 47.65276) <tibble [50 × 4]>
6     6  13.0  47.9  (12.97138 47.8726) <tibble [50 × 4]>
7     7  12.5  47.7 (12.45901 47.73125) <tibble [50 × 4]>
{cubble}
# cubble:   key: id [7], index: time, long form
# temporal: 2024-10-01 -- 2024-10-03 01:00:00 [1h], no gaps
# spatial:  long [dbl], lat [dbl], geom [POINT [°]]
      id time                temperature   PM2.5    PM10
   <int> <dttm>                     [°C] [µg/m³] [µg/m³]
 1     1 2024-10-01 00:00:00          NA   0.754    1.35
 2     1 2024-10-01 01:00:00          NA   0.707    1.60
 3     1 2024-10-01 02:00:00          NA   0.885    2.05
 4     1 2024-10-01 03:00:00          NA   0.841    1.70
 5     1 2024-10-01 04:00:00          NA   0.878    1.57
 6     1 2024-10-01 05:00:00          NA   0.8      1.61
 7     1 2024-10-01 06:00:00          NA   1.79     3.87
 8     1 2024-10-01 07:00:00          NA   1.15     3.41
 9     1 2024-10-01 08:00:00          NA   1.09     2.44
10     1 2024-10-01 09:00:00          NA   0.865    2.06
# ℹ 340 more rows
Exploring data cubes for
vector geometries
DYNAMIC SPATIAL DATA in R
Simple feature collection with 25 features and 2 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -0.2974337 ymin: -0.00297557 xmax: 0.9730806 ymax: 1.153558
Geodetic CRS:  WGS 84
First 10 features:
   gid   datetime                       geometry
1    a 2020-10-01 POLYGON ((0.5474949 0.80889...
2    b 2020-10-01 POLYGON ((0.2791708 0.83373...
3    c 2020-10-01 POLYGON ((0.2807462 0.62779...
4    d 2020-10-01 POLYGON ((0.7650701 0.47444...
5    e 2020-10-01 POLYGON ((0.3825692 0.35378...
6    a 2020-10-02 POLYGON ((0.4961102 0.87283...
7    b 2020-10-02 POLYGON ((0.3298312 0.76120...
8    c 2020-10-02 POLYGON ((0.328914 0.568743...
9    d 2020-10-02 POLYGON ((0.7217233 0.52617...
10   e 2020-10-02 POLYGON ((0.3101455 0.31689...
Simple feature collection with 4261 features and 26 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 407025.2 ymin: 8502331 xmax: 845635.3 ymax: 8968083
Projected CRS: WGS 84 / UTM zone 33N
# A tibble: 4,261 × 27
   NAME     Comment IDENT YEAR_ SOURCE ANALYST YR_UNCERT Shape_Area  FLAG LENGTH
   <chr>    <chr>   <chr> <int> <chr>  <chr>   <chr>          <dbl> <int>  <dbl>
 1 Svitjod… <NA>    1581…  1966 Verti… Max Kö… <NA>       42844621.     0 11747.
 2 Biskaye… <NA>    16118  1966 Verti… Max Kö… <NA>       13012090.     0  3778.
 3 Balberg… <NA>    16601  1966 Verti… Max Kö… <NA>        5004077.     0  3382.
 4 Paradis… <NA>    16201  1966 Verti… Max Kö… <NA>        7632368.     0  5293.
 5 Kvittop… <NA>    1611…  1966 Verti… Max Kö… <NA>        2778791.     0  2613.
 6 Arlabre… <NA>    16119  1966 Verti… Max Kö… <NA>        6158767.     0  5094.
 7 Landbre… <NA>    16603  1966 Verti… Max Kö… <NA>        4330581.     0  4064.
 8 Bybreen  <NA>    16512  1966 Verti… Max Kö… <NA>        2342140.     0  2973.
 9 Evabreen <NA>    16202  1966 Verti… Max Kö… <NA>        6295764.     0  4597.
10 Tindebr… <NA>    16107  1966 Verti… Max Kö… <NA>        1186398.     0  2355.
# ℹ 4,251 more rows
# ℹ 17 more variables: FWIDTH <dbl>, Shape_Peri <dbl>, SOURCE2 <chr>,
#   SatelliteI <chr>, TIDEWATER <int>, NumLines <int>, GLIMSID <chr>,
#   debroxglac <int>, medZ <dbl>, minZ <dbl>, maxZ <dbl>, stdZ <dbl>,
#   skew <dbl>, meanSLP <dbl>, meanASP <dbl>, DEM <chr>,
#   geom <MULTIPOLYGON [m]>
Data from König et al. (2014)
{post}
post_array OBJECTS
stars object with 2 dimensions and 25 attributes
attribute(s):
           geom           NAME              Comment            SOURCE          
 MULTIPOLYGON :33440   Length:33440       Length:33440       Length:33440      
 epsg:32633   :    0   Class :character   Class :character   Class :character  
 +proj=utm ...:    0   Mode  :character   Mode  :character   Mode  :character  
                                                                               
                                                                               
                                                                               
                                                                               
   ANALYST           YR_UNCERT          Shape_Area             FLAG       
 Length:33440       Length:33440       Min.   :4.640e+04   Min.   :0.000  
 Class :character   Class :character   1st Qu.:7.367e+05   1st Qu.:0.000  
 Mode  :character   Mode  :character   Median :2.053e+06   Median :0.000  
                                       Mean   :1.897e+07   Mean   :0.008  
                                       3rd Qu.:7.608e+06   3rd Qu.:0.000  
                                       Max.   :1.243e+09   Max.   :2.000  
                                       NA's   :29179       NA's   :29179  
    LENGTH          FWIDTH          Shape_Peri          SOURCE2         
 Min.   :    0   Min.   :    0.0   Min.   :   936.8   Length:33440      
 1st Qu.: 1277   1st Qu.:  193.0   1st Qu.:  4678.2   Class :character  
 Median : 2276   Median :  332.8   Median :  8861.5   Mode  :character  
 Mean   : 4283   Mean   :  720.5   Mean   : 22962.0                     
 3rd Qu.: 4607   3rd Qu.:  680.0   3rd Qu.: 20465.8                     
 Max.   :64859   Max.   :65609.9   Max.   :643020.8                     
 NA's   :29179   NA's   :29179     NA's   :29179                        
 SatelliteI           TIDEWATER       NumLines        GLIMSID         
 Length:33440       Min.   :0.00    Min.   :0.00    Length:33440      
 Class :character   1st Qu.:0.00    1st Qu.:1.00    Class :character  
 Mode  :character   Median :0.00    Median :1.00    Mode  :character  
                    Mean   :0.12    Mean   :1.27                      
                    3rd Qu.:0.00    3rd Qu.:1.00                      
                    Max.   :1.00    Max.   :9.00                      
                    NA's   :31772   NA's   :31772                     
  debroxglac         medZ             minZ              maxZ        
 Min.   :0.00    Min.   :   0.0   Min.   :   0.00   Min.   :   0.0  
 1st Qu.:0.00    1st Qu.: 349.0   1st Qu.:  70.75   1st Qu.: 538.8  
 Median :0.00    Median : 454.0   Median : 214.00   Median : 688.0  
 Mean   :0.05    Mean   : 484.6   Mean   : 234.13   Mean   : 722.7  
 3rd Qu.:0.00    3rd Qu.: 595.0   3rd Qu.: 354.25   3rd Qu.: 879.0  
 Max.   :1.00    Max.   :1215.0   Max.   :1177.00   Max.   :1691.0  
 NA's   :31772   NA's   :31772    NA's   :31772     NA's   :31772   
     stdZ             skew           meanSLP         meanASP      
 Min.   :  0.00   Min.   :-2.01   Min.   : 0.00   Min.   :  0.00  
 1st Qu.: 73.19   1st Qu.:-0.40   1st Qu.: 7.92   1st Qu.: 55.44  
 Median :103.44   Median :-0.10   Median :12.90   Median :149.46  
 Mean   :109.82   Mean   :-0.11   Mean   :12.71   Mean   :172.08  
 3rd Qu.:137.90   3rd Qu.: 0.20   3rd Qu.:16.61   3rd Qu.:296.63  
 Max.   :341.63   Max.   : 1.91   Max.   :42.83   Max.   :359.74  
 NA's   :31772    NA's   :31772   NA's   :31772   NA's   :31772   
     DEM           
 Length:33440      
 Class :character  
 Mode  :character  
                   
                   
                   
                   
dimension(s):
         from   to                refsys point
geom_sum    1 1672 WGS 84 / UTM zone 33N  TRUE
YEAR_       1   20                    NA FALSE
                                                        values
geom_sum POINT (630823.1 8746977),...,POINT (827599.3 8951776)
YEAR_                              [1936,1960),...,[2010,2011)
post_table OBJECTS
# cubble:   key: IDENT [1672], index: YEAR_, nested form, [sf]
# spatial:  [409177.49, 8504927.97, 827599.3, 8951776.14], WGS 84 / UTM zone
#   33N
# temporal: Comment [chr], YEAR_ [int], SOURCE [chr], ANALYST [chr], YR_UNCERT
#   [chr], Shape_Area [dbl], FLAG [int], LENGTH [dbl], FWIDTH [dbl], Shape_Peri
#   [dbl], SOURCE2 [chr], SatelliteI [chr], TIDEWATER [int], NumLines [int],
#   GLIMSID [chr], debroxglac [int], medZ [dbl], minZ [dbl], maxZ [dbl], stdZ
#   [dbl], skew [dbl], meanSLP [dbl], meanASP [dbl], DEM [chr], geom
#   [MULTIPOLYGON [m]]
   IDENT   NAME                      x        y           geom_sum ts           
 * <chr>   <chr>                 <dbl>    <dbl>        <POINT [m]> <list>       
 1 11101   Pedasjenkobreen     630823. 8746977. (630823.1 8746977) <sf [2 × 25]>
 2 11102.1 Ganskijbreen        625278. 8747791. (625278.3 8747791) <sf [2 × 25]>
 3 11102.2 Ganskijbreen        623226. 8748129.   (623226 8748129) <sf [2 × 25]>
 4 11103   Sonklarbreen        611755. 8745898. (611755.3 8745898) <sf [2 × 25]>
 5 11104   Helge Backlundbreen 607633. 8735580. (607633.4 8735580) <sf [2 × 25]>
 6 11105.1 Negribreen          578058. 8734897.   (578058 8734897) <sf [2 × 25]>
 7 11105.2 Gardebreen          598368. 8738104. (598368.2 8738104) <sf [2 × 25]>
 8 11106.1 Johansenbreen       580848. 8720961. (580847.7 8720961) <sf [3 × 25]>
 9 11106.2 Petermannbreen      578589. 8715092. (578588.8 8715092) <sf [3 × 25]>
10 11107.1 <NA>                586945. 8713111. (586945.2 8713111) <sf [3 × 25]>
# ℹ 1,662 more rows
post_table OBJECTS
# cubble:   key: IDENT [1672], index: YEAR_, long form
# temporal: 1936 -- 2010 [1Y], has gaps!
# spatial:  NAME [chr], x [dbl], y [dbl], geom_sum [POINT [m]]
   IDENT   Comment YEAR_ SOURCE ANALYST YR_UNCERT Shape_Area  FLAG LENGTH FWIDTH
 * <chr>   <chr>   <int> <chr>  <chr>   <chr>          <dbl> <int>  <dbl>  <dbl>
 1 11101   <NA>     1966 Verti… Christ… Can be 1…  55988042.     0 11309.  2129.
 2 11101   <NA>     2008 SPOT5… Christ… <NA>       50864568.     0  9903.  2876.
 3 11102.1 <NA>     1966 Verti… Christ… Can be 1…  12162520.     0  8770.  1218.
 4 11102.1 <NA>     2008 SPOT5… Christ… <NA>       10951911.     0  8373.   758.
 5 11102.2 <NA>     1966 Verti… Christ… Can be 1…   9146020.     0  6927.   482.
 6 11102.2 <NA>     2008 SPOT5… Christ… <NA>        8899379.     0  6236.   351.
 7 11103   <NA>     1966 Verti… Christ… Can be 1… 247724101.     0 17649.  7660.
 8 11103   <NA>     2008 SPOT5… Christ… <NA>      222499321.     0 13799.  7357.
 9 11104   <NA>     1966 Verti… Christ… Can be 1…  23432664.     0  6340.  2576.
10 11104   <NA>     2008 SPOT5… Christ… <NA>       20067760.     0  5659.  1101.
# ℹ 4,251 more rows
# ℹ 16 more variables: Shape_Peri <dbl>, SOURCE2 <chr>, SatelliteI <chr>,
#   TIDEWATER <int>, NumLines <int>, GLIMSID <chr>, debroxglac <int>,
#   medZ <dbl>, minZ <dbl>, maxZ <dbl>, stdZ <dbl>, skew <dbl>, meanSLP <dbl>,
#   meanASP <dbl>, DEM <chr>, geom <MULTIPOLYGON [m]>
📝 A post covering the contents of this talk, with a workflow to fetch station data from openSenseMap and arrange it in a vector data cube (VDC)
📕 Spatial Data Science with Applications in R book, especially Chapter 6 - Data Cubes
📦 Documentation on the {stars} package
📦 Documentation on the {cubble} package
📦 Documentation on the {post} package
⚠️ The {post} package is a work in progress! Any feedback is warmly appreciated.
Cover photo by Eren Namlı on Unsplash