Using the GRTS data sources

General note: the below vignette contains frozen output of 24 November 2023. This makes it possible to build the package with vignettes without access to the data sources.

Intro

With this vignette, you get acquainted with three functions that usually return a SpatRaster object from the {terra} package:

  • read_GRTSmh()
  • read_GRTSmh_base4frac()
  • read_GRTSmh_diffres()

The data source GRTSmaster_habitats, provided and documented in Zenodo, is a monolayered GeoTIFF file covering the whole of Flanders and the Brussels Capital Region at a resolution of 32 m. Its values are unique decimal integer ranking numbers from the GRTS algorithm applied to the Flemish and Brussels area. Beware that not all GRTS ranking numbers are present in the data source, as the original GRTS raster has been clipped with the Flemish outer borders (i.e., not excluding the Brussels Capital Region).

The GRTS algorithm uses a quadrant-recursive, hierarchically randomized function that maps the unit square to the unit interval, resulting in a base-4 GRTS address for each location. The ranking numbers in GRTSmaster_habitats are base-10 numbers and follow the reverse hierarchical order: each consecutive subset of ranking numbers corresponds to a spatially balanced sample of locations. Hence, it allows dynamical sample sizes. More information on the GRTS algorithm can be found in Stevens and Olsen (2003, 2004) and in the GRTS and spsurvey packages.

Data sources

The following data sources are available:

  • the raw data source GRTSmaster_habitats, discussed above and available at Zenodo
  • processed data sources:
    • GRTSmh_brick (Zenodo-link): 10-layered GeoTIFF with the decimal integer ranking numbers of 10 hierarchical levels (0 - 9) of the GRTS cell addresses, including the one from GRTSmaster_habitats (i.e. level 0; for more details see the read_GRTSmh() documentation)
    • GRTSmh_diffres (Zenodo-link): file collection composed of nine monolayered GeoTIFF files plus a GeoPackage with six polygon layers. They provide the hierarchical levels 1 to 9 of the GRTSmh_brick data source at the corresponding spatial resolution, i.e. at lower resolutions than GRTSmaster_habitats (for more details see the read_GRTSmh_diffres() documentation)
    • GRTSmh_base4frac (Zenodo-link): is like a mirror to GRTSmaster_habitats, holding the ranking numbers as base 4 fractions. These are numbers like 0.3213210231312, representing the reverse-ordered base-4 GRTS address behind the decimal mark: the digit for level 0 is 2, for level 1 it is 1, …, for level 13 it is 3). Hence, it is a direct representation of the hierarchical GRTS addresses, allowing the derivation of other datasets. More details are in the read_GRTSmh_base4frac() documentation.

For more information on data storage and locations, see vignette("v020_datastorage").

Get the data in R

read_GRTSmh()

In the below R code, it is supposed that a n2khab_data folder is present in the current directory or up to 10 levels higher. See the vignette("v020_datastorage") for more information.

read_GRTSmh() by default returns the GRTSmaster_habitats dataset:

read_GRTSmh()
#> class       : SpatRaster
#> dimensions  : 2843, 7401, 1  (nrow, ncol, nlyr)
#> resolution  : 32, 32  (x, y)
#> extent      : 22029.59, 258861.6, 153054.1, 244030.1  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmaster_habitats.tif
#> name        : GRTSmaster_habitats
#> min value   :                   1
#> max value   :            67108857

With the argument brick = TRUE however, you will get the GRTSmh_brick data source, i.e. GRTSmaster_habitats plus 9 extra layers:

r10 <- read_GRTSmh(brick = TRUE)
r10
#> class       : SpatRaster
#> dimensions  : 2843, 7401, 10  (nrow, ncol, nlyr)
#> resolution  : 32, 32  (x, y)
#> extent      : 22029.59, 258861.6, 153054.1, 244030.1  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmh_brick.tif
#> names       :   level0,   level1,  level2,  level3, level4, level5, ...
#> min values  :        1,        1,       1,       1,      1,      1, ...
#> max values  : 67108857, 16777209, 4194297, 1048569, 262137,  65529, ...
terra::minmax(r10)
#>       level0   level1  level2  level3 level4 level5 level6 level7 level8 level9
#> min        1        1       1       1      1      1      1      1      1      1
#> max 67108857 16777209 4194297 1048569 262137  65529  16377   4089   1017    253

The layers with higher-level ranking numbers allow spatially balanced samples at lower spatial resolution than that of 32 m, and can also be used for aggregation purposes. The provided hierarchical levels correspond to the resolution vector 32 * 2^(0:9) (minimum: 32 meters, maximum: 16384 meters), with the corresponding layers named as level0 to level9.

read_GRTSmh_diffres()

read_GRTSmh_diffres() by default returns one raster layer from the GRTSmh_diffres data source, i.e. with the GRTS ranking numbers of the user-specified hierarchical level. This is done at the corresponding spatial resolution of the GRTS algorithm, which is the fundamental distinction from read_GRTSmh(brick = TRUE).

The resolutions of each level are the following (in meters):

level resolution
1 64
2 128
3 256
4 512
5 1024
6 2048
7 4096
8 8192
9 16384

An example with level 5:

read_GRTSmh_diffres(level = 5)
#> class       : SpatRaster
#> dimensions  : 89, 232, 1  (nrow, ncol, nlyr)
#> resolution  : 1024, 1024  (x, y)
#> extent      : 22030, 259598, 153054, 244190  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmh_diffres.5.tif
#> name        : level5
#> min value   :      1
#> max value   :  65529

Alternatively, a dissolved, polygonized variant of the corresponding GRTSmh_brick level can be returned as an sf object. In order not to inflate the data source, this was only made available for levels 4 to 9.

read_GRTSmh_diffres(level = 5, polygon = TRUE)
#> Simple feature collection with 13791 features and 1 field
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22029.59 ymin: 153054.1 xmax: 258861.6 ymax: 244030.1
#> Projected CRS: BD72 / Belgian Lambert 72
#> # A tibble: 13,791 × 2
#>    value                                                                    geom
#>  * <int>                                                           <POLYGON [m]>
#>  1 23390 ((178093.6 244030.1, 178093.6 243998.1, 178061.6 243998.1, 177997.6 24…
#>  2 56158 ((178701.6 243646.1, 178701.6 243166.1, 179725.6 243166.1, 179725.6 24…
#>  3 23134 ((177581.6 243870.1, 177581.6 243838.1, 177549.6 243838.1, 177485.6 24…
#>  4 60254 ((179757.6 243454.1, 179757.6 243422.1, 179725.6 243422.1, 179725.6 24…
#>  5  6750 ((176621.6 243390.1, 176621.6 243358.1, 176589.6 243358.1, 176589.6 24…
#>  6 60254 ((180333.6 243230.1, 180333.6 243198.1, 180365.6 243198.1, 180365.6 24…
#>  7 58718 ((176653.6 243166.1, 176653.6 242142.1, 177677.6 242142.1, 177677.6 24…
#>  8 52318 ((177677.6 243166.1, 177677.6 242142.1, 178701.6 242142.1, 178701.6 24…
#>  9  3166 ((178701.6 243166.1, 178701.6 242142.1, 179725.6 242142.1, 179725.6 24…
#> 10 15454 ((179725.6 243166.1, 179725.6 242142.1, 180749.6 242142.1, 180749.6 24…
#> # ℹ 13,781 more rows

read_GRTSmh_base4frac()

Its use is just to return the base-4-fraction-converted GRTSmaster_habitats as a SpatRaster object:

options(scipen = 999, digits = 15)
read_GRTSmh_base4frac()
#> class       : SpatRaster
#> dimensions  : 2843, 7401, 1  (nrow, ncol, nlyr)
#> resolution  : 32, 32  (x, y)
#> extent      : 22029.591973471, 258861.591973471, 153054.113583292, 244030.113583292  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmh_base4frac.tif
#> name        : GRTSmh_base4frac
#> min value   :  0.0000000000001
#> max value   :  0.3333333333321

Note that the used options are necessary when treating these base-4-fraction GRTS addresses as characters; otherwise scientific notations will be used.

Also, be warned that R does not actually regard the values as base 4, but as base 10. 1

References

Stevens, Don L., and Anthony R. Olsen. 2003. “Variance Estimation for Spatially Balanced Samples of Environmental Resources.” Environmetrics 14 (6): 593–610. https://doi.org/10.1002/env.606.
———. 2004. “Spatially Balanced Sampling of Natural Resources.” Journal of the American Statistical Association 99 (465): 262–78. https://doi.org/10.1198/016214504000000250.

  1. So, what really matters is only the notation with many digits, to be regarded as a base 4 fraction (and hence, handling it as a character in conversions is often necessary). The n2khab package also exports a convert_dec_to_base4frac() and a convert_base4frac_to_dec() function in its namespace. These functions will be relevant if you need to do such conversions yourself, and they are used in the code to generate the processed data sources.↩︎