---
title: "Using the GRTS data sources"
author: "Floris Vanderhaeghe"
date: "2023-11-24"
bibliography: references.bib
output:
  rmarkdown::html_vignette:
    pandoc_args:
      - --csl
      - research-institute-for-nature-and-forest.csl
vignette: >
  %\VignetteIndexEntry{030. Using the GRTS data sources}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
options(stringsAsFactors = FALSE)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
library(n2khab)
library(magrittr)
library(knitr)
```

_General note: the vignette below contains frozen output of 24 November 2023._
_This makes it possible to build the package with vignettes without access to the data sources._

## Intro

This vignette introduces three functions that usually return a SpatRaster object from the {terra} package:

- `read_GRTSmh()`
- `read_GRTSmh_base4frac()`
- `read_GRTSmh_diffres()`

The data source `GRTSmaster_habitats`, provided and documented in [Zenodo](https://doi.org/10.5281/zenodo.2611233), is a monolayered GeoTIFF file covering the whole of Flanders and the Brussels Capital Region at a resolution of 32 m.
Its values are unique decimal integer ranking numbers from the GRTS algorithm applied to the Flemish and Brussels area.
Beware that not all GRTS ranking numbers are present in the data source, as the original GRTS raster has been clipped with the Flemish outer borders (i.e., not excluding the Brussels Capital Region).

The GRTS algorithm uses a quadrant-recursive, hierarchically randomized function that maps the unit square to the unit interval, resulting in a base-4 GRTS address for each location.
The ranking numbers in `GRTSmaster_habitats` are base-10 numbers and follow the reverse hierarchical order: **each consecutive subset** of ranking numbers corresponds to a **spatially balanced** sample of locations.
Hence, it allows **dynamic** sample sizes.
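
To make the reverse hierarchical order more concrete, here is a minimal sketch in base R. It is not the code that generated the data source. It assumes toy GRTS addresses of only three base-4 digits, written from the coarsest to the finest quadrant level, and the helper name `address_to_ranking()` is hypothetical.

```{r}
# Hypothetical helper (not part of n2khab): convert base-4 GRTS addresses,
# given as character strings, into base-10 ranking numbers by reversing
# the digit order first.
address_to_ranking <- function(address) {
  reversed <- vapply(
    strsplit(address, "", fixed = TRUE),
    function(digits) paste(rev(digits), collapse = ""),
    character(1)
  )
  # strtoi() reads the reversed digit string as a base-4 integer
  strtoi(reversed, base = 4L)
}

# Toy 3-digit addresses, coarsest quadrant first (an assumption for illustration)
addresses <- c("000", "100", "200", "300", "010")
rankings <- address_to_ranking(addresses)
rankings
```

In this toy setting the first four ranking numbers come from four different top-level quadrants, which illustrates why a run of consecutive ranking numbers is spread out in space.
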
More information on the GRTS algorithm can be found in @stevens_variance_2003 [-@stevens_spatially_2004] and in the [GRTS](https://github.com/ThierryO/grts) and [spsurvey](https://CRAN.R-project.org/package=spsurvey) packages.

## Data sources

The following data sources are available:

- the raw data source `GRTSmaster_habitats`, discussed above and available at [Zenodo](https://doi.org/10.5281/zenodo.2611233)
- processed data sources:
    - `GRTSmh_brick` ([Zenodo-link](https://doi.org/10.5281/zenodo.3354403)): 10-layered GeoTIFF with the decimal integer ranking numbers of 10 hierarchical levels (0 - 9) of the GRTS cell addresses, including the one from `GRTSmaster_habitats` (i.e. level 0; for more details see the `read_GRTSmh()` documentation)
    - `GRTSmh_diffres` ([Zenodo-link](https://doi.org/10.5281/zenodo.3354405)): file collection composed of nine monolayered GeoTIFF files plus a GeoPackage with six polygon layers.
    They provide the hierarchical levels 1 to 9 of the `GRTSmh_brick` data source at the corresponding spatial resolution, i.e. at lower resolutions than `GRTSmaster_habitats` (for more details see the `read_GRTSmh_diffres()` documentation)
    - `GRTSmh_base4frac` ([Zenodo-link](https://doi.org/10.5281/zenodo.3354401)): a mirror of `GRTSmaster_habitats`, holding the ranking numbers as base-4 fractions.
    These are numbers like `0.3213210231312`, representing the reverse-ordered base-4 GRTS address behind the decimal mark (the digit for level 0 is 2, for level 1 it is 1, ..., for level 13 it is 3).
    Hence, it is a direct representation of the hierarchical GRTS addresses, allowing the derivation of other datasets.
    More details are in the `read_GRTSmh_base4frac()` documentation.

For more information on data storage and locations, see `vignette("v020_datastorage")`.

## Get the data in R

### read_GRTSmh()

In the R code below, it is assumed that a `n2khab_data` folder is present in the current directory or up to 10 levels higher.
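
The lookup rule just described can be pictured with the following base R sketch. It is only an illustration under stated assumptions: `find_folder_up()` is a hypothetical helper, not a function of the package, and the package's own lookup may differ in its details.

```{r}
# Hypothetical helper (not part of n2khab): look for a folder called `name`
# in `start` and in up to `max_levels` parent directories.
find_folder_up <- function(name, start = getwd(), max_levels = 10) {
  path <- normalizePath(start, winslash = "/", mustWork = TRUE)
  for (i in 0:max_levels) {
    candidate <- file.path(path, name)
    if (dir.exists(candidate)) {
      return(candidate)
    }
    parent <- dirname(path)
    if (identical(parent, path)) {
      break # reached the filesystem root
    }
    path <- parent
  }
  NULL # nothing found within the allowed number of levels
}

find_folder_up("n2khab_data")
```

The sketch returns the path of the first matching folder, or `NULL` when none is found within the allowed number of levels.
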
See `vignette("v020_datastorage")` for more information.

`read_GRTSmh()` by default returns the `GRTSmaster_habitats` dataset:

```{r}
read_GRTSmh()
#> class       : SpatRaster
#> dimensions  : 2843, 7401, 1  (nrow, ncol, nlyr)
#> resolution  : 32, 32  (x, y)
#> extent      : 22029.59, 258861.6, 153054.1, 244030.1  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmaster_habitats.tif
#> name        : GRTSmaster_habitats
#> min value   :                   1
#> max value   :            67108857
```

With the argument `brick = TRUE` however, you will get the `GRTSmh_brick` data source, i.e. `GRTSmaster_habitats` plus 9 extra layers:

```{r}
r10 <- read_GRTSmh(brick = TRUE)
r10
#> class       : SpatRaster
#> dimensions  : 2843, 7401, 10  (nrow, ncol, nlyr)
#> resolution  : 32, 32  (x, y)
#> extent      : 22029.59, 258861.6, 153054.1, 244030.1  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmh_brick.tif
#> names       :   level0,   level1,  level2,  level3, level4, level5, ...
#> min values  :        1,        1,       1,       1,      1,      1, ...
#> max values  : 67108857, 16777209, 4194297, 1048569, 262137,  65529, ...
terra::minmax(r10)
#>       level0   level1  level2  level3 level4 level5 level6 level7 level8 level9
#> min        1        1       1       1      1      1      1      1      1      1
#> max 67108857 16777209 4194297 1048569 262137  65529  16377   4089   1017    253
```

The layers with higher-level ranking numbers allow spatially balanced samples at lower spatial resolution than that of 32 m, and can also be used for aggregation purposes.

The provided hierarchical levels correspond to the resolution vector `32 * 2^(0:9)` (minimum: 32 meters, maximum: 16384 meters), with the corresponding layers named as `level0` to `level9`.

### read_GRTSmh_diffres()

`read_GRTSmh_diffres()` by default returns one raster layer from the `GRTSmh_diffres` data source, i.e. with the GRTS ranking numbers of the user-specified hierarchical level.
This is done _at the corresponding spatial resolution_ of the GRTS algorithm, which is the fundamental distinction from `read_GRTSmh(brick = TRUE)`.

The resolutions of each level are the following (in meters):

```{r warning = FALSE, echo = FALSE, eval = TRUE}
data.frame(
  level = 1:9,
  resolution = 32 * 2^(1:9)
) %>%
  kable(align = "r")
```

An example with level 5:

```{r}
read_GRTSmh_diffres(level = 5)
#> class       : SpatRaster
#> dimensions  : 89, 232, 1  (nrow, ncol, nlyr)
#> resolution  : 1024, 1024  (x, y)
#> extent      : 22030, 259598, 153054, 244190  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmh_diffres.5.tif
#> name        : level5
#> min value   :      1
#> max value   :  65529
```

Alternatively, a dissolved, polygonized variant of the corresponding `GRTSmh_brick` level can be returned as an `sf` object.
In order not to inflate the data source, this was only made available for levels 4 to 9.

```{r}
read_GRTSmh_diffres(level = 5, polygon = TRUE)
#> Simple feature collection with 13791 features and 1 field
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 22029.59 ymin: 153054.1 xmax: 258861.6 ymax: 244030.1
#> Projected CRS: BD72 / Belgian Lambert 72
#> # A tibble: 13,791 × 2
#>    value                                                                    geom
#>  *
#>  1 23390 ((178093.6 244030.1, 178093.6 243998.1, 178061.6 243998.1, 177997.6 24…
#>  2 56158 ((178701.6 243646.1, 178701.6 243166.1, 179725.6 243166.1, 179725.6 24…
#>  3 23134 ((177581.6 243870.1, 177581.6 243838.1, 177549.6 243838.1, 177485.6 24…
#>  4 60254 ((179757.6 243454.1, 179757.6 243422.1, 179725.6 243422.1, 179725.6 24…
#>  5  6750 ((176621.6 243390.1, 176621.6 243358.1, 176589.6 243358.1, 176589.6 24…
#>  6 60254 ((180333.6 243230.1, 180333.6 243198.1, 180365.6 243198.1, 180365.6 24…
#>  7 58718 ((176653.6 243166.1, 176653.6 242142.1, 177677.6 242142.1, 177677.6 24…
#>  8 52318 ((177677.6 243166.1, 177677.6 242142.1, 178701.6 242142.1, 178701.6 24…
#>  9  3166 ((178701.6 243166.1, 178701.6 242142.1, 179725.6 242142.1, 179725.6 24…
#> 10 15454 ((179725.6 243166.1, 179725.6 242142.1, 180749.6 242142.1, 180749.6 24…
#> # ℹ 13,781 more rows
```

### read_GRTSmh_base4frac()

This function simply returns the base-4-fraction-converted `GRTSmaster_habitats` as a SpatRaster object:

```{r}
options(scipen = 999, digits = 15)
read_GRTSmh_base4frac()
#> class       : SpatRaster
#> dimensions  : 2843, 7401, 1  (nrow, ncol, nlyr)
#> resolution  : 32, 32  (x, y)
#> extent      : 22029.591973471, 258861.591973471, 153054.113583292, 244030.113583292  (xmin, xmax, ymin, ymax)
#> coord. ref. : BD72 / Belgian Lambert 72 (EPSG:31370)
#> source      : GRTSmh_base4frac.tif
#> name        : GRTSmh_base4frac
#> min value   :  0.0000000000001
#> max value   :  0.3333333333321
```

Note that the options set above are necessary when treating these base-4-fraction GRTS addresses as characters; otherwise scientific notation will be used.
Also, be warned that R does not actually regard the values as base 4, but as base 10.[^base4]

[^base4]: So, what really matters is only the notation with many digits, to be _regarded_ as a base-4 fraction (and hence, handling it as a character in conversions is often necessary).

The `n2khab` package also exports a `convert_dec_to_base4frac()` and a `convert_base4frac_to_dec()` function in its namespace.
These functions will be relevant if you need to do such conversions yourself, and they are used in the code to generate the processed data sources.

## References