vignettes/collect_point_environment.Rmd
The ingestr package makes data collection for geographic locations easy and reproducible. This vignette provides an example of collecting a set of climate and soil variables that may be considered in analyses of environment-vegetation relationships.
The only information required for running point simulations with the
P-model is the geographic position of the sites and their elevation. A
site name is required as an ID for the output. The site meta information
has to be provided as a data frame containing the following columns:
sitename, lon, lat, elv.
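As a minimal sketch of the expected structure, the following builds such a data frame by hand and checks that the required columns are present. The site names, coordinates, and elevations are hypothetical placeholders, and the units noted in the comments (decimal degrees, metres above sea level) are assumptions.

```r
# Hypothetical site meta information (placeholder values, not real sites)
df_sites_example <- data.frame(
  sitename = c("site_a", "site_b"),
  lon      = c(8.5, -120.3),   # longitude, assumed decimal degrees
  lat      = c(47.4, 38.9),    # latitude, assumed decimal degrees
  elv      = c(450, 1300)      # elevation, assumed metres above sea level
)

# Check that all required columns are available before ingesting data
required_cols <- c("sitename", "lon", "lat", "elv")
missing_cols <- setdiff(required_cols, names(df_sites_example))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```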
Let’s use the FLUXNET2015 site info data frame from ingestr as an example here.
## # A tibble: 3 × 3
## sitename lon lat
## <chr> <dbl> <dbl>
## 1 AR-SLu -66.5 -33.5
## 2 AR-Vir -56.2 -28.2
## 3 AT-Neu 11.3 47.1
ingestr makes it easy to collect WorldClim data from files available
locally (in directory ~/data/worldclim). WorldClim provides
monthly climatologies of different meteorological variables. Here, we
collect them, aggregate them over the thermal growing season, i.e.,
over months where the growth temperature is above 0 deg C, and convert
them to the units and variables also used as inputs to the P-model (see
here). This includes the following (non-trivial)
calculations:
- Vapour pressure deficit (VPD), calculated with ingestr::calc_vpd() (see code below).
- Growth temperature (average daytime temperature), see ?ingestr::calc_tgrowth.

Next, we aggregate the monthly WorldClim climatology to means across
the thermal growing season, i.e., over months where the growth
temperature is above 0 deg C. Before aggregation, we also convert
WorldClim variables to the units required for rpmodel. See ?rpmodel
for information about the arguments.
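The growing-season aggregation described above can be illustrated on toy data. This is only a sketch in base R: the monthly values are made up for one hypothetical site, and only two variables are included.

```r
# Toy monthly climatology for one hypothetical site (values are made up)
df_toy <- data.frame(
  month   = 1:12,
  tgrowth = c(-5, -3, 1, 6, 11, 15, 18, 17, 12, 7, 1, -4),                 # deg C
  vpd     = c(100, 120, 200, 350, 500, 700, 850, 800, 550, 300, 150, 110)  # Pa
)

# Keep only months of the thermal growing season (tgrowth > 0 deg C)
df_gs <- df_toy[df_toy$tgrowth > 0, ]

# Average each variable across the retained months
gs_mean <- colMeans(df_gs[, c("tgrowth", "vpd")])
gs_mean
```

Here, the three coldest months at the start of the year and the last month are dropped, so the means are taken over the remaining nine months only.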
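library(dplyr)
library(tidyr)

## mean over months of the thermal growing season (tgrowth > 0 deg C)
get_growingseasonmean <- function(df){
  df |>
    filter(tgrowth > 0) |>
    ungroup() |>
    summarise(across(c(tgrowth, vpd, ppfd), mean))
}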
## flux-to-energy conversion factor (micro-mol photons per J)
kfFEC <- 2.04

df_wc <- df_wc |>
  unnest(data) |>

  ## add latitude
  left_join(df_sites, by = "sitename") |>

  ## vapour pressure kPa -> Pa
  mutate(vapr = vapr * 1e3) |>

  ## PPFD from solar radiation: kJ m-2 day-1 -> mol m-2 s-1 PAR
  mutate(ppfd = 1e3 * srad * kfFEC * 1.0e-6 / (60 * 60 * 24)) |>

  ## calculate VPD (Pa) based on tmin and tmax
  rowwise() |>
  mutate(vpd = ingestr::calc_vpd(eact = vapr, tmin = tmin, tmax = tmax)) |>

  ## calculate growth temperature (average daytime temperature)
  mutate(doy = lubridate::yday(lubridate::ymd("2001-01-15") + months(month - 1))) |>
  mutate(tgrowth = ingestr::calc_tgrowth(tmin, tmax, lat, doy)) |>

  ## average over growing season (where Tgrowth > 0 deg C)
  group_by(sitename) |>
  nest() |>
  mutate(data_growingseason = purrr::map(data, ~get_growingseasonmean(.))) |>
  unnest(data_growingseason) |>
  select(-data)

df_wc
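To make the PPFD unit conversion used in the code above easier to check, here is a minimal standalone sketch of the same arithmetic. The function name srad_to_ppfd() is hypothetical (it is not part of ingestr), and the input value is made up.

```r
# Convert solar radiation (kJ m-2 day-1) to PPFD (mol m-2 s-1),
# using the same factors as in the pipeline above.
# srad_to_ppfd() is a hypothetical helper, not an ingestr function.
srad_to_ppfd <- function(srad_kj_day, kfFEC = 2.04) {
  joules_per_day <- srad_kj_day * 1e3       # kJ -> J
  umol_per_day   <- joules_per_day * kfFEC  # J -> micro-mol photons
  mol_per_day    <- umol_per_day * 1e-6     # micro-mol -> mol
  mol_per_day / (60 * 60 * 24)              # per day -> per second
}

# Made-up example input: 15000 kJ m-2 day-1
srad_to_ppfd(15000)
```

For this input, the result is 30.6 mol m-2 day-1 divided by 86400 seconds per day, i.e., roughly 3.54e-4 mol m-2 s-1.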
settings_wise <- get_settings_wise(varnam = c("CNrt"), layer = 1:3)

df_wise <- ingest(
  df_sites,
  source   = "wise",
  settings = settings_wise,
  dir      = "~/data/soil/wise"
)

df_wise
This can be obtained from HWSD. Open questions: How should the rhwsd package be used? Not as documented in the ingestr link?
Ingesting nitrogen deposition data (source "ndep") requires start and end years to be specified. Let’s get data from 1990 to 2009 and then calculate the mean annual total.
df_ndep <- ingest(
  df_sites |>
    mutate(year_start = 1990, year_end = 2009),
  source    = "ndep",
  timescale = "y",
  dir       = "~/data/ndep_lamarque/",
  verbose   = FALSE
  ) |>
  unnest(cols = data) |>
  group_by(sitename) |>
  summarise(noy = mean(noy), nhx = mean(nhx)) |>
  mutate(ndep = noy + nhx) |>
  select(-noy, -nhx)

df_ndep
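The "mean annual total" step can be sketched on toy data without access to the deposition files. This is a base R illustration under stated assumptions: the site names and annual values are made up, and only the column names noy and nhx are taken from the code above.

```r
# Toy annual deposition for two hypothetical sites (made-up values)
df_annual <- data.frame(
  sitename = rep(c("site_a", "site_b"), each = 3),
  year     = rep(1990:1992, times = 2),
  noy      = c(0.2, 0.3, 0.4, 1.0, 1.2, 1.4),
  nhx      = c(0.1, 0.1, 0.1, 0.5, 0.6, 0.7)
)

# Mean across years, per site, for each of the two components
df_mean <- aggregate(cbind(noy, nhx) ~ sitename, data = df_annual, FUN = mean)

# Total deposition is the sum of the two mean components
df_mean$ndep <- df_mean$noy + df_mean$nhx
df_ndep_toy <- df_mean[, c("sitename", "ndep")]
df_ndep_toy
```

Averaging each component first and summing afterwards gives the same result as summing per year and then averaging, as long as no years are missing for either component.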
Let’s combine the data frames collected above into a single data
frame. Make sure that all data frames use the same unique identifier,
stored in the column named sitename. Flatten all data frames beforehand
(unnest()) and avoid duplicate columns in the joined data
frames.
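A minimal sketch of such a join, using base R and small made-up data frames, could look as follows. The data frame names and values are hypothetical stand-ins for the flat, one-row-per-site data frames collected above.

```r
# Hypothetical flat data frames, one row per site (made-up values)
df_clim <- data.frame(sitename = c("site_a", "site_b"), tgrowth = c(12.1, 18.4))
df_soil <- data.frame(sitename = c("site_a", "site_b"), CNrt = c(11.5, 14.2))
df_dep  <- data.frame(sitename = "site_a", ndep = 0.4)

dfs <- list(df_clim, df_soil, df_dep)

# The identifier must be unique within each data frame
is_unique <- vapply(dfs, function(d) anyDuplicated(d$sitename) == 0, logical(1))
stopifnot(all(is_unique))

# Successive left joins on sitename; sites missing in a later
# data frame are kept and filled with NA
df_all <- Reduce(
  function(x, y) merge(x, y, by = "sitename", all.x = TRUE),
  dfs
)
df_all
```

Because the only shared column is sitename, no duplicated columns (with suffixes such as .x and .y) appear in the result. With dplyr and purrr, the same pattern can be written as purrr::reduce(dfs, dplyr::left_join, by = "sitename").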