The rsofun
package and framework includes two main
models. The pmodel
and biomee
(which in part
relies on P-model components). Here we give a short example on how to
run the pmodel
on the included demo datasets to familiarize
yourself with both the data structure and the outputs.
The package includes two demo datasets to run and validate pmodel output using GPP observations. These files can be directly loaded into your workspace by typing:
library(rsofun)
# this is to deal with an error p_model_drivers.rds not being found
p_model_drivers
#> # A tibble: 1 × 4
#> sitename params_siml site_info forcing
#> <chr> <list> <list> <list>
#> 1 FR-Pue <tibble [1 × 11]> <tibble [1 × 4]> <tibble [2,190 × 13]>
p_model_validation
#> # A tibble: 1 × 2
#> sitename data
#> <chr> <list>
#> 1 FR-Pue <tibble [2,190 × 3]>
These are real data from the French FR-Pue fluxnet site. Information
about data structure, variable names, and their meaning and units can be
found in the reference pages of p_model_drivers
and
p_model_validation
. We can use these data to run the model,
together with observations of GPP we can also calibrate
pmodel
parameters.
Another two datasets are provided as an example to validate the model
against leaf traits data, rather than fluxes. Measurements of Vcmax25
(aggregated over species) for a subset of 4 sites from the GlobResp
database (Atkin et al., 2015) are given in
p_model_validation_vcmax25
and the corresponding forcing
for the P-model is given in p_model_drivers_vcmax25
. Since
leaf traits are only measured once per site, the forcing used is a
single year of average climate (the average measurements between 2001
and 2015 on each day of the year).
p_model_drivers_vcmax25
#> # A tibble: 4 × 4
#> sitename params_siml site_info forcing
#> <chr> <list> <list> <list>
#> 1 Reichetal_Colorado <tibble [1 × 11]> <tibble [1 × 6]> <tibble [365 × 13]>
#> 2 Reichetal_New_Mexico <tibble [1 × 11]> <tibble [1 × 6]> <tibble [365 × 13]>
#> 3 Reichetal_Venezuela <tibble [1 × 11]> <tibble [1 × 6]> <tibble [365 × 13]>
#> 4 Reichetal_Wisconsin <tibble [1 × 11]> <tibble [1 × 6]> <tibble [365 × 13]>
p_model_validation_vcmax25
#> # A tibble: 4 × 2
#> # Groups: sitename [4]
#> sitename data
#> <chr> <list>
#> 1 Reichetal_Colorado <tibble [1 × 2]>
#> 2 Reichetal_New_Mexico <tibble [1 × 2]>
#> 3 Reichetal_Venezuela <tibble [1 × 2]>
#> 4 Reichetal_Wisconsin <tibble [1 × 2]>
For the remainder of this vignette, we will use the GPP flux datasets. The workflow is exactly the same for leaf traits data.
To get your raw data into the structure used within
rsofun
, please see R packages ingestr and FluxDataKit.
With all data prepared we can run the P-model using
runread_pmodel_f()
. This function takes the nested data
structure and runs the model site by site, returning nested model output
results matching the input drivers.
# define model parameter values from previous
# work
params_modl <- list(
kphio = 0.04998, # setup ORG in Stocker et al. 2020 GMD
kphio_par_a = 0.0, # set to zero to disable temperature-dependence of kphio
kphio_par_b = 1.0,
soilm_thetastar = 0.6 * 240, # to recover old setup with soil moisture stress
soilm_betao = 0.0,
beta_unitcostratio = 146.0,
rd_to_vcmax = 0.014, # value from Atkin et al. 2015 for C3 herbaceous
tau_acclim = 30.0,
kc_jmax = 0.41
)
# run the model for these parameters
output <- rsofun::runread_pmodel_f(
p_model_drivers,
par = params_modl
)
We can now visualize both the model output and the measured values together.
# Load libraries for plotting
library(dplyr)
library(tidyr)
library(ggplot2)
# Create data.frame for plotting
df_gpp_plot <- rbind(
output |>
filter(sitename == "FR-Pue") |>
unnest(data) |>
select(date, gpp) |>
mutate(type = "P-model output"),
p_model_validation |>
filter(sitename == "FR-Pue") |>
unnest(data) |>
select(date, gpp) |>
mutate(type = "Observed")
)
df_gpp_plot$type <- factor(df_gpp_plot$type,
levels = c('P-model output',
'Observed'))
# Plot GPP
ggplot(data = df_gpp_plot) +
geom_line(
aes(x = date,
y = gpp,
color = type),
alpha = 0.7
) +
scale_color_manual(values = c(
'P-model output'='grey70',
'Observed'='black')) +
theme_classic() +
theme(panel.grid.major.y = element_line()) +
labs(
x = 'Date',
y = expression(paste("GPP (g C m"^-2, "s"^-1, ")")),
colour = ""
)
To optimize new parameters based upon driver data and a validation dataset we must first specify an optimization strategy and settings, as well as a cost function and parameter ranges.
settings <- list(
method = "GenSA",
metric = cost_rmse_pmodel,
control = list(
maxit = 100),
par = list(
kphio = list(lower=0.02, upper=0.2, init = 0.05)
)
)
rsofun
supports both optimization using the
GenSA
and BayesianTools
packages. The above
statement provides settings for a GenSA
optimization
approach. For this example the maximum number of iterations is kept
artificially low. In a real scenario you will have to increase this
value orders of magnitude. Keep in mind that optimization routines rely
on a cost function, which, depending on its structure influences
parameter selection. A limited set of cost functions is provided but the
model structure is transparent and custom cost functions can be easily
written. More details can be found in the “Parameter calibration and
cost functions” vignette.
In addition starting values and ranges are provided for the free
parameters in the model. Free parameters include: parameters for the
quantum yield efficiency kphio
, kphio_par_a
and kphio_par_b
, soil moisture stress parameters
soilm_thetastar
and soilm_betao
, and also
beta_unitcostratio
, rd_to_vcmax
,
tau_acclim
and kc_jmax
(see
?runread_pmodel_f
). Be mindful that with newer versions of
rsofun
additional parameters might be introduced, so
re-check vignettes and function documentation when updating existing
code.
With all settings defined the optimization function
calib_sofun()
can be called with driver data and
observations specified. Extra arguments for the cost function (like what
variable should be used as target to compute the root mean squared error
(RMSE) and previous values for the parameters that aren’t calibrated,
which are needed to run the P-model).
# calibrate the model and optimize free parameters
pars <- calib_sofun(
drivers = p_model_drivers,
obs = p_model_validation,
settings = settings,
# extra arguments passed to the cost function:
targets = "gpp", # define target variable GPP
par_fixed = params_modl[-1] # fix non-calibrated parameters to previous
# values, removing kphio
)
When successful the optimized parameters can be used to run subsequent modelling efforts, in this case slightly improving the model fit over a more global parameter set.
# Update the parameter list with calibrated value
params_modl$kphio <- pars$par["kphio"]
# Run the model for these parameters
output_new <- rsofun::runread_pmodel_f(
p_model_drivers,
par = params_modl
)
# Update data.frame for plotting
df_gpp_plot <- rbind(
df_gpp_plot,
output_new |>
filter(sitename == "FR-Pue") |>
unnest(data) |>
select(date, gpp) |>
mutate(type = "P-model output (calibrated)")
)
df_gpp_plot$type <- factor(df_gpp_plot$type,
levels = c('P-model output',
'P-model output (calibrated)',
'Observed'))
# Plot GPP
ggplot(data = df_gpp_plot) +
geom_line(
aes(x = date,
y = gpp,
color = type),
alpha = 0.7
) +
scale_color_manual(values = c(
'P-model output'='grey70',
'P-model output (calibrated)'='grey40',
'Observed'='black')) +
theme_classic() +
theme(panel.grid.major.y = element_line()) +
labs(
x = 'Date',
y = expression(paste("GPP (g C m"^-2, "s"^-1, ")")),
colour = ""
)
For details on the optimization settings we refer to the manuals of GenSA and BayesianTools.