CRU TS provides monthly climate fields at 0.5 degree resolution from 1901 to today. Frequent updates are made available. CRU TS 4.05 contains data up to 2018. ingestr converts CRU variables to rsofun standard variables and units that can then be used as forcing for rsofun. The following describes what precursor variables are used for each rsofun standard variable (and how).

rsofun standard variable (Precursor) CRU variable name(s) Remark
tmin tmn
tmax tmx
prec prc, wtd Weather generator conserving monthly sums and number of monthly wet days
vpd vap, tmin, tmax Using calc_vpd()
ccov cld
ppfd cld and lat, elv Using calc_daily_solar() for theoretical maximum solar radiation, reduced by the cloud cover fraction
patm Using calc_patm() reduced by elevation (and default pressure of 101325 Pa at 0 masl)
## get monthly data (no temporal downscaling - original extracted values)
mdf <- ingest_bysite(
  sitename  = "CH-Lae",
  source    = "cru",
  getvars   = c("tmax", "tmin", "prec", "vpd", "ppfd", "patm"),
  dir       = "/data/archive/cru_harris_2024/data/",
  timescale = "m",
  year_start = 1901,
  year_end  = 2018,
  lon       = 8.365,
  lat       = 47.4781,
  elv       = 689,
  verbose   = FALSE
  )

## get daily data (with temporal downscaling)
ddf <- ingest_bysite(
  sitename  = "CH-Lae",
  source    = "cru",
  getvars   = c("tmax", "tmin", "prec", "vpd", "ppfd", "patm"),
  dir       = "/data/archive/cru_harris_2024/data/",
  timescale = "d",
  year_start = 1901,
  year_end  = 2018,
  lon       = 8.365,
  lat       = 47.4781,
  elv       = 689,
  verbose   = FALSE
  )

Check temporal downscaling

The temporal downscaling conserves monthly means. The following shows monthly tmin values aggregated from downscaled daily values versus values directly extracted from the original files.

mdf_test <- ddf %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(tmin = mean(tmin)) %>% 
  rename(tmin_agg = tmin) %>% 
  ungroup() %>% 
  left_join(mdf %>% 
              select(year, month, tmin_orig = tmin))

gg <- mdf_test %>% analyse_modobs2("tmin_orig", "tmin_agg")
gg$gg + labs(x = "Original monthly tmin (deg C)", y = "Aggregated monthly tmin (deg C)")

For precipitation, the temporal downscaling conserves monthly totals and distributes precipitation to the given number of wet days (also provided by CRU as the number of wet days per month).

mdf_test <- ddf %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(prec = mean(prec)) %>% 
  rename(prec_agg = prec) %>% 
  ungroup() %>% 
  left_join(mdf %>% 
              select(year, month, prec_orig = prec))

gg <- mdf_test %>% analyse_modobs2("prec_orig", "prec_agg")
gg$gg + labs(x = "Original monthly prec (mm)", y = "Aggregated monthly prec (mm)")

Monthly means are further conserved for cloud cover CCOV and consequently for the photosynthetic phothon flux density PPFD. This is because PPFD is a linear function of cloud cover (CRU TS provided, either as daily or monthl cloud cover). Further input factors are time-invariant, such as the elevation ‘elv’ and latitude. The values are derived with the function calc_daily_solar(). Below equations show the linear relationship of PPFD with ‘ccov’: $$ sf = 1 - ccov/100 \\ \tau_o = (kc + kd*sf) \\ \tau = \tau_o*(1 + (2.67 \cdot 10^{-5})*elv) \\ ppfd_{daily} <- (1\cdot 10^{-6})*kfFEC*(1 - kalb_{vis}) \cdot \tau \cdot ra_d $$

mdf_test_ccov <- ddf %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(ccov = mean(ccov)) %>% 
  rename(ccov_agg = ccov) %>% 
  ungroup() %>% 
  left_join(mdf %>% select(year, month, ccov_orig = ccov))
mdf_test_ppfd <- ddf %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(ppfd = mean(ppfd)) %>% 
  rename(ppfd_agg = ppfd) %>% 
  ungroup() %>% 
  left_join(mdf %>% select(year, month, ppfd_orig = ppfd))

gg_ccov <- mdf_test_ccov %>% analyse_modobs2("ccov_orig", "ccov_agg")
gg_ccov$gg + labs(x = "Original monthly CCOV (Percent)", y = "Aggregated monthly CCOV (Percent)")
gg_ppfd <- mdf_test_ppfd %>% analyse_modobs2("ppfd_orig", "ppfd_agg")
gg_ppfd$gg + labs(x = "Original monthly PPFD (mol/m2/s)", y = "Aggregated monthly PPFD (mol/m2/s)")

Monthly means are not conserved for VPD. This is because CRU TS provides vapour pressure (VAP) data and VPD is calculated by ingestr as VPD=(f(VAP,TMIN)+f(VAP,TMAX))/2 VPD = (f(VAP, TMIN) + f(VAP, TMAX))/2

Where ff is a non-linear function (calc_vpd()) and VAP, TMIN, and TMAX are either monthly mean values in case of timescale = "m" or daily values (conserved monthly means) in case of timescale = "d",

mdf_test <- ddf %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(vpd = mean(vpd)) %>% 
  rename(vpd_agg = vpd) %>% 
  ungroup() %>% 
  left_join(mdf %>% 
              select(year, month, vpd_orig = vpd))

gg <- mdf_test %>% analyse_modobs2("vpd_orig", "vpd_agg")
gg$gg + labs(x = "Original monthly VPD (Pa)", y = "Aggregated monthly VPD (Pa)")

Bias correction with WorldClim

Bias correction based on high-resolution WorldClim 1970-2000 monthly climatology is available for variables temp, prec, and vpd.

## get monthly data (no temporal downscaling - original extracted values)
mdf_corr <- ingest_bysite(
  sitename  = "CH-Lae",
  source    = "cru",
  getvars   = c("temp", "tmin", "tmax", "prec", "vpd", "ccov", "ppfd"),
  dir       = "/data/archive/cru_harris_2024/data/",
  timescale = "m",
  year_start = 1901,
  year_end  = 2018,
  lon       = 8.365,
  lat       = 47.4781,
  elv       = 689,
  verbose   = FALSE,
  settings  = list(correct_bias = "worldclim", dir_bias = "/data/archive/worldclim_fick_2017/data")
  )

## get daily data (with temporal downscaling)
ddf_corr <- ingest_bysite(
  sitename  = "CH-Lae",
  source    = "cru",
  getvars   = c("temp", "tmin", "tmax", "prec", "vpd", "ccov", "ppfd"),
  dir       = "/data/archive/cru_harris_2024/data/",
  timescale = "d",
  year_start = 1901,
  year_end  = 2018,
  lon       = 8.365,
  lat       = 47.4781,
  elv       = 689,
  verbose   = FALSE,
  settings  = list(correct_bias = "worldclim", dir_bias = "/data/archive/worldclim_fick_2017/data")
  )

Check conservation of precipitation means after bias correction.

mdf_test <- ddf_corr %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(prec = mean(prec)) %>% 
  rename(prec_agg = prec) %>% 
  ungroup() %>% 
  left_join(mdf_corr %>% 
              select(year, month, prec_orig = prec))

gg <- mdf_test %>% analyse_modobs2("prec_orig", "prec_agg")
gg$gg + labs(x = "Original monthly prec (deg C)", y = "Aggregated monthly prec (deg C)")

Check conservation of VPD means after bias correction.

mdf_test <- ddf_corr %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(vpd = mean(vpd)) %>% 
  rename(vpd_agg = vpd) %>% 
  ungroup() %>% 
  left_join(mdf_corr %>% 
              select(year, month, vpd_orig = vpd))

gg <- mdf_test %>% analyse_modobs2("vpd_orig", "vpd_agg")
gg$gg + labs(x = "Original monthly vpd (Pa)", y = "Aggregated monthly vpd (Pa)")

Check conservation of PPFD means after bias correction.

mdf_test <- ddf_corr %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(ppfd = mean(ppfd)) %>% 
  rename(ppfd_agg = ppfd) %>% 
  ungroup() %>% 
  left_join(mdf_corr %>% 
              select(year, month, ppfd_orig = ppfd))

gg <- mdf_test %>% analyse_modobs2("ppfd_orig", "ppfd_agg")
gg$gg + labs(x = "Original monthly ppfd (mol/m2/s)", y = "Aggregated monthly ppfd (mol/m2/s)")

Check against station data

Comparison of bias-corrected data to FLUXNET site-level observations. For CH-Lae, this is available for 2004-2014. Visualize for three years (2012-2014). Get FLUXNET data.

ddf_fluxnet <- ingestr::ingest(
  siteinfo  = siteinfo_fluxnet2015 %>% dplyr::filter(sitename == "CH-Lae"),
  source    = "fluxnet",
  getvars   = list(temp = "TA_F_DAY", prec = "P_F", vpd  = "VPD_F_DAY", ppfd = "SW_IN_F", patm = "PA_F"),
  dir       = "/data/scratch/bstocker/FLUXNET-2015_Tier1/20191024/DD/",
  settings  = list(dir_hh = "/data/scratch/bstocker/FLUXNET-2015_Tier1/20191024/HH/", getswc = FALSE),
  timescale = "d"
  ) %>% 
  tidyr::unnest(data)

Looks fine for temperature.

ggplot() +
  geom_line(data = ddf_fluxnet %>% 
              dplyr::filter(lubridate::year(date) %in% 2012:2014), 
            aes(date, temp)) +
  geom_line(data = ddf_corr %>% 
              dplyr::filter(lubridate::year(date) %in% 2012:2014), 
            aes(date, temp),
            color = "red")

out <- ddf_fluxnet %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(temp_fluxnet = mean(temp)) %>% 
  left_join(ddf_corr %>% 
              mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
              group_by(year, month) %>% 
              summarise(temp_cru_wc = mean(temp)),
            by = c("year", "month")) %>% 
  analyse_modobs2("temp_fluxnet", "temp_cru_wc")
out$gg

Looks fine for precipitation. Compare monthly means - not bad at all!

out <- ddf_fluxnet %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(prec_fluxnet = mean(prec)) %>% 
  left_join(ddf_corr %>% 
              mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
              group_by(year, month) %>% 
              summarise(prec_cru_wc = mean(prec)),
            by = c("year", "month")) %>% 
  analyse_modobs2("prec_fluxnet", "prec_cru_wc")
out$gg

Looks fine for VPD

ggplot() +
  geom_line(data = ddf_fluxnet %>% 
              dplyr::filter(lubridate::year(date) %in% 2012:2014), 
            aes(date, vpd)) +
  geom_line(data = ddf_corr %>% 
              dplyr::filter(lubridate::year(date) %in% 2012:2014), 
            aes(date, vpd),
            color = "red")

out <- ddf_fluxnet %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(vpd_fluxnet = mean(vpd)) %>% 
  left_join(ddf_corr %>% 
              mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
              group_by(year, month) %>% 
              summarise(vpd_cru_wc = mean(vpd)),
            by = c("year", "month")) %>% 
  analyse_modobs2("vpd_fluxnet", "vpd_cru_wc")
out$gg

Looks fine for PPFD (albeit the monthly bias-correction introduces discontinuous steps between months.)

ggplot() +
  geom_line(data = ddf_fluxnet %>% 
              dplyr::filter(lubridate::year(date) %in% 2012:2014), 
            aes(date, ppfd, color = "fluxnet")) +
  geom_line(data = ddf %>% 
              dplyr::filter(lubridate::year(date) %in% 2012:2014), 
            aes(date, ppfd, color = "CRU downscaled"), 
            linewidth = 1) +
  geom_line(data = ddf_corr %>% 
              dplyr::filter(lubridate::year(date) %in% 2012:2014), 
            aes(date, ppfd, color = "CRU downscaled +\nWorldClim bias-corrected"),
            linewidth = 1) +
  scale_color_manual("", values = c("fluxnet" = "black",
                                    "CRU downscaled" = "skyblue",
                                    "CRU downscaled +\nWorldClim bias-corrected" = "red")) + 
  theme_bw() + theme(legend.position.inside = c(0.02,0.98), legend.justification = c(0,1),
                     legend.position = "inside") +
  scale_x_date("", date_breaks = "6 month", date_minor_breaks = "1 month") +
  labs(y = "ppfd (mol / m2 / s)")

out <- ddf_fluxnet %>% 
  mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
  group_by(year, month) %>% 
  summarise(vpd_fluxnet = mean(vpd)) %>% 
  left_join(ddf_corr %>% 
              mutate(year = lubridate::year(date), month = lubridate::month(date)) %>% 
              group_by(year, month) %>% 
              summarise(vpd_cru_wc = mean(vpd)),
            by = c("year", "month")) %>% 
  analyse_modobs2("vpd_fluxnet", "vpd_cru_wc")
out$gg