Extracts full time series from a list of NetCDF files, where each file holds one or multiple time steps, and optionally writes one .rds file per longitude index instead of returning all data at once.

map2tidy(
  nclist,
  varnam,
  lonnam = "lon",
  latnam = "lat",
  timenam = NA,
  do_chunks = FALSE,
  outdir = NA,
  fileprefix = NA,
  ncores = 1,
  fgetdate = NA,
  overwrite = FALSE,
  filter_lon_between_degrees = NA
)
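
As a rough sketch of a call, assuming the map2tidy package is installed and the files listed in nclist exist, the arguments above might be combined as follows. The file paths, the variable name "tas" and the time dimension name "time" are hypothetical placeholders, not values taken from this page.

```r
# Hypothetical file paths and variable name; replace with your own.
nclist <- c("data/tas_2021-01.nc", "data/tas_2021-02.nc")

res <- NULL
# Only attempt the call if the package is installed and the files exist.
if (requireNamespace("map2tidy", quietly = TRUE) && all(file.exists(nclist))) {
  res <- map2tidy::map2tidy(
    nclist  = nclist,
    varnam  = "tas",    # assumed variable name
    lonnam  = "lon",
    latnam  = "lat",
    timenam = "time",   # assumed name of the time dimension
    ncores  = 1
  )
}
```

With outdir left at its default, res would hold the returned tibble; it stays NULL when the guard condition is not met.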

Arguments

nclist

A vector of character strings specifying the complete paths to files.

varnam

The variable name(s) for which data is to be read from the NetCDF files.

lonnam

The dimension name of longitude in the NetCDF files.

latnam

The dimension name of latitude in the NetCDF files.

timenam

The name of the dimension variable used for time in the NetCDF files. Defaults to NA.

do_chunks

A logical specifying whether chunks of data should be written to files. Defaults to FALSE. If set to TRUE, the arguments outdir and fileprefix must be specified. Chunks are longitudinal bands, and the number of chunks corresponds to the length of the longitude dimension.

outdir

A character string specifying the output directory where data frames are written using the save statement. If omitted (the default is NA), a tidy data frame containing all data is returned.

fileprefix

A character string specifying the file name prefix.
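
Once chunks have been written, they can be read back and combined with base R. The sketch below is self-contained: it first writes two stand-in .rds files to a temporary directory and then reads them again. The file names (demo_LON_1.rds, demo_LON_2.rds) and the contents of the stand-in data frames are hypothetical; the naming scheme actually used by map2tidy is not described on this page.

```r
# Temporary directory standing in for `outdir`.
outdir <- tempfile("map2tidy_demo_")
dir.create(outdir)

# Write two stand-in chunk files (hypothetical names and contents).
lons <- c(-69.75, -69.25)
for (i in seq_along(lons)) {
  chunk <- data.frame(lon = lons[i], lat = 0.25)
  saveRDS(chunk, file.path(outdir, sprintf("demo_LON_%d.rds", i)))
}

# List all .rds files in the directory, read each one, and row-bind them.
files <- list.files(outdir, pattern = "\\.rds$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, readRDS))
```

For large grids it may be preferable to process the files one at a time rather than binding them all into memory.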

ncores

Number of cores for parallel execution (distributing extraction of longitude slices). When set to "all", the number of cores for parallelisation is determined by length(parallelly::availableWorkers()). Defaults to 1 (no parallelisation).

fgetdate

A function to derive the date used for the time dimension based on the file name.
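
A minimal sketch of such a function is shown below. It assumes file names that contain a date in the form YYYY-MM-DD (for example tas_2021-03-15.nc, a hypothetical name) and returns a Date object. Whether map2tidy expects a Date, a character string, or another type is not stated here, so the return type is an assumption.

```r
# Hypothetical helper: extracts the first "YYYY-MM-DD" pattern from a file name.
get_date_from_filename <- function(filnam) {
  fname <- basename(filnam)
  # regexpr() locates the first match; regmatches() extracts it.
  datestr <- regmatches(fname, regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", fname))
  as.Date(datestr, format = "%Y-%m-%d")
}

example_date <- get_date_from_filename("/data/tas_2021-03-15.nc")
```

Such a helper could then be passed as fgetdate = get_date_from_filename.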

overwrite

A logical indicating whether time series files are to be overwritten.

filter_lon_between_degrees

Either NA (default) or a vector of two numbers c(lower, upper) that define a range of longitude values to process, e.g. c(-70, -68).
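
To illustrate which longitudes a range such as c(-70, -68) would select, the sketch below applies the same kind of filter to a hypothetical half-degree grid in base R. It treats both bounds as inclusive, which is an assumption; the page does not say how map2tidy handles values exactly on the bounds.

```r
# Hypothetical longitude grid with 0.5 degree spacing.
lon <- seq(-179.75, 179.75, by = 0.5)

# Range in the form c(lower, upper).
filter_range <- c(-70, -68)

# Keep longitudes within the range (bounds assumed inclusive).
keep <- lon >= filter_range[1] & lon <= filter_range[2]
lon_selected <- lon[keep]
```

On this grid, four longitude values fall inside the range, so four chunks would be processed instead of 720.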

Value

Returns a tibble with columns 'lon' (double), 'lat' (double), and a nested column 'data'. Column 'data' contains the requested variables (usually as doubles) and potentially a column 'datetime' (as string). Note that the datetime is defined by the package CFtime and can contain dates such as "2021-02-30", which are valid for 360-day calendars but not for POSIXt. These dates therefore need to be parsed separately. If outdir is specified, the function returns nothing and instead writes the tibble to .rds files, one for each longitude value; otherwise it returns the tibble.
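
One way to parse such 'datetime' strings without going through POSIXt is to split them into integer components. The sketch below uses base R only and assumes the strings start with a YYYY-MM-DD part. The day-of-year calculation assumes a 360-day calendar with twelve 30-day months; the example strings are made up for illustration.

```r
# Example strings, including a date that only exists in a 360-day calendar.
datetime <- c("2021-02-28", "2021-02-30", "2021-03-01")

# Take the leading "YYYY-MM-DD" part and split it on "-".
parts <- strsplit(substr(datetime, 1, 10), "-", fixed = TRUE)

parsed <- data.frame(
  year  = as.integer(vapply(parts, `[`, character(1), 1)),
  month = as.integer(vapply(parts, `[`, character(1), 2)),
  day   = as.integer(vapply(parts, `[`, character(1), 3))
)

# Day of year, assuming twelve months of 30 days each.
parsed$doy360 <- (parsed$month - 1L) * 30L + parsed$day
```

Keeping year, month and day as separate integers avoids losing dates that a standard calendar would reject.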