2 Course plan

For fall semester 2025.

Block 1: Introduction to Applied Geodata Science methods

This first block serves to recap essential machine learning methods that will be applied in the tutorials of subsequent blocks, and to recap methods for implementing a reproducible and reusable data science workflow – required for the submission of the self-guided Report Exercises.

Session 1 - 16.09.

Note: Lecturers are not present. Self-study the following chapters in the Applied Geodata Science book:

Primers: Introduces basics of the R programming language and provides instructions for setting up the working environment with RStudio, including the installation of libraries used in this book.
Reusable workflows: Motivates the adoption of reusable, project-oriented workflows and provides a guide for implementing them.
Code Management: Introduces the essentials of code version control using Git.

Session 2 - 23.09.

Welcome; lecturers are present (as in all subsequent sessions). Self-study the following chapters in the Applied Geodata Science book:

Introduction to Applied Geodata Science
Applications of machine learning in Geography and Earth system sciences (lecture)

Session 3 - 30.09.

Self-study the following chapters in the Applied Geodata Science book (lecturers present):

Session 4 - 07.10.

Self-study the following chapters in the Applied Geodata Science book (lecturers present):

Block 2: Phenology modelling

Lead: Fabian Bernhard

Link to tutorial

In both Blocks 2 and 4, I will focus on a set of examples that are rather limited in spatial scope, using just a handful of pixels. In contrast to what the cloud computing revolution promised, many ideas start out from site-level or localized studies. The focus on these few chapters from the Handful of Pixels book is deliberate. Through examples in vegetation seasonality (phenology) and in land-use and land-cover mapping (Block 4), I will show you that you can experiment with geospatial data on limited compute infrastructure, and that true science can be done with relatively modest means. You will learn to do big science with small data.

Session 5 - 14.10.

Introduction to phenology modelling
Self-study of tutorial and exercises

Session 6 - 21.10.

Self-study of tutorial and exercises

Block 3: Digital soil mapping

Lead: Benjamin Stocker

Link to tutorial

Creating soil maps by hand can be tedious and time-consuming, yet such maps are urgently needed to inform decision-making aimed at preventing soil degradation and ensuring the continuation of important soil-related ecosystem services. This block introduces a workflow for digital soil mapping using Random Forest models to predict soil properties across a study area near Bern. The tutorial demonstrates how advances in data availability (continuous maps of climate, terrain, etc.) can be used to predict soil properties (e.g., pH or waterlogging) that are laborious to measure, are available only from a limited set of soil cores, and for which mapping has commonly relied on spatial upscaling “by hand”. The self-guided continuation of the tutorial includes variable selection, model formulation, training, evaluation, and interpretation.

Session 7 - 28.10.

Introduction to digital soil mapping
Self-study of tutorial and exercises

Session 8 - 04.11.

Self-study of tutorial and exercises

Block 4: Land cover classification

Lead: Fabian Bernhard

Link to tutorial

In this block, I introduce land-use and land-cover (LULC) mapping. The block shows how to train a machine learning model for LULC classification and scale its predictions to a larger region. The exercises include an amicable competitive component: you are asked to build a better model and submit your results to a leaderboard.

Session 9 - 11.11.

Introduction to land cover classification
Self-study of tutorial and exercises

Session 10 - 18.11.

Self-study of tutorial and exercises

Block 5: Spatial upscaling

Lead: Benjamin Stocker

Link to tutorial

Creating maps with large, often global, spatial coverage from a limited set of local measurements has become popular. Digital soil mapping led the way by introducing the paradigm that (i) maps can be created with a model that fits relationships between a locally measured variable of interest and a set of covariates, often environmental variables; and that (ii) global maps of these covariates are available, so the fitted model can be used to predict at locations (conditions) for which no local measurements exist. But how reliable are such predictions? What determines the reliability of predictions at unobserved locations? And how can this reliability, the prediction error, be estimated?

With digital soil mapping introduced in Block 3, we here probe its fundamental modelling paradigm: spatial upscaling. We learn how to test a (machine learning) model with a view to what it is used for, namely the prediction task.
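To make the question of how to estimate prediction error at unobserved locations more concrete, the following is a minimal sketch of one common idea: holding out whole spatial blocks instead of randomly chosen points. It is not taken from the tutorial or the course literature, it is written in Python rather than the R used in this course, and all names (`block_id`, `spatial_block_splits`, `random_splits`, `nearest_train_distance`), the synthetic coordinates, and the block size are assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict


def block_id(x, y, block_size):
    """Map a coordinate to the grid cell (spatial block) that contains it."""
    return (math.floor(x / block_size), math.floor(y / block_size))


def spatial_block_splits(coords, block_size):
    """Yield (train_idx, test_idx) pairs, holding out one spatial block at a time."""
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        blocks[block_id(x, y, block_size)].append(i)
    for held_out in sorted(blocks):
        test_idx = blocks[held_out]
        train_idx = [i for b in sorted(blocks) if b != held_out for i in blocks[b]]
        yield train_idx, test_idx


def random_splits(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for ordinary random k-fold splitting."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    for f in range(k):
        test_idx = idx[f::k]
        test_set = set(test_idx)
        train_idx = [i for i in idx if i not in test_set]
        yield train_idx, test_idx


def nearest_train_distance(coords, train_idx, test_idx):
    """Mean distance from each test point to its nearest training point."""
    total = 0.0
    for t in test_idx:
        total += min(math.dist(coords[t], coords[j]) for j in train_idx)
    return total / len(test_idx)


def mean_over_folds(coords, splits):
    """Average the nearest-training-point distance across all folds."""
    d = [nearest_train_distance(coords, tr, te) for tr, te in splits]
    return sum(d) / len(d)


# Synthetic, clustered "measurement sites" (an assumption for illustration):
# four clusters of 25 points, each falling into its own 10 x 10 block.
rng = random.Random(42)
centres = [(5.0, 5.0), (15.0, 5.0), (5.0, 15.0), (15.0, 15.0)]
coords = [
    (cx + rng.uniform(-2, 2), cy + rng.uniform(-2, 2))
    for cx, cy in centres
    for _ in range(25)
]

d_random = mean_over_folds(coords, random_splits(len(coords), k=5))
d_block = mean_over_folds(coords, spatial_block_splits(coords, block_size=10.0))

print(f"random k-fold:      mean distance to nearest training point = {d_random:.2f}")
print(f"spatial block-out:  mean distance to nearest training point = {d_block:.2f}")
```

With random splitting, every test point has training points close by in the same cluster, so the test resembles interpolation within sampled areas. With block-wise splitting, test points lie far from all training points, which is closer to the upscaling task of predicting where no measurements exist. Any model could be evaluated on these splits; the distance summary here only illustrates how differently the two schemes pose the prediction task.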

This block serves to critically reflect on working with big data and using (black box) models. It does not introduce entirely new methods, but applies methods learned in previous blocks and in AGDS I to explore and understand the benefits and limits of (geo) data science methods. Rather than a tutorial, it takes the form of a literature study and working with the data yourself. All students are required to hand in the Report Exercise of this block.

Session 11 - 25.11.

Introduction to model generalisability and spatial upscaling
Self-study of literature and report exercise

Session 12 - 02.12.

Paper discussion of Ludwig et al. (2023)
Self-study of literature and report exercise

Block 6: Multispectral water stress monitoring

Lead: Benjamin Stocker

Link to tutorial

This block introduces an example of using ML algorithms for multispectral water stress monitoring. We provide an example data set from ongoing research, where the goal is to model recorded vegetation stress, expressed as the fractional Light Use Efficiency (fLUE) and observed at eddy covariance sites from the Fluxnet network. This exercise again includes an amicable competitive component: you submit your prediction results on the test data set to a common leaderboard.

Session 13 - 09.12.

Introduction to multispectral water stress monitoring
Self-study of literature and report exercises

Session 14 - 16.12.

Self-study of literature and report exercises