2  Course plan

For fall semester 2023.

Block 1: Introduction to Applied Geodata Science methods

This first block serves to recap essential machine learning methods that will be applied in the tutorials of subsequent blocks, and to recap methods for implementing a reproducible data science workflow - required for the submission of the self-guided project report.

Contents are covered in the Applied Geodata Science book.

Session 1 - 21.09.

Session 2 - 28.09.

Session 3 - 05.10.

  • Supervised machine learning (AGDS book Chapter 9 and 10)

Session 4 - 12.10.

Block 2: Phenology modelling

Lead: Koen Hufkens

Link to tutorial

In both Block 2 and 4 I will focus on a set of examples which are rather limited in spatial scope, using just a handfull of pixels. In contrast to what the cloud computing revolution promised, many ideas start out from site or localized studies. The focus of these few chapters from the Handful of Pixels book is deliberate. I will show you through both examples in vegetation seasonality (phenology) and land-use and land-cover mapping (Block 4) that you can experiment with geo-spatial data on limited compute infrastructure. Furthermore, it also shows you that true science can be done with relatively modest means. You will learn to do big science with small data.

Session 5 - 19.10.

  • Introduction to phenology modelling
  • Self-study of tutorial and exercises

Session 6 - 26.10.

  • Self-study of tutorial and exercises

Block 3: Digital soil mapping

Lead: Benjamin Stocker

Link to tutorial

Creating soil maps by hand can be tedious and time-consuming and yet they are urgently needed to inform decision-making to prevent soil degradation and to ensure the continuation of important soil-related ecosystem services. This block introduces a workflow for digital soil mapping for using Random Forest models, predicting soil properties across a study area near Bern. The tutorial demonstrates how advances in data availability (continuous maps of climate, terrain, etc.) can be used to predict soil properties that are laborious to obtain from a limited set of soil cores (e.g., pH, water logging, etc.) and for which mapping commonly relied on spatial upscaling “by hand”. The self-guided continuation of the tutorial includes conducting variable selection, model formulation, training, evaluation, and interpretation.

Session 7 - 02.11.

  • Introduction to digital soil mapping
  • Self-study of tutorial and exercises

Session 8 - 09.11.

  • Self-study of tutorial and exercises

Block 4: Land cover classification

Lead: Koen Hufkens

Link to tutorial

In this block, I introduce Land-Use and Land-Cover mapping. It shows how to train a Land-Use and Land-Cover machine learning algorithm and scale these results to a larger region. The exercises include an amicable competitive component, where a better model needs to be created and results are submitted to a leaderboard (see Chapter 3 for details).

Session 9 - 16.11.

  • Introduction to land cover classification
  • Self-study of tutorial and exercises

Session 10 - 23.11.

  • Self-study of tutorial and exercises

Block 5: Spatial upscaling

Lead: Benjamin Stocker

Link to tutorial

Creating maps with large spatial, often global coverage based on a limited set of local measurements has become popular. Digital soil mapping led the way by introducing the paradigm that (i) maps can be created based on a model that fits relationships between a locally measured variable of interest and a set of covariates, often environmental variables; and that (ii) global maps of these covariates are available and enable predicting with the fitted model to conditions (locations) for which no local measurements are available. But how reliable are such predictions? And what determines the reliability of predictions to unobserved locations? How can this reliability, the prediction error, be estimated?

With digital soil mapping introduced in Block 3, here we probe its fundamental modelling paradigm - spatial upscaling. We learn how we test a (machine learning) model with a view to what it is used for - the prediction task.

This block serves to critically reflect on working with big data and using (black box) models. It does not introduce entirely new methods, but serves to apply methods learned in previous blocks and in AGDS I for exploring and understanding the benefits and limits of (geo) data science methods. Rather than a tutorial, it comes in the form of literature study and working with the data yourself. All students are required to hand in the Report Exercise of this block.

Session 11 - 30.11.

  • Introduction to model generalisability and spatial upscaling
  • Self-study of literature and report exercise

Session 12 - 07.12.

  • Paper discussion of Ludwig et al. (2023)
  • Self-study of literature and report exercise

Work on Report Exercises

Session 13 - 14.12.

  • Free work on Report Exercises

Session 14 - 21.12.

  • Free work on Report Exercises
  • Wrap-up