1 Logistics

1.1 General information

Lead: Prof. Benjamin Stocker; contributing authors and assistance: Dr. Koen Hufkens, Pascal Schneider
Course language is English.
Thursday 8.15-10.00 is our presence time, use it for your communication with us.
No email communications.
Lecture notes are available as online tutorials (see Chapter 2). They contain all contents as reproducible code.
No recordings are made available. It’s all in the tutorials.
Students are expected to master contents taught in 480094 Applied Geodata Science I.
The number of students is limited to 25.
This course follows a hybrid class setup: input lectures, flipped classroom, and individual work on a (partly) self-guided project.
You will work on your own laptops, bring them to class. If this poses a limitation to your participation, please me (B.S.) know.
To get you set up, we’ll support you with installing all the necessary software on our laptops. But please do get started with installations and setup before our first session, following this:

Before Session 1, please install essential software needed for this course is installed and running. We will offer support in class, for those that encounter problems. Essential software is (1) R and RStudio, follow this, (2) NetCDF, follow this, (3) git, follow this. Students who have completed AGDS I will have this installed and are ready to go.

1.2 Course contents and design

After a recap of Applied Geodata Science basics, a set of common applications of machine learning methods in Geography and Environmental Science are introduced. Applications cover a diverse set of data science methods and serve to demonstrate fundamental concepts and implementations, including mechanistic modelling, unsupervised classification, and supervised machine learning. The course combines a theoretical and a hands-on component and consists of a mixture of lectures, exercises, and two (small) self-guided project (‘Report Exercises’).

See Course Plan for a complete overview and contextualisation of the different applications, covered by separate course blocks.

1.3 Evaluation

Students hand in reports of the two Report Exercises in the form of a link, sent to Prof. B. Stocker, to a git repository that contains code for two separate, reproducible workflows (RMarkdown, see here) for the two Report Exercises (vignettes/re1.Rmd and vignettes/re2.Rmd).
One report exercise builds on one of the tutorials of Blocks 2-4 (see Chapter 2). The second report exercise builds on Block 5 (Spatial upscaling).
Each of the two reports will be equally weighted for the final grade.
For grading, we consider the following criteria:
- Reproducibility (25%). Make sure your RMarkdown code is reproducible and generates a html document as intended. You can check this by letting a colleague clone your git repository and run your code on their machine. If it fails, it’s not reproducible, and you won’t get points for this aspect.
- Workspace organisation and code (25%). Follow the principles for workspace organisation and coding as described in the AGDS book (here and here).
- Presentation (25%). This considers the appearance of the generated html document, its text and figures. Provide context, create good data visualisations, and discuss your findings.
- Analysis and modelling (25%). This considers the correctness of analysis and modelling implementations, and the reasonable application of methods.
Each student will hand in their own Report Exercises This is not a group work.
Code duplications are not permitted and duplicated code will be graded with zero points for the respective Report Exercise.
Reports cannot be re-submitted after having received the grades. Make sure all is reproducible before you submit.
Deadline for sending us the link by email: 08.01.2024 (course in fall semester 2023).

1.4 Note of caution

The use of large language models (LLMs), such as ChatGPT, for supporting code development should be considered with caution and is not permitted for writing report text. We are aware that LLMs can help solving technical tasks and the right use of LLMs for solving your problem is not discouraged. However, be aware that code produced by LLMs is not tested to be functional and error-free. Therefore, you have to understand what you get and verify and correct code with a view to your problem. Note that LLMs use text from various sources on the internet. Your report text must not contain verbatim copied text from unidentified sources. This would be considered plagiarism.