Universität Wien

290008 PS Geographical Predictive Analytics (2022S)

3.00 ECTS (2.00 SWS), SPL 29 - Geography
Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no "first come, first served").

Details

max. 25 participants
Language: English

Lecturers

Classes

Wednesday, starts 9.3.2022, 13:30-15:00, MM-Lab


Information

Aims, contents and method of the course

Entry requirements
Knowledge of basic Python scripting is a prerequisite for this course. Knowledge of descriptive statistics or spatial statistics would be advantageous but is not required.

Aims
Geographical predictive analytics involves creating, testing, and validating models (i.e. abstractions of reality) in order to draw conclusions and test hypotheses about your data, and to make estimations about future and unknown samples. Application domains include soil mapping, ecological mapping, crime forecasting, and geohealth, to name just a few. The class uses two types of models: descriptive and predictive. Descriptive models generalize or aggregate raw geodata into groups, which makes the extracted information meaningful and easier to interpret. They can also be used as input to, or as a basis for developing, predictive models. Predictive models, in turn, estimate sample values at unknown locations or forecast the values of future events. The techniques covered in the course derive from the spatial statistics and machine learning domains, and the course outlines the strengths and limitations of both. The geodata applications mainly involve clustering, classification, and regression tasks.
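
As an illustration of a predictive model that estimates a value at an unknown location, the following is a minimal sketch of a K-Nearest Neighbor estimate written with the Python standard library only. The function name knn_predict, the coordinates, and the soil pH values are hypothetical and chosen purely for illustration; in the practical work a library implementation would more likely be used.

```python
import math

def knn_predict(samples, query, k=3):
    """Estimate the value at `query` as the mean of the k nearest samples.

    samples: list of ((x, y), value) pairs; query: (x, y) tuple.
    """
    # Rank the known samples by Euclidean distance to the query location.
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    # Average the values of the k closest samples.
    return sum(value for _, value in nearest) / len(nearest)

# Hypothetical soil pH measurements at known coordinates (illustrative only).
samples = [
    ((0.0, 0.0), 6.0),
    ((1.0, 0.0), 6.4),
    ((0.0, 1.0), 6.2),
    ((5.0, 5.0), 7.5),
    ((6.0, 5.0), 7.7),
]

# Estimate the value at an unsampled location.
estimate = knn_predict(samples, (0.5, 0.5), k=3)
print(round(estimate, 2))
```

The estimate depends only on the three samples closest to the query point, so the two distant measurements do not influence it. The choice of k and of the distance measure are modelling decisions that would need to be validated.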

Content
Introduction to the course and its tools, location clustering, attributes clustering, regionalization, spatial regression, geographically weighted regression, decision trees, random forest, K-Nearest Neighbor, modelling workflow, model validation, and performance evaluation metrics.
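
To make the performance evaluation metrics mentioned above more concrete, here is a minimal sketch of three common ones: RMSE and R² for regression, and accuracy for classification. The function names and the held-out values are assumptions made for illustration, not course material.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def accuracy(labels_true, labels_pred):
    """Share of samples whose predicted class matches the observed class."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)

# Hypothetical held-out observations and model predictions.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]
rmse_value = rmse(y_true, y_pred)
r2_value = r_squared(y_true, y_pred)

# Hypothetical land-cover classes for a classification task.
classes_true = ["forest", "urban", "water", "urban"]
classes_pred = ["forest", "urban", "urban", "urban"]
accuracy_value = accuracy(classes_true, classes_pred)

print(rmse_value, r2_value, accuracy_value)
```

These metrics are meaningful only when computed on samples that were not used to fit the model, which is why validation is listed alongside them.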

Methods
The course consists of roughly equal parts lectures and practical work, with theory and methods preceding their application. The practical work consists of exercises comprising multiple tasks to be solved or answered with the help of sufficient support material. It is carried out using Python (geo)libraries and Jupyter notebooks. During the course, students undertake an individual case study, supported by the practical work material. Strong emphasis is placed on a) the identification of innovative parts of analysis and models, b) the critical reflection of study results, and c) scientific reproducibility.
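
One small ingredient of scientific reproducibility in a notebook is fixing the random seed, so that a train/test split yields the same partition on every run. The sketch below shows the idea with the standard library only. The helper split_records, its parameters, and the record IDs are hypothetical.

```python
import random

def split_records(records, test_fraction=0.25, seed=42):
    """Shuffle records with a fixed seed and split them into train and test sets."""
    rng = random.Random(seed)   # local generator: global random state is untouched
    shuffled = list(records)    # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical sample IDs standing in for geodata records.
records = list(range(8))

train_a, test_a = split_records(records, seed=42)
train_b, test_b = split_records(records, seed=42)

# The same seed yields the same partition on every run.
print(train_a == train_b and test_a == test_b)
```

Recording the seed together with the data source and library versions in the reproducibility file allows a reviewer to rerun the analysis and obtain the same split.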

Assessment and permitted materials

The assessment consists of three testing elements (T1, T2, and T3). The first is an assignment that requires applying one examination topic to a new, independently chosen dataset. It is assessed via the submission of a reproducibility file and an oral presentation. The second is a peer review that requires evaluating a classmate’s reproducibility file. The third is active participation in an in-class activity that takes place during the practical sessions.

• Any ancillary material can be used in preparation or presentation for the tests.
• All tests are individual / no group work.

The weights of the testing elements are:
- Assignment 75% (T1)
- Peer review 15% (T2)
- In-class activity 10% (T3)

Minimum requirements and assessment criteria

--> Attendance is compulsory for all students on all test dates.
--> All tests must be completed to pass the course.

- T1 is assessed via the reproducibility file and the oral presentation in class.
- T2 and T3 are assessed in class.
- For T2 and T3 a conditional mark is given for the completion of the tests.

Examination topics

• Clustering & Regionalization
• Regression with spatial statistical methods
• Regression with machine learning methods
• Classification with machine learning methods
*The examination topics cover the entire content of the course and its learning outcomes.

Reading list

Books & articles:
Anselin, L., & Rey, S. J. (2014). Modern spatial econometrics in practice: A guide to GeoDa, GeoDaSpace and PySAL. GeoDa Press LLC.

Gupta, P. (2019). Data Science with Jupyter: Master Data Science Skills with Easy-to-follow Python Examples.

Jolly, K. (2018). Machine Learning with scikit-learn Quick Start Guide: Classification, regression, and clustering techniques in Python. Packt Publishing Ltd.

Duque, J. C., Anselin, L., & Rey, S. J. (2012). The max-p-regions problem. Journal of Regional Science, 52(3), 397–419.

Anselin, L. (1988). Spatial Econometrics: Methods and Models. Kluwer, Dordrecht.

Wei, R., Rey, S., & Knaap, E. (2020). Efficient regionalization for spatially explicit neighborhood delineation. International Journal of Geographical Information Science, 1–17.

Fotheringham, S., Brunsdon, C., & Charlton, M. (2002). Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. John Wiley & Sons. ISBN 978-0-470-85525-6.

Online Textbooks:
GeoDa documentation --> https://geodacenter.github.io/documentation.html

An introduction to machine learning with scikit-learn --> https://scikit-learn.org/stable/tutorial/basic/tutorial.html

Scikit-learn cheat sheet: methods for classification & regression --> https://www.educative.io/blog/scikit-learn-cheat-sheet-classification-regression-methods

Association in the course directory

(MK2-PI) (MR1-b)

Last modified: Thu 03.03.2022 16:29