Universität Wien

052321 VU Recent Developments in Knowledge Discovery in Databases (2025S)

Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).

Details

max. 25 participants
Language: English

Classes

  • Tuesday 04.03. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 06.03. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 11.03. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 13.03. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 18.03. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 20.03. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 25.03. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 27.03. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 01.04. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 03.04. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 08.04. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 10.04. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 29.04. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Tuesday 06.05. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 08.05. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 13.05. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Tuesday 20.05. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 22.05. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 27.05. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Tuesday 03.06. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 05.06. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 10.06. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 12.06. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG
  • Tuesday 17.06. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Tuesday 24.06. 15:00 - 16:30 Seminarraum 8, Währinger Straße 29 1.OG
  • Thursday 26.06. 13:15 - 14:45 Seminarraum 5, Währinger Straße 29 1.UG

Information

Aims, contents and method of the course

The amount of data gathered every year is steadily increasing, and people are eager to discover more and more knowledge in these vast amounts of data. Even though data is a powerful resource, we are not yet using its full potential.

This course teaches recently developed, state-of-the-art methods for discovering knowledge in databases, both from a theoretical point of view and in terms of their implementation and application. We learn how to discover, assess, and deeply understand novel methods that are more complex than the fundamental methods taught in other courses. We address different ways of learning new methods from the field of knowledge discovery in databases: learning lecture-style by listening to talks, creating a small database for benchmarking, discovering a new method by reading a scientific paper and teaching it to others in a talk, discussing it in groups, and preparing a small tutorial that explains it to non-experts.

This semester, we focus on data-driven causality and clustering methods. Both causality and clustering are well represented at top international AI conferences, such as the AAAI Conference on Artificial Intelligence, the IEEE International Conference on Data Mining, and ICLR.

Methods/Course:

The course will have two parts: causality and clustering.

In the causality part, recent methods for causal discovery in both temporal and non-temporal data will be presented, specifically Granger causal methods and their more recent variations. After that, bivariate causality in non-temporal data will be presented, including the benchmark data sets and methods. In the second part of the course, the students will be assigned a causal challenge project, in which they create their own small database for benchmarking on real-world causal inference problems. Moreover, the students will solve an exercise sheet and present a paper on causality, which they select at the beginning of the course.
The goal of this course is to understand the exciting field of knowledge discovery through active learning and to be creative in it.

In the second part, we focus on clustering. We build upon existing knowledge from Foundations of Data Analysis (FDA) and Data Mining and look at recent developments in the field and approaches to open challenges such as fairness, noisy data sets, or data with uncertainty.
As a project, students can choose between more theoretical and more practical work:
For the theory project, they focus on a recent paper, create a tutorial that makes it easy to understand for non-computer scientists, and present it to the group.
For the practical project, they can take part in a challenge such as the KDD Cup (which is going to be published on March 1st; for reference, see last year's challenges: https://www.biendata.xyz/kdd2024/).
The semester ends with a small test about the topics from the second half of the semester.

Assessment and permitted materials

100 points in total.
Causality: a small test at the end of the causality part; exercise sheet; paper presentation; causal challenge (= creating a small database).

Clustering: either the theory or the practical project (25P); test at the end (25P).

Minimum requirements and assessment criteria

This course is for master's students only.

We recommend having completed the basic bachelor's courses as well as:
- Foundations of Data Analysis (required)
- Data Mining

Components:
50% from the Causality part
25% Project for clustering
25% Test about clustering

Grading:
>87.00%: 1
between 75.00% and 86.99%: 2
between 63.00% and 74.99%: 3
between 50.00% and 62.99%: 4
<50.00%: 5
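
The components and grading scale above can be sketched as a small calculation. This is only an illustrative sketch, not an official grading tool: the function names are hypothetical, the 50-point maximum for the causality part is inferred from its 50% share of the 100 points, and treating each threshold as an inclusive lower bound (so that exactly 87.00% yields grade 1) is an assumption, since the listed ranges do not cover that boundary value.

```python
def total_percent(causality: float, project: float, test: float) -> float:
    """Sum the three components into a percentage of the 100 total points.

    Assumed maxima: causality part 50P (inferred from its 50% share),
    clustering project 25P, clustering test 25P.
    """
    limits = {"causality": (causality, 50.0),
              "project": (project, 25.0),
              "test": (test, 25.0)}
    for name, (points, maximum) in limits.items():
        if not 0.0 <= points <= maximum:
            raise ValueError(f"{name} points must lie in [0, {maximum}]")
    # 100 points in total, so the point sum is already a percentage.
    return causality + project + test


def grade_from_percent(percent: float) -> int:
    """Map a percentage to a grade from 1 (best) to 5 (fail).

    Assumption: each threshold is an inclusive lower bound; how values
    exactly on a boundary are handled is not stated in the course page.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie in [0, 100]")
    for lower_bound, grade in [(87.0, 1), (75.0, 2), (63.0, 3), (50.0, 4)]:
        if percent >= lower_bound:
            return grade
    return 5


if __name__ == "__main__":
    pct = total_percent(causality=40.0, project=20.0, test=18.0)
    print(pct, grade_from_percent(pct))
```

For binding decisions on boundary cases, the lecturers' own calculation applies.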

Reading list

For the causal inference part, the following literature provides the background to better understand the models and methods taught:

Sayed, Ali H. Inference and Learning from Data. Vols. 1-3. Cambridge University Press, 2022.

Volume I: Chapters Matrix Theory, Random Variable, Exponential Distributions, pp. 1-195; Random Processes, pp. 240-259.
Volume II: Chapters MSE Inference, pp. 1053-1090; Linear Regression, pp. 1121-1153; Maximum Likelihood, pp. 1211-1273; Inference in Graphs, pp. 1682-1737.
Volume III: Chapters Regularization, pp. 2221-2257; Logistic Regression, pp. 2457-2496.

The book can be accessed via the library of the University of Vienna or via Cambridge University Press.

Last modified: Tu 25.03.2025 14:25