Universität Wien

052813 VU Scientific Data Management (2025S)

Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no "first come, first served").

Details

max. 25 participants
Language: English

Lecturers

Classes

  • Tuesday 04.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 07.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 11.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 14.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 18.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 21.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 25.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 28.03. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 01.04. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 04.04. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 08.04. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 11.04. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 02.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
    Hörsaal 3, Währinger Straße 29, 3rd floor
  • Tuesday 06.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 09.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 13.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 16.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 20.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 23.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 27.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 30.05. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 03.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 06.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 10.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 13.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 17.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 20.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Tuesday 24.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
  • Friday 27.06. 15:00 - 16:30 Hörsaal 2, Währinger Straße 29, 2nd floor
    Hörsaal 3, Währinger Straße 29, 3rd floor

Information

Aims, contents and method of the course

This course will be taught in English and take place on-site.

The course introduces central methods for organizing and analyzing large and scientific data, including distributed data repositories, index data structures, hashing, classification, and clustering. In particular, it discusses methods for structured data such as sets, images, text documents, and graphs.

Exercises and programming assignments complement the lectures. Students will learn ways to realize similarity search and data mining on large data, e.g., using parallelization with MapReduce, Apache Spark, or filter-refinement techniques.

Subject-specific goals:
- Analysis of scientific data
- Interpretation and evaluation of results of the analysis process
- Choosing and applying techniques for structured data
- Implementation of scalable solutions for large amounts of data
- Supporting and advising users

Generic goals:
- Teamwork
- Improvement of programming skills
- Understanding the interplay between data mining and scientific computing

It is recommended to complete the following courses before attending:
- Algorithmen und Datenstrukturen
- Datenbanksysteme
- Software Engineering
- Netzwerktechnologien

Assessment and permitted materials

* Exercises (individual work): you will solve pen-and-paper exercises at home. To receive credit, you must present your solutions in the exercise sessions (presenters are selected at random).

* Programming assignments (group work): you will solve graph learning programming assignments at home, submit your executable source code together with a written report describing the results obtained with your implementation, and present your results in in-person sessions.

* Written midterm exam (individual work): you will be allowed to bring a handwritten A4 sheet (2 pages) of notes.

* Written final exam (individual work): you will be allowed to bring a handwritten A4 sheet (2 pages) of notes.

Minimum requirements and assessment criteria

The overall grade is composed as follows:
30% Exercises
30% Programming assignments
20% Written midterm exam
20% Written final exam

To successfully complete the course, you must achieve at least 40% of the points in the midterm exam and at least 40% of the points in the final exam.

Attendance at the lecture parts of the course is voluntary but highly recommended. Attendance at the exercise discussions, the programming assignment discussions, and the written exams is compulsory to obtain points.

Grades will be given according to the following scheme:
87.00 - 100.00: 1
75.00 - 86.99: 2
63.00 - 74.99: 3
50.00 - 62.99: 4
00.00 - 49.99: 5
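
As a rough illustration of how the weights, the exam minimum, and the grade scale above combine, here is a minimal Python sketch. It assumes that each component is expressed as a percentage between 0 and 100, that missing the 40% minimum in either exam results in grade 5, and that no rounding is applied before the scale is looked up. None of these details are stated by the course, and the names `WEIGHTS`, `overall_score`, and `grade` are hypothetical.

```python
# Hypothetical helper names; weights and thresholds are taken from the scheme above.
WEIGHTS = {"exercises": 30, "programming": 30, "midterm": 20, "final": 20}  # percent
EXAM_MINIMUM = 40.0  # minimum percentage required in each written exam
GRADE_FLOORS = [(87.0, 1), (75.0, 2), (63.0, 3), (50.0, 4)]  # (lower bound, grade)


def overall_score(scores):
    """Weighted average of the component percentages (0-100 each)."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS) / 100


def grade(scores):
    """Map component percentages to a grade from 1 (best) to 5 (fail)."""
    # Assumption: failing the minimum in either exam means grade 5.
    if scores["midterm"] < EXAM_MINIMUM or scores["final"] < EXAM_MINIMUM:
        return 5
    total = overall_score(scores)
    for floor, result in GRADE_FLOORS:
        if total >= floor:
            return result
    return 5


example = {"exercises": 90, "programming": 80, "midterm": 70, "final": 60}
print(overall_score(example), grade(example))  # 77.0 2
```

In this example the weighted total is 27 + 24 + 14 + 12 = 77 points, which falls into the 75.00 - 86.99 band and therefore corresponds to grade 2.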

Examination topics

All topics covered in class, the exercises, and the programming assignments.

- Scientific Data and Feature Spaces
- Clustering
- Big Data Frameworks
- Searching Numerical Data
- Searching Sets
- Searching & Mining Graphs
- Analyzing Large Networks

Reading list

J. Leskovec, A. Rajaraman, J. Ullman. Mining of Massive Datasets.
J. Han, M. Kamber, J. Pei. Data Mining: Concepts and Techniques.
I. H. Witten, E. Frank, M. A. Hall. Data Mining: Practical Machine Learning Tools and Techniques.

Further literature and references to research papers will be provided via Moodle.

Association in the course directory

Module: SDM

Last modified: Fri 04.04.2025 14:05