052300 VU Foundations of Data Analysis (2018S)
- Registration is open from Mo 12.02.2018 09:00 to Tu 20.02.2018 23:59
- Deregistration possible until Su 18.03.2018 23:59
- Claudia Plant
- Sahar Behzadi Soheil
- Marcus Hudec
- Torsten Möller
- Benjamin Schelling
- Thomas Torsney-Weir
Classes
Hörsaal 31, Hauptgebäude, 1st floor, staircase 9
Hörsaal 50, Hauptgebäude, 2nd floor, staircase 8
Aims, contents and method of the course
Assessment and permitted materials
- 2 pen-and-paper exercise sheets, which serve as preparation for the exams. Each exercise sheet is worth at most 5% of the required points.
- 2 exams: a mid-term worth up to 20% of the total points, and a final covering the entire course worth up to 30%.
Furthermore you can complete:
- 1 exercise sheet to assess your current mathematical (prerequisite) knowledge.
- 3 anonymized feedback submissions, each worth 1% (a maximum of 3% in total). They can be returned anonymously to the tutor responsible for the lecture.
Minimum requirements and assessment criteria
- Programmierung 2 (PR2)
- Mathematische Grundlagen der Informatik 2 (MG2)
- Theoretische Informatik (THI)
- Modellierung (MOD)
- Algorithmen und Datenstrukturen (ADS)
Grading will be done according to the following scheme:
1 – at least 87.5%
2 – at least 75.0%
3 – at least 60.0%
4 – at least 40.0%
Please keep in mind that, in order to pass the course, you will need at least 30% of the total score in each of the 3 parts of the course.
Regular attendance is strongly recommended but not mandatory.
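The grading scheme can be read as a simple threshold lookup. The following is a minimal sketch, not an official calculator: the function name `grade` and its parameters are hypothetical, the 30% rule is assumed to apply to the percentage achieved within each of the 3 parts, and a failing result is assumed to be reported as grade 5, which the scheme itself does not state.

```python
# Hypothetical helper: names and the fail grade (5) are assumptions,
# not taken from the course description.
def grade(total_pct, part_pcts):
    """Return the grade (1 = best) for a total score given in percent.

    part_pcts: percentages achieved in each of the 3 parts of the course.
    """
    if len(part_pcts) != 3:
        raise ValueError("expected scores for exactly 3 parts")
    # Assumed reading of the 30% rule: every part needs at least 30%.
    if any(p < 30.0 for p in part_pcts):
        return 5
    # Thresholds from the grading scheme, checked from best to worst.
    for g, threshold in ((1, 87.5), (2, 75.0), (3, 60.0), (4, 40.0)):
        if total_pct >= threshold:
            return g
    return 5
```

For example, `grade(80.0, [85.0, 70.0, 90.0])` yields 2, while `grade(80.0, [29.0, 100.0, 100.0])` yields 5 under the assumed 30% rule.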
1.1. Fundamental Concepts in Inference
1.2. Parametric Inference
1.3. Hypothesis Testing and p-values
1.4. The Bootstrap
1.5. Data Splitting, Cross-Validation
2. Regression Modelling
2.1. Simple Linear Regression
2.2. Multiple Regression
2.3. Further Regression Methods
2.4. Generalized Linear Models
2.5. Regression Trees
3. Classification Modelling
3.1. Decision Theoretic Introduction; Error rates, and Bayes Optimality
3.2. Logistic Regression
3.3. Classification Trees
3.4. Support Vector Machines
3.6. Further Classification Methods
4. Neural Networks
5. Basic Techniques of Unsupervised Learning
5.1. Dimension Reduction (Matrix Factorization)
5.2. Association Rules
6. Clustering Methods
6.1. Hierarchical Clustering
6.2. Model-based Clustering
6.3. Evaluation and Validation of Clustering Results
6.4. Density-based Clustering
6.5. Self Organizing Maps
> Han, Kamber: Data Mining: Concepts and Techniques, Elsevier 2012.
> Hastie-Tibshirani-Friedman: The Elements of Statistical Learning, Springer 2009.
> James-Witten-Hastie-Tibshirani: An Introduction to Statistical Learning with Applications in R, Springer 2015.