Universität Wien

040721 UK Selected Topics in Statistics (2017W)

Statistical Learning and Big Data

3.00 ECTS (2.00 SWS), SPL 4 - Economics
Course with continuous assessment

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no "first come, first served").

Details

max. 30 participants
Language: English

Dates

  • Monday 02.10. 16:45 - 18:15, Hörsaal 3, Oskar-Morgenstern-Platz 1, ground floor
  • Tuesday 03.10. 18:30 - 20:00, Hörsaal 9, Oskar-Morgenstern-Platz 1, 1st floor
  • Wednesday 04.10. 16:45 - 18:15, Hörsaal 3, Oskar-Morgenstern-Platz 1, ground floor
  • Thursday 05.10. 16:45 - 18:15, Hörsaal 5, Oskar-Morgenstern-Platz 1, ground floor
  • Friday 06.10. 16:45 - 18:15, Hörsaal 3, Oskar-Morgenstern-Platz 1, ground floor
  • Monday 09.10. 13:15 - 14:45, Hörsaal 11, Oskar-Morgenstern-Platz 1, 2nd floor
  • Tuesday 10.10. 18:30 - 20:00, Hörsaal 9, Oskar-Morgenstern-Platz 1, 1st floor
  • Wednesday 11.10. 16:45 - 18:15, Hörsaal 3, Oskar-Morgenstern-Platz 1, ground floor
  • Thursday 12.10. 15:00 - 16:30, Hörsaal 8, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 12.10. 16:45 - 18:15, Hörsaal 9, Oskar-Morgenstern-Platz 1, 1st floor
  • Friday 13.10. 16:45 - 18:15, Hörsaal 9, Oskar-Morgenstern-Platz 1, 1st floor
  • Friday 13.10. 18:30 - 20:00, Hörsaal 8, Oskar-Morgenstern-Platz 1, 1st floor

Information

Aims, contents and method of the course

For detailed information about this course, please go to http://www.tau.ac.il/~saharon/StatLearn-Vienna.html

The goal of this course is to gain familiarity with the basic ideas and methodologies of statistical (machine) learning. The focus is on supervised learning and predictive modeling, i.e., fitting y ≈ f̂(x), in regression and classification.
We will start by thinking about some of the simpler, but still highly effective methods, like nearest neighbors and linear regression, and gradually learn about more complex and "modern" methods and their close relationships with the simpler ones.
As time permits, we will also cover one or more industrial "case studies" where we track the process from problem definition, through development of appropriate methodology and its implementation, to deployment of the solution and examination of its success in practice.
The homework and exam will combine hands-on programming and modeling with theoretical analysis.

Topics list (we will cover some of these, as time permits):

- Introduction (text chap. 1,2): Local vs. global modeling; Overview of statistical considerations: Curse of dimensionality, bias-variance tradeoff; Selection of loss functions; Basis expansions and kernels

- Linear methods for regression and their extensions (text chap. 3): Regularization, shrinkage and principal components regression; Quantile regression

- Linear methods for classification (text chap. 4): Linear discriminant analysis; Logistic regression; Linear support vector machines (SVM)

- Classification and regression trees (text chap. 9.2)

- Model assessment and selection (text chap. 7): Bias-variance decomposition; In-sample error estimates, including Cp and BIC; Cross validation; Bootstrap methods

- Basis expansions, regularization and kernel methods (text chap. 5,6): Splines and polynomials; Reproducing kernel Hilbert spaces and non-linear SVM

- Committee methods in embedded spaces (material from chaps 8-10): Random Forest and boosting

- Deep learning and its relation to statistical learning

- Learning with sparsity: Lasso, marginal modeling etc.

- Case studies: Customer wallet estimation; Netflix prize competition; Testing on public databases
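
As an informal illustration of the "local vs. global modeling" theme, the following minimal sketch fits a predictor to toy data in two ways: k-nearest neighbors (local averaging) and least-squares linear regression (a global model). It is not taken from the course materials. The data, the function names `knn_predict` and `fit_line`, and the choice k = 3 are assumptions made for this example, and Python is used here only for illustration (the course homework itself uses R).

```python
# Toy sketch (not course material): two simple ways to fit a predictor.
# All names and data below are hypothetical, chosen for illustration.

def knn_predict(xs, ys, x0, k=3):
    """Local model: average the responses of the k training points nearest to x0."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def fit_line(xs, ys):
    """Global model: ordinary least squares with one predictor.

    Returns (intercept, slope).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Training data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

intercept, slope = fit_line(xs, ys)

# Predict at x = 6, just outside the range of the training data.
linear_pred = intercept + slope * 6.0     # the line extrapolates the trend
knn_pred = knn_predict(xs, ys, 6.0, k=3)  # averages the responses at x = 3, 4, 5

print("linear regression:", linear_pred)
print("3-nearest neighbors:", knn_pred)
```

On this toy data the linear fit extends the trend beyond the observed range, while the nearest-neighbor prediction can only average responses it has already seen. This is one simple way to see the difference between a global and a local model.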

Assessment and permitted materials

The grading will be based on a combination of homework and a final exam. Given the short format of the course, a single homework problem will be given every day after class. The problems will combine theory and applied work in R. Out of the total of about eleven problems that will be given, you will have to solve and submit a subset. This will account for about 30% of the course grade.

The remaining 70% of the course grade will be based on an in-class exam given at the end of the course.

Minimum requirements and assessment criteria

Basic knowledge of mathematical foundations: Calculus; Linear Algebra; Geometry
Undergraduate courses in: Probability; Theoretical Statistics
Statistical programming experience in R is not a prerequisite, but an advantage

Reading list

Textbook:
The Elements of Statistical Learning by Hastie, Tibshirani & Friedman

For more info, see http://www.tau.ac.il/~saharon/StatLearn-Vienna.html

Last changed: Mon 07.09.2020 15:29