Universität Wien

040025 UK Large-Scale Inference (2018S)

8.00 ECTS (4.00 SWS), SPL 4 - Wirtschaftswissenschaften
Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).

Details

max. 30 participants
Language: English

Classes

  • Thursday 01.03. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 06.03. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 08.03. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 13.03. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 15.03. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 20.03. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 22.03. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 10.04. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 12.04. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 17.04. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 19.04. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 24.04. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 26.04. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Wednesday 02.05. 15:00 - 16:30 Seminarraum 6, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 03.05. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 08.05. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 15.05. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 17.05. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Wednesday 23.05. 15:00 - 16:30 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 24.05. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 29.05. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 05.06. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 07.06. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 12.06. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 14.06. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 19.06. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 21.06. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 26.06. 11:30 - 13:00 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 28.06. 13:15 - 14:45 Seminarraum 3, Oskar-Morgenstern-Platz 1, 1st floor

Information

Aims, contents and method of the course

The term "big data" is bandied about so frequently that it has perhaps lost any relevant meaning, referring instead to a vaguely connected set of ideas about "modern data sets." While data set size does pose computational problems, our goal is to understand some of the corresponding changes in statistical methodology. This transition is best summarized in the Prologue of Brad Efron's book, Large-Scale Inference, which was the inspiration for the title of this course:

"At the risk of drastic oversimplification, the history of statistics as a recognized discipline can be divided into three eras:
1. The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions: Are there more male than female births? Is the rate of insanity rising?
2. The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment. The questions dealt with still tended to be simple—Is treatment A better than treatment B? — but the new methods were suited to the kinds of small data sets individual scientists might collect.
3. The era of scientific mass production, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind."

This course addresses the third era. Topics covered include:
- Testing problems in high dimensions: sparse and nonsparse alternatives.
- Multiple testing problems: familywise error rate (FWER), the closure principle, procedures for controlling FWER, false discovery rate (FDR), procedures for controlling FDR, empirical Bayes interpretation of FDR.
- Model selection in high dimensions: thresholding rules, the Lasso, the Dantzig selector.
- Post-selection inference: POSI, selective inference, knockoffs, multiple comparisons.
- James-Stein estimation, Stein's unbiased risk estimate, empirical Bayes view of James-Stein, prediction error.
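
To give a flavor of the multiple testing topics, here is a minimal sketch contrasting a Bonferroni correction (one standard way to control the FWER) with the Benjamini-Hochberg step-up procedure (one standard way to control the FDR). The function names and the example p-values are my own illustrations, not course material, and the FDR guarantee of Benjamini-Hochberg is assumed here in its usual setting of independent p-values.

```python
def bonferroni(p_values, alpha=0.05):
    """Indices rejected by Bonferroni: reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


def benjamini_hochberg(p_values, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure at level q.

    With ordered p-values p_(1) <= ... <= p_(m), find the largest rank k
    such that p_(k) <= k * q / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank  # keep the largest rank that meets its threshold
    # Reject everything up to rank k_max, even p-values that missed
    # their own threshold along the way (this is the "step-up" part).
    return sorted(order[:k_max])


# Illustrative p-values (made up for this sketch).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejects:", bonferroni(p))
print("Benjamini-Hochberg rejects:", benjamini_hochberg(p))
```

On these ten p-values Bonferroni rejects only the smallest one, while Benjamini-Hochberg also rejects the second smallest, which illustrates the usual trade-off: FDR control is a weaker guarantee than FWER control and therefore allows more rejections.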

Assessment and permitted materials

Homework, final exam, project, and participation. Of the four, the project will be given the highest weight. The final exam will be largely conceptual. This scheme is subject to change.

Minimum requirements and assessment criteria

In preparation for the course, I recommend reviewing the following chapters of Keener (2010), Theoretical Statistics: Topics for a Core Course: 1-4, 6-8, 12, and 14. You may skip the optional sections. If the material in any section is new to you, please let me know.

Last modified: Mon 07.09.2020 15:28