Universität Wien

136102 UE Statistics and Machine Learning for (Computational) Linguists (2025S)

Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).

Details

max. 25 participants
Language: English

Lecturers

Classes

  • Friday 07.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 14.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 21.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 28.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 04.04. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 11.04. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 09.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 16.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 23.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 30.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 06.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 13.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 20.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 27.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5

Information

Aims, contents and method of the course

This course gives a basic introduction to statistical methods for (computational) linguists.

Requirements:
• Computer literacy (e.g. Computational Background Skills for Digital Humanities (EC))
• Introduction to DH Tools and Methods (Skills I)
• Data Structures and Data Management in the Humanities (Skills I)

The course covers the following topics:

Descriptive Statistics
• Levels of measurement
• Measures of central tendency
• Normal distribution
• Correlation (both Pearson and Spearman)
• Inter-rater reliability (Cohen’s/Fleiss’ kappa)
Inferential Statistics
• Concept of statistical significance testing
Basic Probability Theory
• Naïve Bayes
• Pointwise Mutual Information
Machine Learning
• Introduction to the concept of supervised machine learning
• Overfitting
• Logistic Regression
• Feature Engineering
• Vector space models/word embeddings

Assessment and permitted materials

The grade combines in-class participation (20%) and homework assignments (80%).

There are three types of exercises in this course:
• theoretical questions
• pen-and-paper calculation exercises
• programming tasks (Python!)

Minimum requirements and assessment criteria

Attendance is required, and regular participation is the key to completing the course. All students must provide their own computing environment. Homework assignments must be submitted on time.

Examination topics

There is no examination for the course.

Reading list

Christopher Butler: Statistics in Linguistics, 1985.

Association in the course directory

DH-S II
S-DH Cluster I: Language and Literature

Last modified: Wed 29.01.2025 14:06