136009 AR Applied Data Science for Linguists (2023S)
Continuous assessment of course work
Registration/Deregistration
Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).
- Registration is open from Mo 06.02.2023 08:00 to Mo 27.02.2023 08:00
- Deregistration possible until Fr 31.03.2023 23:59
Details
max. 25 participants
Language: English
Lecturers
Classes
- Monday 20.03. 11:30 - 13:00 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
- Monday 27.03. 11:30 - 16:00 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
- Monday 17.04. 11:30 - 16:00 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
- Monday 08.05. 11:30 - 16:00 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
- Monday 05.06. 11:30 - 16:00 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
Information
Aims, contents and method of the course
Assessment and permitted materials
Pre-course exercise (to be handed in online), four home assignments (R and RStudio on your own computer), and participation in class
Minimum requirements and assessment criteria
The ability to analyze linguistic data as well as the ability to understand and interpret statistical analyses, to fit statistical models and to use these models for making predictions. The ability to use R and RStudio for this purpose.
Assessment:
Pre-course exercise: 5%
Four home assignments: 20% each
Participation in class: 15%
Minimum pass grade: 60% in total
Examination topics
Reading list
Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
Butler, C. (1985). Statistics in linguistics. Oxford: Blackwell.
Donoho, D. (2017). 50 years of data science. Journal of Computational and Graphical Statistics, 26(4), 745-766.
Hirschberg, J., & Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245), 261-266.
Feinerer, I. (2018). Introduction to the tm Package Text Mining in R. http://cran.uib.no/web/packages/tm/vignettes/tm.pdf
Association in the course directory
EC DH 2
Last modified: Th 04.07.2024 00:13
In this course, we will make use of the scripting language R together with its frontend RStudio. Both are pre-installed on the computers in the lab, but you might want to install them on your own computers as well (e.g. for doing the exercises at home). You will learn how to use R as we go along. Further instructions and literature will be provided on Moodle.
This is an introductory course. As such, no previous knowledge of statistics, statistical software, machine learning or programming is required, but a solid knowledge of high school mathematics (at least Unterstufe) will prove useful (linear functions, basic arithmetic operations, fractions, percentages, probability etc.). Since this course is aimed at a linguistically trained audience, I will take knowledge of fundamental linguistic concepts for granted.