Universität Wien

220050 SE SE Advanced Data Analysis 2 (2024S)

Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).

Details

max. 30 participants
Language: English

Classes

We will meet on Zoom for the first session.

  • Friday 01.03. 15:00 - 18:00 Seminarraum 6, Kolingasse 14-16, EG00
  • Friday 15.03. 15:00 - 18:00 Seminarraum 6, Kolingasse 14-16, EG00
  • Friday 22.03. 15:00 - 18:00 Seminarraum 6, Kolingasse 14-16, EG00
  • Friday 19.04. 15:00 - 18:00 Seminarraum 6, Kolingasse 14-16, EG00
  • Friday 03.05. 15:00 - 18:00 Seminarraum 6, Kolingasse 14-16, EG00
  • Friday 28.06. 15:00 - 18:00 Seminarraum 6, Kolingasse 14-16, EG00

Information

Aims, contents and method of the course

Faced with the massive volumes of text data available in digital format, and recognizing their potential, communication scientists have in recent years increasingly turned to computer-assisted methods, so-called automated text analysis methods. These text-as-data methods are used to draw reproducible and valid inferences from documents. As an enhancement of classical manual content analysis, automated text analysis is becoming prevalent in disciplines that are themselves increasingly computationally oriented.

This course provides an introduction to various text-as-data methods. It covers data collection, data processing, quality control, and the critical interpretation of results. A mix of interactive lectures and guided coding sessions will enable you to implement your first text-as-data research designs.

Topics covered:

• Motivations and applications for using text-as-data methods
• Data collection and data selection
• Feature selection
• Text representation
• Rule-based classification
• Supervised classification
• Unsupervised approaches
• Introduction to advanced methods (e.g., transformers, embeddings, multilingual text analysis)

All topics are introduced in a lecture-style format and then illustrated with practical examples. The lecture part consists of input by the instructor (i.e., covering the basics of each topic, highlighting the latest methodological research, and introducing resources) and shorter interactive parts (i.e., reflection on and discussion of different methods in plenary and in small groups). The practical part consists of guided coding sessions, in which we work through prepared code together.

By the end of this course participants will be able to:

• make an informed decision about a suitable method for a given application scenario.
• practically apply various basic text-as-data methods.
• critically evaluate results.

Assessment and permitted materials

Throughout the semester, participants will complete two coding challenges designed to reinforce the topics covered in class and practiced during the guided coding sessions.

In addition to the coding challenges, the course evaluation is based on a group project presentation and an accompanying written report. In this project, participants apply the techniques learned in class to a secondary analysis of a provided text dataset. Detailed instructions and supplementary information will be shared during class sessions.

Course evaluation breakdown:

1. Two coding challenges (40%)
2. Group presentation of the data analysis project (30%)
3. Written report on the data analysis project (30%)

Minimum requirements and assessment criteria

• Engage actively in class discussions and activities.
• Maintain a minimum attendance of 75% throughout the course.
• Working knowledge of the programming language R and of RStudio is a prerequisite for this course. Make sure you have at least basic practical experience with both.
• Dedicate time to independent study and exercises to successfully achieve the course objectives.
• Bring your own laptop to every class session, and make sure that both R and RStudio are installed before the first class. This is mandatory for participation in the course.

Examination topics

Examination topics consist of the content of the learning units.

Required knowledge and practical skills will be conveyed during the lectures.

The slides used during the lectures will be shared on Moodle.

Additional readings will also be shared on Moodle.

Reading list

Resources to refresh your R skills before the course:

• R for Data Science: https://r4ds.hadley.nz/
• Tidy Modeling with R: https://www.tmwr.org/

Introductory text to read before the first class:

• Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267-297.

Last modified: Mon 05.02.2024 10:46