Universität Wien

200083 SE Advanced Seminar: Mind and Brain (2023S)

Introduction to Computational Social Sciences

4.00 ECTS (2.00 SWS), SPL 20 - Psychology
Course with continuous assessment

Advanced seminars can only be used for compulsory module B! They cannot be used for module A4 Free Electives.

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no "first come, first served").

Details

max. 20 participants
Language: English

Dates

Week 1 - Introduction to digital humanities. Brief introduction to computational social sciences; description of the general set of tools and datasets.
Week 2 - Bag of words approach. Refresher of Python syntax and objects; how to create a bag of words using dictionary tools like WordNet.
Week 3 - Preprocessing in NLP and frequency analysis. Why preprocessing is needed and its general steps; how to extract the most frequent words in a text after preprocessing.
Week 4 - Introduction to datasets with pandas. Iterate over cells and texts; generate word clouds using WordCloud and average frequency plots.
Week 5 - Sentiment analysis. Overview of techniques used to implement sentiment analysis; Implementation examples.
Week 6 - Introduction to Web scraping; How to read and extract information from a website; Using BeautifulSoup to read HTML and XML files.
Week 7 - Iteratively collect information from a website using loops; Implementation of the sentiment analysis pipeline to online information.
Week 8 - Web scraping using APIs. What APIs are; how to use APIs to extract information from the internet (e.g. IMDB, Twitter); applying the sentiment analysis pipeline to API information.
Week 9 - Culture as fossilized psychology. The challenge of analyzing historical texts. How to use psychometric tools to build valid bags-of-words; plot historical sentiment time series using Seaborn.
Week 10 - Handling socioeconomic data. Surveying the major datasets for contemporary and historical socioeconomic estimates; understanding the meaning of and differences between socioeconomic variables and how they are affected by historical events; extracting socioeconomic data from datasets and adding them to a dataset of textual word frequencies.
Week 11 - Testing hypotheses on the relationship between historic events, socioeconomics, and word-frequencies. Quasi-experimental methods, time series trends, linear mixed models; Linear mixed model trend comparison, model selection with stepAIC; Implementation with R.
Week 12 - Practical session - Students work on their projects in the classroom.
Week 13 - Project presentations and Q&A
Week 14 - Practical session - Students work on their projects in the classroom.

  • Friday 03.03. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 10.03. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 17.03. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 24.03. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 31.03. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 21.04. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 28.04. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 05.05. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 12.05. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 19.05. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 26.05. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 02.06. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 09.06. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 16.06. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 23.06. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606
  • Friday 30.06. 09:45 - 11:15 Lecture Hall A Psychology, NIG 6th floor, A0606

Information

Aims, contents and method of the course

In this course, students will learn how to use natural language processing tools to automatically analyze the content of large numbers of texts. After this course, students will be able to perform sentiment analyses of modern textual sources from the internet as well as of historical sources such as works of fiction (movies and books) from the past. Students will be able to quantify how the values and preferences expressed in these textual sources change over time and how these changes relate to historical events and socio-economic dynamics. Basic knowledge of Python and R is highly recommended.

Contents

Week 1 - Introduction to digital humanities. Brief introduction to computational social sciences; description of the general set of tools and datasets.

Week 2 - Python refresher: variables, numbers, strings, lists, loops, modules, and functions.

Week 3 - Introduction to modules. Examples with NLTK, sentiment analysis (VADER), Matplotlib, fixing contractions and inflections, and regular expressions.

Week 4 - Introduction to bag of words analysis. Using WordNet to find synonyms and hyponyms. Preprocessing text.
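
As a rough illustration of the bag-of-words idea in Week 4, the sketch below measures how often words from a small hand-made word list occur in a text after basic preprocessing. It uses only the Python standard library; the stop-word set, the word list, and the function names are hypothetical stand-ins chosen for illustration, not the course's NLTK/WordNet scripts.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for illustration).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "was", "in"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def bag_frequency(text, bag):
    """Return the share of preprocessed tokens that belong to the bag."""
    tokens = preprocess(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[word] for word in bag)
    return hits / len(tokens)

# Hypothetical bag of words for "affection" (not a validated instrument).
AFFECTION_BAG = {"love", "dear", "tender"}

sample = "The dear friend spoke of love and of the tender evening."
print(Counter(preprocess(sample)).most_common(3))
print(bag_frequency(sample, AFFECTION_BAG))
```

Dividing by the number of remaining tokens makes texts of different lengths comparable, which matters when the same bag is applied to many documents.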

Week 5 - Introduction to pandas; creating datasets and word clouds.

Week 6 - Sentiment analysis. Overview of techniques used to implement sentiment analysis; Implementation examples.
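
One family of techniques covered under sentiment analysis is lexicon-based scoring. The following is a minimal sketch of that idea, assuming a toy lexicon and a very simple negation rule; the word scores, the negation list, and the function name are invented for illustration, and tools such as VADER apply considerably richer rules.

```python
import re

# Hypothetical toy lexicon: word -> valence score (assumption for illustration).
LEXICON = {
    "good": 1.0, "great": 2.0, "happy": 1.5,
    "bad": -1.0, "terrible": -2.0, "sad": -1.5,
}
NEGATIONS = {"not", "never", "no"}

def sentiment_score(text):
    """Average valence of lexicon words, flipping the sign after a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = []
    for i, token in enumerate(tokens):
        if token in LEXICON:
            value = LEXICON[token]
            # Flip polarity if the previous token is a negation word.
            if i > 0 and tokens[i - 1] in NEGATIONS:
                value = -value
            scores.append(value)
    # Texts without any lexicon word are treated as neutral.
    return sum(scores) / len(scores) if scores else 0.0

print(sentiment_score("The film was great and the ending was happy."))
print(sentiment_score("The plot was not good."))
```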

Week 7 - Introduction to Web scraping; How to read and extract information from a website; Using BeautifulSoup to read HTML and XML files.
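
To give a flavour of what extracting information from HTML involves, here is a small sketch that collects the text of all paragraph elements from a page. It is an assumption-laden stand-in: it uses the standard library's `html.parser` instead of BeautifulSoup, and the page content is a made-up string inlined so that no network access is needed.

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect the text content of every <p> element."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> still belongs to the paragraph.
        if self.in_paragraph:
            self.paragraphs[-1] += data

# Hypothetical page content (assumption for illustration).
PAGE = """
<html><body>
  <h1>Reviews</h1>
  <p>A <b>great</b> film.</p>
  <p>The ending was sad.</p>
</body></html>
"""

parser = ParagraphExtractor()
parser.feed(PAGE)
parser.close()
print(parser.paragraphs)
```

The extracted paragraphs could then be passed to whatever text analysis step follows, for example a sentiment scorer.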

Week 8 - Web scraping using Selenium; how to interact with websites using Python; applying the sentiment analysis pipeline to online information.

Week 9 - Web scraping using APIs. What APIs are; how to use APIs to extract information from the internet (e.g. IMDB, Twitter); applying the sentiment analysis pipeline to API information.

Week 10 - Culture as fossilized psychology. How to use psychometric tools to build valid bags-of-words; plot historical sentiment time series using Seaborn.

Week 11 - Testing hypotheses on the relationship between historic events, socioeconomics, and word frequencies. Cross-correlations and linear mixed models.
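
The cross-correlation idea mentioned for Week 11 can be sketched as a Pearson correlation between one series and a time-shifted copy of another. The example below is only an illustration under stated assumptions: the two yearly series are made-up numbers, the function names are hypothetical, and only non-negative lags are handled. It does not cover linear mixed models.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cross_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] for a non-negative lag."""
    if lag < 0 or lag >= len(x) - 1:
        raise ValueError("lag must satisfy 0 <= lag < len(x) - 1")
    if lag == 0:
        return pearson(x, y)
    # Drop the last `lag` points of x and the first `lag` points of y.
    return pearson(x[:-lag], y[lag:])

# Made-up yearly values: word_freq repeats econ_index one step later.
econ_index = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0]
word_freq = [0.0, 1.0, 2.0, 4.0, 3.0, 5.0]

for lag in range(3):
    print(lag, round(cross_correlation(econ_index, word_freq, lag), 3))
```

A peak at a positive lag would suggest that changes in the first series tend to precede changes in the second, although correlation alone does not establish a causal link.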

Week 12 - Practical session: testing the historical psychology procedures. Q&A about the scripts and about the final project (assignment 4).

Week 13 - Practical session: Students will brainstorm together about projects and their implementation. The final projects are individual, but the discussion will be done in groups of two.

Week 14 - Project proposal presentations and oral examination (study materials are the slides and scripts; only the theoretical content is examined, not the implementation).

Assessment and permitted materials

20% Class Attendance and Participation
40% Final Project Oral Presentation (Intro and Methods) and Q&A (Individual)
40% Writing up a Project in Article Style (Introduction, Methods, Results and Discussion) (Individual)

Minimum requirements and assessment criteria

Students must at least demonstrate conceptual knowledge and understanding of the tools and processes used in computational social sciences. They must be able to conceptualize a project using these tools and present this conceptualization in an oral presentation. Grades improve if students are also able to implement these projects using provided (or custom) scripts in the Python and R programming languages, or other computational social science tools.

Examination topics

Learning Objectives:
1. Automatically scrape large numbers of texts from the internet.
2. Use natural language processing tools to perform automatic text analysis, including sentiment analysis.
3. Quantify the change of values and preferences across time and its relationship with historical events and socio-economic conditions, using basic econometric analysis tools.

Learning activities
1. Lectures: students engage in discussions about the readings and the theory provided by the professors.
2. Practical work: students will aim to design experiments individually (this will also include training students in R and Python).
3. Student Activities (Individual/group): students develop their final projects and work on their assignment.

Reading list

Harvard Python course:
- https://www.youtube.com/watch?v=nLRL_NcnK-4&t=7492s

Computational social science courses:
- https://ayoubbagheri.nl/applied_tm/
- https://github.com/JanaLasser/SICSS-aachen-graz
- http://digitalmedia.andreasjungherr.de/docs/intro.html
- Engel, U., Quan-Haase, A., Liu, S., & Lyberg, L. (Eds.). (2021). Handbook of Computational Social Science, Volume 2: Data Science, Statistical Modelling, and Machine Learning Methods (1st ed.). Routledge. https://doi.org/10.4324/9781003025245

Historical Psychology:
- Martins & Baumard (2022). How to develop reliable instruments to measure the cultural evolution of preferences and feelings in history? Frontiers in Psychology, 13. https://osf.io/acukm/

Last modified: Thu 24.08.2023 11:07