136041 SE Topics in Deep Learning and Natural Language Processing (2023S)

6.00 ECTS (2.00 SWS), SPL 13 - Europäische und Vergleichende Sprach- und Literaturwissenschaft

Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no "first come, first served").

Registration from Mo 06.02.2023 08:00 to Mo 27.02.2023 08:00
Deregistration until Fr 31.03.2023 23:59

Details

max. 25 participants

Language: English

Lecturers

Classes

Thursday 02.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 09.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 16.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 23.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 30.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 20.04. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 27.04. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 04.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 11.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 25.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 01.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 15.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 22.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
Thursday 29.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5

Information

Aims, contents and method of the course

In this seminar, participants will read, present and discuss recent papers on deep learning for natural language processing.

Possible topics to be covered in the seminar:
- Word vectors
- Classical Neural Network Layers
- Attention
- Pre-Trained language models
- Data Sets and Evaluation
- Probing of pre-trained language models
- Model explainability
- Language bias and learned representations
- Relation Extraction and Distant supervision
- Weak Supervision
- Ethical and security aspects of Language Models

Assessment and permitted materials

=== the information below is preliminary, to be finalized soon ===

Each participant will present one topic from the list in the seminar. The presentation should last roughly 25 minutes (hard limits: min. 20 minutes, max. 30 minutes) and is followed by a Q&A session and discussion. Participants will also have to submit a written report (deadline and exact requirements TBD) that describes the main contents of the presented paper and puts it in a wider context.

Please send an email to erion.cano@univie.ac.at with a selection of *5 topics* from the list below, and indicate your *study program* (Computer Science, Digital Humanities, ...). You will be assigned one topic from your selection (for your presentation and report). For two additional topics (also from your selection, but presented by somebody else), you will have to prepare some questions that can get a discussion started.

Please send your email by Wednesday, March 8.

Minimum requirements and assessment criteria

=== the information below is preliminary, to be finalized soon ===

Your presentation will account for 45% of the grade, participation in discussions for 10%, and the written report for 45%.

Reading list

[A] Word vectors:

[A.1] Mikolov et al. "Distributed representations of words and phrases and their compositionality."
[A.2] Bojanowski et al. "Enriching word vectors with subword information."

[B] Attention:

[B.1] Hermann et al. "Teaching machines to read and comprehend."
[B.2] Bahdanau et al. "Neural machine translation by jointly learning to align and translate."
[B.3] Vaswani et al. "Attention is all you need."

[C] Pre-Trained language models:

[C.1] Devlin et al. "BERT: Pre-training of deep bidirectional transformers for language understanding."
[C.2] Brown et al. "Language Models are Few-Shot Learners."
[C.3] Clark et al. "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators."
[C.4] Ouyang et al. "Training language models to follow instructions with human feedback."

[D] Data Sets and Evaluation:

[D.1] Kwiatkowski et al. "Natural questions: a benchmark for question answering research."
[D.2] Wang et al. "GLUE: A multi-task benchmark and analysis platform for natural language understanding."

[E] Probing of pre-trained language models, explainability:

[E.1] Jawahar et al. "What does BERT learn about the structure of language?"
[E.2] Petroni et al. "Language models as knowledge bases?"
[E.3] Tenney et al. "BERT Rediscovers the Classical NLP Pipeline."
[E.4] Ribeiro et al. "'Why should I trust you?': Explaining the predictions of any classifier."
[E.5] Ribeiro et al. "Beyond Accuracy: Behavioral Testing of NLP Models with CheckList."
[E.6] Bender and Koller. "Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data."

[F] Language bias and learned representations:

[F.1] Bolukbasi et al. "Man is to computer programmer as woman is to homemaker? Debiasing word embeddings."
[F.2] Garg et al. "Word embeddings quantify 100 years of gender and ethnic stereotypes."
[F.3] Sap et al. "Social bias frames: Reasoning about social and power implications of language."
[F.4] Zhao et al. "Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods."

[G] Relation Extraction and Distant supervision:

[G.1] Riedel et al. "Relation extraction with matrix factorization and universal schemas."
[G.2] Verga et al. "Multilingual relation extraction using compositional universal schema."

[H] Weak Supervision:

[H.1] Karamanolakis et al. "Self-Training with Weak Supervision."
[H.2] Fu et al. "Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods."
[H.3] Zhang et al. "WRENCH: A Comprehensive Benchmark for Weak Supervision."
[H.4] Li et al. "Prefix-Tuning: Optimizing Continuous Prompts for Generation."
[H.5] Stephan et al. "SepLL: Separating Latent Class Labels from Weak Supervision Noise."

[I] Ethical and security aspects of Language Models:

[I.1] Weidinger et al. "Taxonomy of Risks posed by Language Models."
[I.2] Marcus and Davis. "GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about."
[I.3] Luz de Araujo and Roth. "Checking HateCheck: a cross-functional analysis of behaviour-aware learning for hate speech detection."

Association in the course directory

S-DH Cluster I: Language and Literature

Last modified: Th 04.07.2024 00:13