340212 VU Speech Technologies (2024S)
Continuous assessment of course work
Registration/Deregistration
Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).
- Registration is open from Mo 12.02.2024 09:00 to Fr 23.02.2024 17:00
- Registration is open from Mo 11.03.2024 09:00 to Fr 15.03.2024 17:00
- Deregistration possible until Su 31.03.2024 23:59
Details
max. 40 participants
Language: English
Classes
- Thursday 14.03. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 21.03. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 11.04. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 18.04. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 25.04. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 02.05. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 16.05. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 23.05. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 06.06. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 13.06. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 20.06. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
Information
Aims, contents and method of the course
Assessment and permitted materials
Exercise 1 (25.4.): Written test with questions from lectures 1-3 (no aids allowed).
Exercise 2 (13.6.): Written test with questions from lectures 4-7 (no aids allowed).
Programming exercise (handout on 25.4., hand-in on 20.6.): In a group of 3-4 students, develop an accent recognition system that recognizes the spoken accent from a speech signal, and present the results.
Minimum requirements and assessment criteria
You have to achieve 50% of the total points for a positive grade.
The grade depends on the points for the two exercises (30% each) and on the programming exercise (40%).
Attendance is compulsory; at most 2 lecture units may be missed.
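The weighting above can be illustrated with a short sketch. It assumes that each component is scored as a percentage (0-100) of its own maximum and that no separate minimum applies per component; the course page states neither, and all names in the code are illustrative.

```python
# Weights (in percent) taken from the assessment criteria above.
# All identifiers are illustrative, not part of any official grading tool.
WEIGHTS = {"exercise_1": 30, "exercise_2": 30, "programming": 40}
PASS_THRESHOLD = 50      # percent of the total points needed for a positive grade
MAX_MISSED_UNITS = 2     # at most 2 lecture units may be missed


def total_percent(scores):
    """Weighted total in percent; `scores` maps each component to 0-100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def is_positive(scores, missed_units=0):
    """True if the weighted total reaches 50% and attendance is sufficient."""
    return (total_percent(scores) >= PASS_THRESHOLD
            and missed_units <= MAX_MISSED_UNITS)


example = {"exercise_1": 40, "exercise_2": 55, "programming": 70}
print(total_percent(example))                # 56.5
print(is_positive(example, missed_units=1))  # True
```

In this example a weak first test (40%) is offset by the programming exercise, because only the weighted total is compared against the 50% threshold.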
Examination topics
Exercise 1 (25.4.): Written test with questions from lectures 1-3 (no aids allowed).
Exercise 2 (13.6.): Written test with questions from lectures 4-7 (no aids allowed).
Reading list
D. Jurafsky, J. H. Martin, Speech and Language Processing, https://web.stanford.edu/~jurafsky/slp3/
I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016.
B. Pfister, T. Kaufmann, Sprachverarbeitung, Springer, 2008.
J. H. McClellan, R. W. Schafer, M. A. Yoder, DSP first: A multimedia approach, Prentice Hall, 1998.
R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification, 2000.
Last modified: Th 25.04.2024 14:46
Lecture 1
1. Introduction
2. Phonetics
11.4.: Lecture 2
3. Signal processing and classical vocoder
4. Minimum Edit Distance (MED) and Dynamic Time Warping (DTW)
18.4.: Lecture 3
5. Hidden Markov models (HMM)
6. N-gram language models
25.4.: Exercise 1
2.5.: Lecture 4
7. Vector semantics and embeddings
8. Feed-forward neural networks (NN)
16.5.: Lecture 5
9. Convolutional NN, RNN and LSTM
10. Transformer
23.5.: Lecture 6
11. Speech synthesis: DNN-based vocoders
12. Speech synthesis: DNN-based acoustic models
6.6.: Lecture 7
13. Speech recognition: DNN-based acoustic models
14. Speech recognition: DNN-based language models
13.6.: Exercise 2
20.6.: Programming exercise
Methodology:
Theoretical presentation of the basics of the field of language technology.
Development and implementation of a practical application to a current task in the field of the course.
Independent solving of exercises.