340212 VU Speech Technologies (2025S)
Continuous assessment of course work
Registration/Deregistration
Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).
- Registration is open from Mo 10.02.2025 09:00 to Fr 21.02.2025 17:00
- Registration is open from Mo 10.03.2025 09:00 to Fr 14.03.2025 17:00
- Deregistration possible until Fr 21.03.2025 23:59
Details
max. 40 participants
Language: English
Classes
The lecture starts on 13.3.
- Thursday 13.03. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 20.03. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 27.03. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 03.04. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 10.04. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 08.05. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 15.05. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 22.05. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 05.06. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 12.06. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
- Thursday 26.06. 16:45 - 19:00 Medienlabor II ZfT Gymnasiumstraße 50 4.OG
Information
Aims, contents and method of the course
Assessment and permitted materials
Exercise 1: Written test with questions from lectures 1-3 (no aids allowed).
Exercise 2: Written test with questions from lectures 4-7 (no aids allowed).
Programming exercise (handout on 25.4., hand-in on 20.6.): In a group of 3-4 students, develop an accent recognition system that recognizes the spoken accent from a speech signal, and present the results.
Minimum requirements and assessment criteria
You have to achieve 50% of the total points for a positive grade.
The grade depends on the points for the two exercises (30% each) and on the programming exercise (40%).
Attendance is compulsory; at most 2 lecture units may be missed.
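The weighting above can be sketched as a small calculation. This is only an illustration, not the official grading procedure: it assumes each component is scored as a percentage (0-100) of its own maximum and that there is no separate minimum per component, since the course description states none. The names `WEIGHTS`, `total_percent` and `is_positive` are hypothetical.

```python
# Weights in percent, taken from the assessment criteria:
# two written exercises at 30% each, the programming exercise at 40%.
WEIGHTS = {"exercise_1": 30, "exercise_2": 30, "programming": 40}

PASS_THRESHOLD = 50.0   # percent of the total points needed for a positive grade
MAX_MISSED_UNITS = 2    # at most 2 lecture units may be missed


def total_percent(scores):
    """Weighted total in percent.

    Assumption: every entry in `scores` is a percentage (0-100)
    of the points achievable in that component.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def is_positive(scores, missed_units=0):
    """True if the attendance rule and the 50% threshold are both met."""
    return (missed_units <= MAX_MISSED_UNITS
            and total_percent(scores) >= PASS_THRESHOLD)


example = {"exercise_1": 40, "exercise_2": 50, "programming": 60}
print(total_percent(example))   # (30*40 + 30*50 + 40*60) / 100 = 51.0
print(is_positive(example, missed_units=1))
```

In this example a weak first test (40%) is offset by the programming exercise, so the weighted total of 51% is still just enough for a positive grade.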
Examination topics
Exercise 1: Written test with questions from lectures 1-3 (no aids allowed).
Exercise 2: Written test with questions from lectures 4-7 (no aids allowed).
Reading list
D. Jurafsky, J. H. Martin, Speech and Language Processing, https://web.stanford.edu/~jurafsky/slp3/
I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016.
B. Pfister, T. Kaufmann, Sprachverarbeitung, Springer, 2008.
J. H. McClellan, R. W. Schafer, M. A. Yoder, DSP First: A Multimedia Approach, Prentice Hall, 1998.
R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification, 2000.
Last modified: Fr 07.02.2025 10:26
Lecture 1
1. Introduction
2. Phonetics
Lecture 2
3. Signal processing and classical vocoder
4. Minimum Edit Distance (MED) and Dynamic Time Warping (DTW)
Lecture 3
5. Hidden Markov models (HMM)
6. N-gram language models
Exercise 1
Lecture 4
7. Vector semantics and embeddings
8. Feed-forward Neural Networks (NN)
Lecture 5
9. Convolutional NN, RNN and LSTM
10. Transformer
Lecture 6
11. Speech synthesis: DNN based vocoders
12. Speech synthesis: DNN based acoustic models
Lecture 7
13. Speech recognition: DNN based acoustic models
14. Speech recognition: DNN based language models
Exercise 2
Programming exercise
Methodology:
Theoretical presentation of the basics of the field of language technology.
Development and implementation of a practical application to a current task in the field of the course.
Independent solving of exercises