SGN-24007 Advanced Audio Processing, 5 cr
Additional information
Suitable for postgraduate studies.
Person responsible
Tuomas Virtanen
Lessons
Implementation | Period | Person responsible | Requirements |
SGN-24007 2017-01 | 4 | Sharath Adavanne, Tuomas Virtanen | |
Learning Outcomes
This is an advanced course on audio and speech signal processing, focusing on algorithms that can be used to automatically analyze and classify audio signals and to apply advanced processing to them. After completing this course, the student:
- Can implement an audio classification system in a common programming language.
- Knows the most commonly used acoustic features in audio classification, understands what information they represent, and is able to select suitable acoustic features for specific audio analysis tasks.
- Knows the most commonly used classifiers suitable for audio classification, understands how they work, and is able to select a suitable classifier for specific audio analysis tasks.
- Understands the effect of training data and external effects (channel, noise, reverberation) on audio classification systems.
- Understands how speech recognition is formulated as a pattern classification problem.
- Can list the components of a speech recognition system and understands the effect of each of them on recognition performance.
- Can identify applications where source separation is used or could be used, understands the basic techniques used in source separation, and is able to implement a source separation algorithm.
- Understands what kind of processing is enabled by a microphone array, and can implement a beamformer and a sound source localization algorithm.
Content
Content | Core content | Complementary knowledge | Specialist knowledge |
1. | Acoustic feature extraction and audio classification. Automatic speech recognition. Use of temporal information in classification: hidden Markov models, recurrent neural networks, connectionist temporal classification, convolutional neural networks. | ||
2. | Source separation (single-channel and multichannel). Time-frequency masking. Source separation techniques based on deep neural networks and on spectrogram factorization. | ||
3. | Microphone array signal processing: beamforming, source localization and tracking. | ||
Instructions for students on how to achieve the learning outcomes
The course grade is based on the exam. The highest grade is given for correct answers that cover the depth and breadth of the material presented in the lectures and exercises. The threshold for passing the course is about half of the maximum number of points. Bonus points worth at most one grade are awarded for active participation in the weekly exercises. In addition, an acceptable project work must be submitted by the deadline, and 30% of the weekly exercises must be completed.
Assessment scale:
Numerical evaluation scale (0-5)
Prerequisites
Course | Mandatory/Advisable | Description |
SGN-13006 Introduction to Pattern Recognition and Machine Learning | Mandatory | 1 |
SGN-14007 Introduction to Audio Processing | Mandatory | 2 |
1. Either SGN-13000 or SGN-13006.
2. Either SGN-14006 or SGN-14007.
Correspondence of content
Course | Corresponding course | Description |
SGN-24007 Advanced Audio Processing, 5 cr | SGN-24006 Analysis of Audio, Speech and Music Signals, 5 cr | |