Tampere University of Technology - Course Catalog 2013-2014

SGN-24006 Analysis of Audio, Speech and Music Signals, 5 cr

Person responsible

Tuomas Virtanen, Serkan Kiranyaz, Anssi Klapuri

Lessons

Study type	P1	P2	P3	P4	Summer	Implementations	Lecture times and places
Lectures				4 h/week		SGN-24006 2013-01	Tuesday 14 - 16, TB219; Friday 10 - 12, TB110
Exercises				2 h/week
Assignment				8 h/per
Requirements

Final exam and project work.

Learning Outcomes

After completing this course, the student will
- be able to implement common mid-level data representations used in the analysis of audio signals. He or she will understand how structural regularities of audio signals can be modeled to facilitate their analysis.
- be able to implement some widely used audio feature extraction techniques and signal analysis algorithms, such as spectrogram factorization and multi-pitch analysis.
- understand the basic techniques used in speech recognition. He or she will be able to implement the front end used for extracting relevant information from the speech signal, and will understand the mathematical principles and application of the hidden Markov models that are used to model the feature sequences.

Content

Content Core content Complementary knowledge Specialist knowledge

1. Mid-level representations of acoustic signals for their content analysis. Modelling of structural regularities of audio signals for analysis purposes.

2. Acoustic feature extraction and audio classification. Spectrogram factorization and other unsupervised learning techniques. Pitch analysis and music transcription.

3. Speech recognition: acoustic feature extraction and hidden Markov models.

Instructions for students on how to achieve the learning outcomes

The course is graded on the basis of the exam. The highest mark is given for correct answers that cover the depth and breadth of the material presented in the lectures and exercises. The threshold for passing the course is about half of the maximum number of points. Bonus points worth at most one mark are awarded for active participation in the weekly exercises. Acceptable project work must be submitted by the deadline.

Assessment scale:

A numerical evaluation scale (1-5) is used in the course.

Study material

Type	Name	Author	ISBN	URL	Edition, availability, ...	Examination material	Language
Book	Speech and Audio Signal Processing: Processing and Perception of Speech and Music	B. Gold, N. Morgan, D. Ellis				No	English
Book	Spoken Language Processing	X. Huang, A. Acero, H.-W. Hon				No	English
Lecture slides						Yes	English
Online book	Lecture Notes for Audio Engineering	University of Illinois Urbana-Champaign				No	English

Prerequisites

Course	Mandatory/Advisable	Description
SGN-13000 Introduction to Pattern Recognition and Machine Learning	Advisable	1
SGN-13006 Introduction to Pattern Recognition and Machine Learning	Advisable	1
SGN-14006 Audio and Speech Processing	Mandatory

1. Either SGN-13000 or SGN-13006 is advisable.

Correspondence of content

Course	Corresponding course	Description
SGN-24006 Analysis of Audio, Speech and Music Signals, 5 cr	SGN-4106 Speech Recognition, 5 cr + SGN-4227 Digital Audio Processing and Analysis, 6 cr

More precise information per implementation

Implementation Description Methods of instruction Implementation

SGN-24006 2013-01

Last modified 17.03.2014

