Course unit, curriculum year 2023–2024
DATA.STAT.840

Statistical Methods for Text Data Analysis, 5 cr

Tampere University
Teaching periods
Active in period 1 (1.8.2023–22.10.2023)
Active in period 2 (23.10.2023–31.12.2023)
Course code
DATA.STAT.840
Language of instruction
English
Academic years
2021–2022, 2022–2023, 2023–2024
Level of study
Advanced studies
Grading scale
General scale, 0-5
Persons responsible
Responsible teacher:
Jaakko Peltonen
Responsible organisation
Faculty of Information Technology and Communication Sciences 100 %
Coordinating organisation
Computing Sciences Studies 100 %

This course teaches various statistical methods for modeling and analysing text data. Planned contents include models for representing text, including vector space models and neural embedding models; document content processing stages, such as lemmatization and keyphrase extraction; probabilistic models of content variation, including n-grams and topic models; and methods for various text analysis tasks. The course is in development, and detailed contents will be updated.

Completion option 1
The exercise sets and the exam must both be completed to pass the course.
Completion of all options is required.

Exam

01.12.2023–31.12.2023
Active in period 2 (23.10.2023–31.12.2023)

Participation in teaching

29.08.2023–12.12.2023
Active in period 1 (1.8.2023–22.10.2023)
Active in period 2 (23.10.2023–31.12.2023)