Course unit, curriculum year 2024–2025
DATA.STAT.840

Statistical Methods for Text Data Analysis, 5 cr

Tampere University
Teaching periods
Active in period 1 (1.8.2024–20.10.2024)
Active in period 2 (21.10.2024–31.12.2024)
Course code
DATA.STAT.840
Language of instruction
English
Academic years
2024–2025, 2025–2026, 2026–2027
Level of study
Advanced studies
Grading scale
General scale, 0–5
Persons responsible
Responsible teacher:
Jaakko Peltonen
Responsible organisation
Faculty of Information Technology and Communication Sciences 100 %
Coordinating organisation
Computing Sciences Studies 100 %

This course teaches statistical methods for modeling and analysing text data. Planned contents include models for representing text, such as vector space models and neural embedding models; document content processing stages, such as lemmatization and keyphrase extraction; probabilistic models of content variation, such as n-grams and topic models; neural models of text; and methods for various text analysis tasks.

Completion option 1
Both the exercise sets and the exam must be completed to pass the course.
Completion of all options is required.

Exam

01.12.2024–31.12.2024
Active in period 2 (21.10.2024–31.12.2024)

Participation in teaching

26.08.2024–08.12.2024
Active in period 1 (1.8.2024–20.10.2024)
Active in period 2 (21.10.2024–31.12.2024)