Course unit, curriculum year 2024–2025
DATA.STAT.840

Statistical Methods for Text Data Analysis, 5 cr

Tampere University
Teaching periods
Active in period 1 (1.8.2024–20.10.2024)
Active in period 2 (21.10.2024–31.12.2024)
Course code
DATA.STAT.840
Language of instruction
English
Academic years
2024–2025, 2025–2026, 2026–2027
Level of study
Advanced studies
Grading scale
General scale, 0-5
Persons responsible
Responsible teacher:
Jaakko Peltonen
Responsible organisation
Faculty of Information Technology and Communication Sciences 100 %
Coordinating organisation
Computing Sciences Studies 100 %

This course teaches statistical methods for modeling and analysing text data. Planned contents include models for representing text, such as vector space models and neural embedding models; document content processing stages, such as lemmatization and keyphrase extraction; probabilistic models of content variation, including n-grams and topic models; neural models of text; and methods for various text analysis tasks.

Completion option 1
Both the exercise sets and the exam must be completed to pass the course.
Completion of all options is required.

Exam

1.12.2024–31.12.2024
Active in period 2 (21.10.2024–31.12.2024)

Participation in teaching

26.8.2024–8.12.2024
Active in period 1 (1.8.2024–20.10.2024)
Active in period 2 (21.10.2024–31.12.2024)