PLA-43106 Data Mining, 5 cr

Person responsible

Pekka Ruusuvuori, Teemu Kumpumäki

Teaching

Implementation: PLA-43106 2017-01
Period: 3 - 4
Person responsible: Teemu Kumpumäki, Pekka Ruusuvuori
Requirements: Active participation and successful completion of the exercise work.

Learning outcomes

The course gives an introduction to data mining and the analysis of large datasets. Networks and databases, for example, involve massive amounts of data, and extracting useful information from that data is an increasingly common challenge. After completing the course, the student knows the basic principles and terminology of data mining, knows the commonly used algorithms, and recognizes the typical challenges of processing large datasets. The course introduces several application areas of data mining, including the principles of web search engines, recommendation systems, and web advertising.

Content

1. Core content: The concept and terminology of data mining.
   Complementary knowledge: Knowledge of the basic methods and algorithms.
   Specialist knowledge: Knowledge of the limitations of data mining.
2. Core content: Understanding the special principles of processing large, unstructured datasets.
   Complementary knowledge: Special challenges of processing large datasets: memory usage and data formats.
   Specialist knowledge: The MapReduce algorithm.
3. Core content: Basic principles of web search engines.

Study material

Type: Book
Name: Mining of Massive Datasets
Author: A. Rajaraman, J. Leskovec, J.D. Ullman
Exam material: Yes

Type: Lecture slides
Author: P. Ruusuvuori
Exam material: Yes

Equivalences

The course does not correspond to any other course.

Updated by: Baggström Minna, 31.03.2017