Study Guide 2015-2016

PLA-43106 Data Mining, 5 cr

Person responsible

Teemu Kumpumäki, Pekka Ruusuvuori

Lessons

Implementation 1: PLA-43106 2015-01

Study type P1 P2 P3 P4 P5
Lectures
Exercises
Assignment



 



 
 2 h/week
 2 h/week

+2 h/week
+2 h/week
 2 h/week

Requirements

Active participation and successful completion of the exercise work.
Completed parts must belong to the same implementation.

Learning Outcomes

The course gives an introduction to data mining and analysis of large datasets. For example, networks and databases involve massive amounts of data, and mining of useful information from data is an increasingly common challenge. By taking the course, the student learns the basic principles and terminology of data mining, knows the commonly used algorithms, and recognizes the typical challenges of processing large datasets. The course will introduce several application areas of data mining, including the principles of web search engines, recommendation systems, and web advertising.

Content

Content | Core content | Complementary knowledge | Specialist knowledge
1. | The concept and terminology of data mining. | Knowledge of the basic methods and algorithms. | Knowledge of the limitations of data mining.
2. | Understanding the special principles of processing large, non-structured datasets. | Special challenges of processing large datasets: memory usage and data formats. | MapReduce algorithm.
3. | Basic principles of web search engines. | |

Study material

Type | Name | Author | ISBN | URL | Additional information | Examination material
Book | Mining of Massive Datasets | A. Rajaraman, J. Leskovec, J.D. Ullman | | | | Yes
Lecture slides | | P. Ruusuvuori | | | | Yes

Correspondence of content

There is no equivalence with any other courses.

Last modified 27.04.2015