SGN-5306 KNOWLEDGE MINING, 3 cr
Persons responsible for the course
Ari Visa
Lecturers
Ari Visa
Objectives
The course equips the student with a sound understanding of data mining methods and principles and teaches methods for knowledge discovery in large corporate databases.
Content
Content | Core content | Complementary knowledge | Specialist knowledge |
1. | Concept Description | Data Preprocessing; Data Generalization; Summarization-Based Characterization; Analysis of Attribute Relevance | |
2. | Mining Association Rules | Mining Single-Dimensional Boolean Association Rules, Multilevel Association Rules, and Multidimensional Association Rules; Correlation Analysis | |
3. | Descriptive Models | Cluster Analysis; Describing Data by Probability Distributions and Densities | Parametric models; Nonparametric models |
4. | Predictive Models | Regression models; Stochastic models; Predictive models for classification; Models for structured data | |
Requirements for completing the course
Assignment and final examination.
Evaluation criteria for the course
Study material
Type | Name | Author | ISBN | URL | Edition, availability... | Exam material | Language |
Book | "Data Mining: Concepts and Techniques" | Jiawei Han & Micheline Kamber | | | Morgan Kaufmann Publishers, 2000 | Yes | English |
Book | "Principles of Data Mining" | David J. Hand, Heikki Mannila and Padhraic Smyth | | | MIT Press, 2000 | Yes | English |
Prerequisites
Code | Course | Credits | M/R |
OHJ-1100 | OHJ-1100 Programming I | 4 | Mandatory |
OHJ-1106 | OHJ-1106 Programming I | 4 | Mandatory |
OHJ-1150 | OHJ-1150 Programming II | 5 | Mandatory |
OHJ-1156 | OHJ-1156 Programming II | 5 | Mandatory |
SGN-1107 | SGN-1107 Introductory Signal Processing | 4 | Recommended |
SGN-1200 | SGN-1200 Signal Processing Methods | 4 | Recommended |
SGN-1250 | SGN-1250 Signal Processing Applications | 4 | Recommended |
Prerequisite relations (TUT Intranet login required)
Additional information about prerequisites
Basic programming skills are required.
Remarks
Distance learning
- In information distribution via homepage, newsgroups or mailing lists, e.g. current issues, timetables
- In distributing and/or returning exercise work, material, etc.
- In the visualization of objects and phenomena, e.g. animations, demonstrations, simulations, video clips
- Contact teaching: 30 %
- Distance learning: 0 %
- Proportion of a student's independent study: 70 %
Scaling
Methods of instruction | Hours |
Lectures | 36 |
Assignments | 23 |
Other scaled | Hours |
Preparation for exam | 20 |
Exam/midterm exam | 3 |
Total sum | 82 |
Principles and starting points related to the instruction and learning of the course
Additional information related to course
Lectures in English or in Finnish.
Correspondence of content
8004202 Data Mining
Last modified | 18.04.2007 |
Modified by | Sari Peltonen |