Course Catalog 2007-2008

SGN-5306 KNOWLEDGE MINING, 3 cr
Knowledge Mining

Persons responsible for the course
Ari Visa

Lecturers
Ari Visa

Objectives
The course equips the student with a sound understanding of data mining methods and principles and teaches methods for knowledge discovery in large corporate databases.

Content
1. Concept Description
   Core content: Data Preprocessing; Data Generalization; Summarization-Based Characterization; Analysis of Attribute Relevance
2. Mining Association Rules
   Core content: Mining Single-Dimensional Boolean Association Rules, Multilevel Association Rules, and Multidimensional Association Rules; Correlation Analysis
3. Descriptive Models
   Core content: Cluster Analysis; Describing Data by Probability Distributions and Densities
   Complementary knowledge: Parametric models; Nonparametric models
4. Predictive Models
   Core content: Regression models; Stochastic models; Predictive models for classification; Models for structured data

Requirements for completing the course
Assignment and final examination.

Evaluation criteria for the course

  • The grade is based on the final examination and an exercise work. The exercise work is graded pass/fail.

  • The assessment scale is numeric (1-5).

  • Study material
    - Book: "Data Mining: Concepts and Techniques", Jiawei Han & Micheline Kamber. Morgan Kaufmann Publishers, 2000. Exam material: yes. Language: English.
    - Book: "Principles of Data Mining", David J. Hand, Heikki Mannila and Padhraic Smyth. MIT Press, 2000. Exam material: yes. Language: English.

    Prerequisites
    Code      Course                          Credits  M/R
    OHJ-1100  Programming I                   4        Mandatory
    OHJ-1106  Programming I                   4        Mandatory
    OHJ-1150  Programming II                  5        Mandatory
    OHJ-1156  Programming II                  5        Mandatory
    SGN-1107  Introductory Signal Processing  4        Recommended
    SGN-1200  Signal Processing Methods       4        Recommended
    SGN-1250  Signal Processing Applications  4        Recommended

    Additional information about prerequisites
    Basic programming skills are required.

    Remarks

  • Partial completions of the course must belong to the same implementation round.

  • The course is suitable for postgraduate studies.

  • The course will not be lectured in the academic year 2007-2008.

  • Distance learning

  • ICT used during the course
    - In information distribution via homepage, newsgroups or mailing lists, e.g. current issues, timetables
    - In distributing and/or returning exercise work, material, etc.
    - In the visualization of objects and phenomena, e.g. animations, demonstrations, simulations, video clips

  • Estimated shares of the course implementation
    - Contact teaching: 30 %
    - Distance learning: 0 %
    - Student's independent study: 70 %

    Scaling
    Methods of instruction  Hours
    Lectures                36
    Assignments             23

    Other scaled            Hours
    Preparation for exam    20
    Exam/midterm exam       3
    Total sum               82

    Principles and starting points related to the instruction and learning of the course

  • Students are encouraged to ask questions both during and after the lectures.

  • Additional information related to course
    Lectures in English or in Finnish.

    Correspondence of content
    8004202 Data Mining

    Last modified 18.04.2007
    Modified by Sari Peltonen