Data Mining and Knowledge Discovery

Subject description

  • Introduction: introduction to data mining and knowledge discovery in databases, their relation to machine learning, visualization of data and models, presentation of the CRISP-DM knowledge discovery methodology.

  • Data mining techniques: decision tree learning, learning classification and association rules, clustering, subgroup discovery, regression tree learning and relational data mining.

  • Evaluation: presentation of search heuristics, heuristics for estimating the quality of induced patterns and models, and methodology for results evaluation.

  • Practical training: practical use of selected data mining tools.

Objectives and competences

Knowledge discovery in databases is the process of discovering patterns and models described by rules or other human-understandable representation formalisms. The most important step in this process is data mining, which uses methods, techniques and tools for the automated construction of patterns and models from data.

The course objectives are to (a) introduce the basics of data mining, the process of knowledge discovery in databases and the CRISP-DM methodology, (b) present selected data mining methods and techniques, and (c) present the methodology for result evaluation.

The students will master the basics of data preprocessing, data mining and knowledge discovery and will be capable of using selected data mining tools and results evaluation methods in practice.

Teaching and learning methods

Lectures, consultations, individual work.

Students need access to computers and data mining tools. The data mining tools WEKA and Orange4WS are planned for use.

Expected study results

Knowledge and understanding:

Mastery of selected data mining methods and techniques, the ability to preprocess data and to apply selected data mining techniques in practice, and the ability to use and interpret methods for result evaluation.

Basic sources and literature

I. Kononenko: Strojno učenje. FRI, Ljubljana, 1997.

I. H. Witten, E. Frank: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 2000.

S. Džeroski, N. Lavrač (eds.): Relational Data Mining. Springer, 2001.

T. Mitchell: Machine Learning. McGraw Hill, 1997.

M. Berthold, D.J. Hand (eds.): Intelligent Data Analysis: An Introduction. Springer, Berlin-Heidelberg, 1999.

D. Mladenić, N. Lavrač, M. Bohanec, S. Moyle (eds.): Data Mining and Decision Support: Integration and Collaboration. Kluwer, 2003.

University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, 1000 Ljubljana

T: 01 4768 411