Coordinator: Prof. Antonio Vicino



Data Mining: Practical Machine Learning Tools And Techniques


Ian H. Witten
University of Waikato (New Zealand)
Course Type
Type B
July 2008
Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. The machine learning algorithms used for data mining sift through databases automatically, seeking regularities or patterns. Strong patterns, if found, will likely generalize to make accurate predictions on future data.
This introductory course will describe the most common styles of machine learning algorithms used for data mining. It will cover methods of inferring rules and decision trees, statistical modeling, association rules, linear models, instance-based learning, and clustering.
Of particular interest is the way in which machine learning algorithms are evaluated, and we will describe the methodology of training and testing, predicting performance, cross-validation, and other methods of estimating error rates.
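As a rough illustration of the evaluation methodology mentioned above, the following sketch estimates an error rate by k-fold cross-validation. It is not taken from the course materials: the 1-nearest-neighbour classifier, the synthetic two-class dataset, and all function names (`nearest_neighbour_predict`, `cross_validation_error`, `make_toy_data`) are assumptions chosen for illustration, and the folds are plain (unstratified) random splits.

```python
import random


def nearest_neighbour_predict(train, x):
    """Return the label of the training instance closest to x (1-NN)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(train, key=lambda inst: sq_dist(inst[0], x))
    return label


def cross_validation_error(data, k=10, seed=0):
    """Estimate the error rate of 1-NN by k-fold cross-validation.

    data is a list of (feature_tuple, label) pairs. Each instance is
    used for testing exactly once and for training k - 1 times.
    """
    data = list(data)
    random.Random(seed).shuffle(data)
    # Split the shuffled data into k folds of near-equal size.
    folds = [data[i::k] for i in range(k)]
    errors = 0
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [inst for j, fold in enumerate(folds) if j != i for inst in fold]
        errors += sum(nearest_neighbour_predict(train, x) != y for x, y in test)
    return errors / len(data)


def make_toy_data(n_per_class=50, seed=1):
    """Hypothetical dataset: two Gaussian clusters in the plane."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("a", (0.0, 0.0)), ("b", (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    return data


if __name__ == "__main__":
    data = make_toy_data()
    print(f"10-fold CV error estimate: {cross_validation_error(data):.3f}")
```

Because every instance is tested by a model that never saw it during training, the averaged error is a less optimistic estimate than the error measured on the training data itself. Setting `k` to the number of instances gives leave-one-out cross-validation.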
The course will have a strong practical component, based on the open source Weka machine learning workbench. This is an extensive collection of state-of-the-art machine learning algorithms and data preprocessing tools presented within a uniform interactive interface. Students will learn how to apply the algorithms in Weka to a wide variety of datasets, and interpret the results.
Data Mining Tutorial Video - Part 1
Data Mining Tutorial Video - Part 2
Data Mining Slides
Extra slides for day 2
Authorship verification paper
WEKA tutorial exercises
Tutorial 1: Introduction to the WEKA Explorer
Tutorial 2: Nearest neighbour learning and decision trees
Tutorial 3: Naive Bayes and support vector machines
Tutorial 4: Preprocessing
Tutorial 5: Text mining
Tutorial 6: Association rules
User classifier competition




Dip. Ingegneria dell'Informazione e Scienze Matematiche - Via Roma, 56 53100 SIENA - Italy