CS2750  Machine Learning (ISSP 2170)


Time:  Monday, Wednesday 2:30-3:50pm
Location: Sennott Square, Room 5313


Instructor:  Milos Hauskrecht
Computer Science Department
5329 Sennott Square
phone: x4-8845
e-mail: milos@cs.pitt.edu
office hours: by appointment


TA:  Tomas Singliar
Computer Science Department
5802 Sennott Square
phone: x4-8832
e-mail: tomas@cs.pitt.edu
office hours: MW 10:00-11:30am


Announcements !!!!!

Short quiz:

Term projects
The project reports are due on April 21, 2004 at 12:30pm. The term project will be evaluated based on:

The final project report should be structured like a conference paper and should be self-explanatory. The key learning methods used in the project should be summarized within the report.

See examples of projects submitted by students in the past:






Abstract

The goal of the field of machine learning is to build computer systems that learn from experience and that are capable of adapting to their environments. Learning techniques and methods developed by researchers in this field have been successfully applied to a variety of learning tasks in a broad range of areas, including, for example, text classification, gene discovery, financial forecasting, credit card fraud detection, collaborative filtering, and the design of adaptive web agents.

This introductory machine learning course will give an overview of many models and algorithms used in modern machine learning, including linear models, multi-layer neural networks, support vector machines, density estimation methods, Bayesian belief networks, mixture models, clustering, ensemble methods, and reinforcement learning. The course will give the student the basic ideas and intuition behind these methods, as well as a more formal understanding of how and why they work. Students will have an opportunity to experiment with machine learning techniques and apply them to a selected problem in the context of a term project.

Prerequisites

Knowledge of matrices and linear algebra (CS 0280), probability (CS 1151), statistics (CS 1000), programming (CS 1501) or equivalent, or the permission of the instructor.



Textbook:

Recommended book:
Other books we may use:

Lectures
 
 
Lectures  Topic(s)  Assignments
January 5 Introduction.

Readings:

January 7 Designing a learning system.

Readings:

  • DHS: Chapter 1.
  • Data preprocessing. Chapter 3 in Han, Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2001. (see Tomas for copies)
  • Optimization. Chapter 6 in Michael Heath. Scientific Computing, McGraw Hill, 1997. (see Tomas for copies)
January 12 Matlab Tutorial.
January 14 Matlab Tutorial
Designing a learning system. Optimization
Homework 1
(Data for HW-1)
Solution for HW-1
January 21 Designing a learning system. Evaluation of classifiers

Some useful references:

Homework 2
(Data for HW-2)
Solution for HW-2
January 26 Density Estimation.

Useful reference:

January 28 Density Estimation II. (includes notes for the Exponential Family)
Homework 3
(Data for HW-3)
Solution for HW-3
February 2 Linear regression.

Readings:

February 4 Logistic regression.

Readings:

  • HFT book: Chapter 4.1, 4.4.
  • Linear classification. Chapter 6 in M. Jordan, C. Bishop. Introduction to graphical models.
Homework 4
(Data for HW-4)
Solution for HW-4
February 9 Classification with linear models

Readings:

  • HFT book: Chapter 4.1-4.3.
  • Linear classification. Chapter 6 in M. Jordan, C. Bishop. Introduction to graphical models.
February 11 Multilayer neural networks

Readings:

  • HFT textbook: Chapter 11.
  • Chapter 4 in Tom Mitchell. Machine Learning.
Homework 5
(Data for HW-5)
Solution for HW-5
February 16 Support Vector Machines

Readings:

February 18 The Naive Bayes Classifier. Evaluation of classifiers.

Readings:

Homework 6
(Data for HW-6)
Solution for HW-6
February 23 Multiway classification. Nearest Neighbor classifier.

Readings:

February 25 Bayesian belief networks.

Readings:

Homework 7
(Data for HW-7)
Solution for HW-7
March 1 Bayesian belief networks. Inferences.

Readings:

March 3 Bayesian belief networks. Learning.

Readings:

March 15 Bayesian belief networks. Learning the structure. Learning with hidden variables and missing values.

Readings:

March 22 Expectation Maximization (EM)

Readings:

March 24 Clustering

Readings:

Homework 8
(Data for HW-8)
March 29 Dimensionality reduction. Feature selection.

Readings:

March 31 Principal Component Analysis. Decision trees.

Readings:

  • HFT book: Chapter 14.5. (PCA)
  • HFT book: Chapter 9.2. (Decision trees)
April 5 Mixture of experts.

Readings:

April 7 Ensemble methods. Bagging and boosting.

Readings:

April 12 Reinforcement learning

Readings:

April 14 Short quiz + Reinforcement learning (cont.)

Readings:

April 21 Term projects due.



Homeworks

The homework assignments will mostly take the form of projects and will require you to implement some of the learning algorithms covered during lectures. Programming assignments will be implemented in Matlab. See rules for the submission of programs.

The assignments (both written and programming parts) are due at the beginning of the class on the day specified on the assignment. In general, no extensions will be granted.

Collaborations: You may discuss material with your fellow students, but the report and programs should be written individually.
 



Term projects

The term project is due at the end of the semester and accounts for a significant portion of your grade. You can choose your own problem topic. You will be asked to write a short proposal for the purpose of approval and feedback. The project must have a distinctive and non-trivial learning or adaptive component. In general, a project may consist of a replication of previously published results, design of new learning methods and their testing, or application of machine learning to a domain or a problem of your interest.



Matlab

Matlab is a mathematical tool for numerical computation and manipulation, with excellent graphing capabilities. It provides a great deal of support for the things you will need to run machine learning experiments. Upitt has a number of Matlab licences running on both Unix and Windows platforms. Click here to find out how to access Matlab at Upitt.

Matlab tutorial files from 01/12/04.

Other Matlab resources on the web:

Online MATLAB documentation
Online Mathworks documentation including MATLAB toolboxes


Students With Disabilities:
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services, 216 William Pitt Union, (412) 648-7890/(412) 383-7355 (TTY), as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course.


Course webpage from Spring 2003 and Spring 2002



Last updated by Milos on 01/06/2004