Saved by mike@mbowles.com
on February 2, 2011 at 2:16:58 pm
 

 

Machine Learning 201

 

 

Organizer: Doug Chang

Instructors: Dr. Michael Bowles & Dr. Patricia Hoffman

 

Overview of the Course

Machine Learning 201 and 202 cover topics in greater depth than 101 and 102.  Participants in the class should come away able to read the current literature and apply what they read to their own work.  Machine Learning 201 and 202 can be taken in any order. 

 

Machine Learning 201 begins with ordinary least squares regression and extends this basic tool in a number of directions.  We'll consider various regularization approaches.  We'll introduce logistic regression and we'll learn how to code categorical inputs and outputs. We'll look at feature space expansions.  These will lead naturally to generalizations of linear regression, known as  the "generalized linear model" and the "generalized additive model". 
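
As a rough illustration of how a coefficient penalty extends ordinary least squares, here is a minimal sketch. It is not taken from the course materials, which use R; Python is used here purely for illustration. It assumes a single input and no intercept, and the function names and the penalty weight lam are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope for the no-intercept model y ~ beta * x."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


def ridge_slope(x, y, lam):
    """Slope minimizing sum((y - beta * x)**2) + lam * beta**2 (L2 penalty)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # Setting the derivative to zero gives beta * (sxx + lam) = sxy.
    return sxy / (sxx + lam)


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print(ols_slope(x, y))          # unpenalized fit
print(ridge_slope(x, y, 14.0))  # penalized fit, shrunk toward zero
```

With lam set to 0 the two functions agree; increasing lam shrinks the slope toward zero, which is the basic behavior shared by the shrinkage methods listed in the syllabus.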

 

Text:  "The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

 

See also Prof Robert Tibshirani's notes for stats 315a: http://www-stat.stanford.edu/~tibs/stat315a.html

 

Prerequisites

Machine Learning 201 and 202 employ beginner-level probability, calculus and linear algebra (e.g. peruse the appendices in "Introduction to Data Mining" by Tan et al., or Linear Algebra and Probability Theory).  If you have taken the Machine Learning 101 and 102 classes, you are well prepared for this course, but they are not required to start 201.

 

Participants should be familiar with R or be willing to pick R up outside of class.  We will hand out R code for most of our examples, but we won't spend time in 201 going through introductory material on R.  Come to the first class with R loaded on your computer:  http://cran.r-project.org/  For your review, R references are here: References for R,  Reference for R Comments,  More R references.  To integrate R with Eclipse, click here.

 

To get the most out of the class, participants will need to work through the homework assignments. 

 

General Sequence of Classes:

Machine Learning 101:   Supervised learning

Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

Machine Learning 102:   Unsupervised Learning and Fault Detection

Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

 

Machine Learning 201:    Advanced Regression Techniques, Generalized Linear Models, and Generalized Additive Models    

Text:  "The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

 

Machine Learning 202:   Collaborative Filtering, Bayesian Belief Networks, and Advanced Trees

Text:  "The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

 

Future Topics 

     Data Mining Social Networks

     Text Mining

     Recommender Methods

     Big Data

 

Machine Learning 201 Syllabus:  

 

Week        Topics                                      Homework          Links

1st Week    Advanced Regression Topics                                    Lecture 1 and 2
1/12/2011   Ordinary Least Squares - error bounds
            Subset Select, fwd & backward step-wise
            Least Angle Regression - LARS
            Attribute basis change
1/13/2011   Coefficient shrinkage methods               Homework01.pdf
            L1, L2 coefficient penalties
            Ridge, lasso and elastic net

2nd Week    Regression Topics                                             Lecture 3 and 4
1/19/2011   Logistic Regression                         HW #1 Due
1/20/2011   Attribute Expansion                         Homework02.pdf

3rd Week    Factor Inputs/Outputs                                         NotesWeek3
1/26/2011   Coding for Factor Inputs                    HW #2 Due
1/27/2011   Error-correcting codes

4th Week    Generalized Linear Models                                     NotesWeek4
2/2/2011                                                HW #3 Due
2/3/2011

5th Week    Generalized Additive Models
2/9/2011                                                HW #4 Due
2/10/2011
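
For readers unfamiliar with the week 3 topic "Coding for Factor Inputs", the following is a minimal sketch of one common scheme: dummy (treatment) coding, in which a factor with k levels becomes k-1 indicator columns and one level serves as the baseline. It is an illustration only, not the course's own code, which is in R; the function name dummy_code and the example data are hypothetical.

```python
def dummy_code(values, baseline=None):
    """Return (column_names, rows) of 0/1 indicators, omitting the baseline level."""
    levels = sorted(set(values))
    if baseline is None:
        baseline = levels[0]  # assumption: first level in sorted order is the baseline
    columns = [lv for lv in levels if lv != baseline]
    rows = [[1 if v == lv else 0 for lv in columns] for v in values]
    return columns, rows


colors = ["red", "green", "blue", "green"]
columns, rows = dummy_code(colors)
print(columns)
for value, row in zip(colors, rows):
    print(value, row)
```

Dropping the baseline column avoids an exact linear dependence with the intercept in a regression; observations at the baseline level are encoded as a row of zeros.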
       

 

General Calendar for the Year:

 

Fall 2010: Machine Learning 101 &  Machine Learning 102

 

Winter  2011:  Machine Learning 101 &  Machine Learning 201

 

Early Spring 2011:  Machine Learning 102 &  Machine Learning 202

 

 

 

We will be using the following text as a reference for 201 and 202:

 

"The Elements of Statistical Learning - Data Mining, Inference, and Prediction"  by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.  This is an excellent book.  Virtually everyone in the field knows it and uses it as a standard reference.  The book is free to read online:  http://www-stat.stanford.edu/~tibs/ElemStatLearn

 

There are more machine learning references on Patricia's web site:  http://patriciahoffmanphd.com/

 

 

Anyone can read this web site; however, only the instructors have permission to edit it.

If you haven't already filled out the Register for Class form on the meet-up page, please fill out the form now.  If you haven't already signed up on the meet-up page, please do so now.

 
