
FrontPage


Saved by hoffman.tricia@gmail.com
on June 8, 2011 at 9:10:35 am

Machine Learning 201

Instructors: Dr. Michael Bowles & Dr. Patricia Hoffman

If you want to join the class email list, please fill out this form.

Overview of the Course

Machine Learning 201 and 202 cover topics in greater depth than 101 and 102. Participants in the class should come away able to read the current literature and apply what they read to their own work. Machine Learning 201 and 202 can be taken in any order.

Machine Learning 201 begins with ordinary least squares regression and extends this basic tool in a number of directions. We'll consider various regularization approaches. We'll introduce logistic regression and we'll learn how to code categorical inputs and outputs. We'll look at feature space expansions. These will lead naturally to generalizations of linear regression, known as the "generalized linear model" and the "generalized additive model".
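As a small illustration of the regularization approaches mentioned above, the following sketch compares ordinary least squares with ridge (L2) and lasso (L1) shrinkage for a single centered predictor, where all three have closed forms. It is written in plain Python rather than the R used in class, and the function names, toy data, and penalty values are assumptions made for illustration, not course material.

```python
def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def one_predictor_fits(x, y, lam):
    """Return (ols, ridge, lasso) slopes for one centered predictor.

    Assumed objectives (b is the slope):
      OLS:   0.5 * sum((y - b*x)^2)
      ridge: 0.5 * sum((y - b*x)^2) + 0.5 * lam * b^2
      lasso: 0.5 * sum((y - b*x)^2) + lam * |b|
    """
    xc, yc = center(x), center(y)
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)

    ols = sxy / sxx
    # Ridge shrinks the slope smoothly toward zero.
    ridge = sxy / (sxx + lam)
    # Lasso soft-thresholds: the slope becomes exactly zero once lam >= |sxy|.
    shrunk = max(abs(sxy) - lam, 0.0)
    lasso = (shrunk if sxy >= 0 else -shrunk) / sxx
    return ols, ridge, lasso


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy data, made up for this sketch
    y = [1.1, 1.9, 3.2, 3.8, 5.0]
    for lam in (0.0, 5.0, 50.0):
        ols, ridge, lasso = one_predictor_fits(x, y, lam)
        print(f"lambda={lam:5.1f}  ols={ols:.3f}  ridge={ridge:.3f}  lasso={lasso:.3f}")
```

With several correlated predictors there is no such one-line formula for the lasso, which is one reason path algorithms such as LARS appear in the syllabus.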

Text: "The Elements of Statistical Learning - Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

See also Prof. Robert Tibshirani's notes for Stats 315a: http://www-stat.stanford.edu/~tibs/stat315a.html

Prerequisites

Machine Learning 201 and 202 employ beginner-level probability, calculus and linear algebra (e.g. peruse the appendices in "Introduction to Data Mining" by Tan et al., or Linear Algebra and Probability Theory). If you have taken the Machine Learning 101 and 102 classes, you are well prepared for this course, but they are not required to start 201.

Participants should be familiar with R or be willing to pick R up outside of class. We will hand out R code for most of our examples, but we won't spend time in 201 going through introductory material on R. Come to the first class with R installed on your computer (available from http://cran.r-project.org/). For your review, R references are here: References for R, Reference for R Comments, More R references. To integrate R with Eclipse, click here.

To get the most out of the class, participants will need to work through the homework assignments.

General Sequence of Classes:

Machine Learning 101: Supervised learning

Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

Machine Learning 102: Unsupervised Learning and Fault Detection

Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

Machine Learning 201: Advanced Regression Techniques, Generalized Linear Models, and Generalized Additive Models

Text: "The Elements of Statistical Learning - Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

Machine Learning 202: Collaborative Filtering, Bayesian Belief Networks, and Advanced Trees

Text: "The Elements of Statistical Learning - Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

Machine Learning Big Data: Adaptation and execution of machine learning algorithms in the MapReduce framework

Future Topics

Data Mining Social Networks

Text Mining

Recommender Methods

Big Data

Machine Learning 201 Syllabus:

Week	Topics	Homework	Links

1st Week	Advanced Regression Topics		Lecture 1 and 2
6/1/2011	Ordinary Least Squares - error bounds
	Subset selection, forward & backward step-wise
	Least Angle Regression - LARS
	Attribute basis change
6/2/2011	Coefficient shrinkage methods	Homework01.pdf
	L1, L2 coefficient penalties
	Ridge, lasso and elastic net


2nd Week	Regression Topics		Lecture 3 and 4
6/8/2011	Logistic Regression	HW #1 Due
6/9/2011	Attribute Expansion	Homework02.pdf


3rd Week	Factor Inputs/Outputs		NotesWeek3
6/15/2011	Coding for Factor Inputs	HW #2 Due
6/16/2011	Error-correcting codes	Homework03.pdf

4th Week	Generalized Linear Models		NotesWeek4
6/22/2011		HW #3 Due
6/23/2011		Homework04.pdf

5th Week	Paper on glmnet	http://www.jstatsoft.org/v33/i01/paper	NotesWeek5
6/29/2011		HW #4 Due
6/30/2011		Homework5
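
The "Coding for Factor Inputs" topic in week 3 can be illustrated with a minimal sketch of dummy (treatment) coding, in which a factor with k levels becomes k - 1 indicator columns and the omitted level serves as the baseline. This is plain Python rather than the R used in class; the helper name dummy_code and the example labels are hypothetical. Taking the alphabetically first level as the baseline mirrors R's default treatment contrasts for unordered factors.

```python
def dummy_code(values, baseline=None):
    """Turn category labels into 0/1 indicator columns (treatment coding).

    Returns (columns, rows): the non-baseline levels in sorted order, and
    one list of indicators per input value. The baseline level is encoded
    as all zeros.
    """
    levels = sorted(set(values))
    if baseline is None:
        baseline = levels[0]          # assumption: first level is the baseline
    elif baseline not in levels:
        raise ValueError(f"baseline {baseline!r} is not one of {levels}")

    columns = [lvl for lvl in levels if lvl != baseline]
    rows = [[1 if v == lvl else 0 for lvl in columns] for v in values]
    return columns, rows


if __name__ == "__main__":
    colors = ["red", "green", "blue", "green"]   # made-up factor values
    columns, rows = dummy_code(colors)
    print(columns)
    for label, row in zip(colors, rows):
        print(label, row)
```

Dropping one level avoids perfect collinearity with the intercept column in a regression; a full one-hot coding would keep all k columns instead.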

We will be using the following text as a reference for Machine Learning 201 and 202:

"The Elements of Statistical Learning - Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. This is an excellent book. Virtually everyone in the field knows it and uses it as a standard reference. The book is free to read online: http://www-stat.stanford.edu/~tibs/ElemStatLearn

Anyone can read this web site; however, only the instructors have permission to edit it.
