# Machine Learning 201

Instructors: Dr. Michael Bowles & Dr. Patricia Hoffman

**Overview of the Course**

Machine Learning 201 and 202 cover topics in greater depth than 101 and 102. Participants in the class should come away able to read the current literature and apply what they read to their own work. Machine Learning 201 and 202 can be taken in any order.

Machine Learning 201 begins with ordinary least squares regression and extends this basic tool in a number of directions. We'll consider various regularization approaches. We'll introduce logistic regression and we'll learn how to code categorical inputs and outputs. We'll look at feature space expansions. These will lead naturally to generalizations of linear regression, known as the "generalized linear model" and the "generalized additive model".
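To see the first two of these ideas in miniature, here is a small sketch of ordinary least squares and ridge regularization on a one-dimensional toy problem, using the closed-form solutions. (The course handouts use R; this sketch is in plain Python for portability, and the function names, data, and penalty value are illustrative assumptions, not course material.)

```python
# Illustrative sketch: closed-form OLS vs. ridge regression in one dimension.
# Names (ols_fit, ridge_fit, lam) and the toy data are made up for this example.

def ols_fit(xs, ys):
    """Ordinary least squares for y = a + b*x (minimizes squared residuals)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx           # slope
    a = my - b * mx         # intercept
    return a, b

def ridge_fit(xs, ys, lam):
    """Same fit with an L2 penalty lam, which shrinks the slope toward zero."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / (sxx + lam)   # penalized slope: denominator grows with lam
    a = my - b * mx
    return a, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]       # exactly y = 1 + 2x
a0, b0 = ols_fit(xs, ys)             # recovers intercept 1, slope 2
a1, b1 = ridge_fit(xs, ys, lam=5.0)  # slope shrunk below 2
```

As the penalty `lam` grows, the ridge slope shrinks smoothly toward zero, which is the basic trade-off between fit and coefficient size that the regularization lectures develop in full.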

Text: "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

See also Prof. Robert Tibshirani's notes for Stats 315A: http://www-stat.stanford.edu/~tibs/stat315a.html

**Prerequisites**

Machine Learning 201 and 202 employ beginner-level probability, calculus, and linear algebra (e.g., peruse the appendices in "Introduction to Data Mining" by Tan et al., or references on linear algebra and probability theory). If you have taken Machine Learning 101 and 102, you are well prepared for this course, but they are not required to start 201.

Participants should be familiar with R or be willing to pick it up outside of class. We will hand out R code for most of our examples, but we won't spend time in 201 going through introductory material on R. Come to the first class with R installed on your computer; it is available at http://cran.r-project.org/

To get the most out of the class, participants will need to work through the homework assignments.

**General Sequence of Classes:**

**Machine Learning 101:** Supervised Learning

Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

**Machine Learning 102:** Unsupervised Learning and Fault Detection

Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

**Machine Learning 201:** Advanced Regression Techniques, Generalized Linear Models, and Generalized Additive Models

Text: "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

**Machine Learning 202:** Collaborative Filtering, Bayesian Belief Networks, and Advanced Trees

Text: "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

**Machine Learning Big Data:** Adaptation and execution of machine learning algorithms in the MapReduce framework

**Machine Learning Text Processing:** Machine learning applied to natural-language text documents using statistical algorithms, including indexing, automatic classification (e.g., spam filtering), part-of-speech identification, topic modeling, and sentiment extraction

**Future Topics**

Data Mining Social Networks

Text Mining

Recommender Methods

Big Data

**Machine Learning 201 Syllabus:**

We will be using the following text as a reference for both 201 and 202:

"The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. This is an excellent book; virtually everyone in the field knows it and uses it as a standard reference. The book is free to read online: http://www-stat.stanford.edu/~tibs/ElemStatLearn

Anyone can read this website; however, only the instructors have permission to edit it.
