


Page history last edited by mike@mbowles.com 12 years, 1 month ago



Follow-on to the analysis that Charles showed us last night. 


Charles's analysis showed a strong correlation between the current fraction of air conditioner usage and the fraction 24 hours earlier.  It also suggested some ways to use the data to answer the question Joe brought us: "what variables have the most significant effect on the fraction of time that air conditioners are on?"


Formulate the problem as follows.  Use fractional air conditioner usage as the target variable that we're trying to predict.  In the attribute set, include the past 24 hours of fractional usage and the past 24 hours of the other variables in the data set (temperature, humidity, etc.).  This will result in a long attribute list (24 past usage values + 24 past temperatures + 24 past humidities, etc.).
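
A minimal sketch of that attribute layout, in plain Python rather than R. The function name build_lagged_rows, the series names, and the synthetic hourly values are all made up for illustration; the real data set's columns and sampling interval may differ.

```python
import math


def build_lagged_rows(series_by_name, target_name, n_lags=24):
    """Turn equal-length hourly series into (attribute rows, targets, labels).

    Each row holds the previous n_lags values of every series, grouped by
    series; the matching target is the current value of the target series.
    """
    n = len(series_by_name[target_name])
    labels = [
        f"{name}_lag{lag}"
        for name in series_by_name
        for lag in range(1, n_lags + 1)
    ]
    rows, targets = [], []
    # The first n_lags hours have no complete history, so start after them.
    for t in range(n_lags, n):
        rows.append([
            series_by_name[name][t - lag]
            for name in series_by_name
            for lag in range(1, n_lags + 1)
        ])
        targets.append(series_by_name[target_name][t])
    return rows, targets, labels


# Synthetic three-day example (values are invented, not from the real data).
hours = range(72)
data = {
    "usage": [0.5 + 0.4 * math.sin(2 * math.pi * h / 24) for h in hours],
    "temperature": [75 + 10 * math.sin(2 * math.pi * (h - 3) / 24) for h in hours],
}
X, y, labels = build_lagged_rows(data, "usage")
print(len(X), "rows,", len(X[0]), "attributes per row")
print(labels[:3], "...", labels[24:27])
```

With two series and 24 lags each row has 48 attributes; adding humidity and the other variables extends the row by 24 columns apiece.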


Use glmnet to regress current fractional usage on the past values of these variables.  Survey both the alpha parameter (the balance between the ridge and L1 penalties; in glmnet, alpha = 0 is pure ridge and alpha = 1 is pure lasso) and the lambda parameter (the overall weight on the penalty) to see which values minimise cross-validation test error.  Pick the winning model and print out its coefficients to see which variables are most significant in predicting air-conditioner usage. 
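
To make the alpha/lambda survey concrete, here is a rough sketch in plain Python. It is not glmnet: it is a small coordinate-descent fit of the elastic-net objective that glmnet documents, (1/2n)*RSS + lambda*((1-alpha)/2*||b||_2^2 + alpha*||b||_1), wrapped in a grid search. The function names, the contiguous-block folds, the grid values, and the synthetic data are all assumptions for illustration.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the L1 part of the update)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def fit_elastic_net(X, y, alpha, lam, max_sweeps=500, tol=1e-9):
    """Coordinate descent on the elastic-net objective; X is a list of rows."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    resid = list(y)
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(max_sweeps):
        # Unpenalised intercept: absorb the mean of the residual.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
        biggest = abs(shift)
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
                biggest = max(biggest, abs(delta))
        if biggest < tol:
            break
    return b0, beta


def mse(X, y, b0, beta):
    """Mean squared prediction error on the given rows."""
    return sum(
        (yi - b0 - sum(b * x for b, x in zip(beta, row))) ** 2
        for row, yi in zip(X, y)
    ) / len(y)


def cv_error(X, y, alpha, lam, n_folds=5):
    """Average held-out MSE over contiguous folds (no shuffling)."""
    n = len(y)
    total = 0.0
    for k in range(n_folds):
        lo, hi = k * n // n_folds, (k + 1) * n // n_folds
        b0, beta = fit_elastic_net(X[:lo] + X[hi:], y[:lo] + y[hi:], alpha, lam)
        total += mse(X[lo:hi], y[lo:hi], b0, beta)
    return total / n_folds


def survey(X, y, alphas, lams):
    """Score every (alpha, lambda) pair and return the best pair plus all scores."""
    scores = {(a, l): cv_error(X, y, a, l) for a in alphas for l in lams}
    return min(scores, key=scores.get), scores


# Synthetic data: x0 matters a lot, x1 a little, x2 not at all (invented values).
random.seed(0)
names = ["x0", "x1", "x2"]
X = [[random.gauss(0, 1) for _ in names] for _ in range(120)]
y = [2.0 * r[0] + 0.3 * r[1] + random.gauss(0, 0.1) for r in X]

(best_alpha, best_lam), scores = survey(
    X, y, alphas=[0.0, 0.5, 1.0], lams=[0.001, 0.01, 0.1, 1.0]
)
b0, beta = fit_elastic_net(X, y, best_alpha, best_lam)
ranked = sorted(zip(names, beta), key=lambda pair: -abs(pair[1]))
print("best alpha:", best_alpha, "best lambda:", best_lam)
for name, weight in ranked:
    print(name, round(weight, 3))
```

Two caveats on reading the printed weights. Ranking by coefficient size only makes sense when the attributes are on a common scale; the synthetic columns here all have roughly unit variance, whereas temperature and humidity would need standardising first. And the folds are contiguous blocks because the rows come from a time series, which is a choice made for this sketch, not something the notes above specify.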



Comments (1)

stephen.oconnell said

at 6:43 pm on Feb 15, 2011

Was Charles's code posted anywhere?
