


Page history last edited by mike@mbowles.com 13 years ago



Follow-on to the analysis that Charles showed us last night. 


Charles's analysis showed a strong correlation between the current fraction of air conditioner usage and the fraction of usage 24 hours earlier.  It also suggested some ways to use the data to answer the question Joe brought us: "What variables have the most significant effect on the fraction of time that air conditioners are on?"


Formulate the problem as follows.  Use fractional air conditioner usage as the target variable that we're trying to predict.  Include in the attribute set the past 24 hours of fractional usage and the past 24 hours of each of the other variables in the data set (temperature, humidity, etc.).  This will result in a long attribute list (24 past usage values + 24 past temperatures + 24 past humidities, etc.).
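
One way the attribute list above could be assembled is sketched here in plain Python.  This is only an illustration: the function name build_lagged_rows, the column names and the tiny demo series are all made up, and it assumes each variable is an equally spaced hourly series of the same length.

```python
def build_lagged_rows(series, target, n_lags=24):
    """Turn hourly series into a regression table of lagged attributes.

    series: dict mapping a variable name to a list of hourly values
            (hypothetical layout; all lists must have the same length).
    target: name of the variable to predict (here, fractional usage).
    Returns (X, y, names): one row per hour that has n_lags hours of history.
    """
    n = len(series[target])
    # Attribute names such as "usage_lag1" ... "usage_lag24", "temp_lag1", ...
    names = [f"{name}_lag{k}" for name in series for k in range(1, n_lags + 1)]
    X, y = [], []
    for t in range(n_lags, n):
        # Past values of every variable, most recent lag first.
        row = [series[name][t - k] for name in series for k in range(1, n_lags + 1)]
        X.append(row)
        y.append(series[target][t])
    return X, y, names


# Tiny made-up example with 2 lags instead of 24 so the rows are easy to read.
demo = {
    "usage": [0.1, 0.2, 0.3, 0.4, 0.5],
    "temp": [70, 71, 72, 73, 74],
}
X, y, names = build_lagged_rows(demo, target="usage", n_lags=2)
for row, target_value in zip(X, y):
    print(dict(zip(names, row)), "->", target_value)
```

With 24 lags and, say, three variables this gives 72 attributes per row, and the first 24 hours of data are used only as history.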


Use glmnet to regress current fractional usage on the past values of these variables.  Survey both the alpha parameter (the balance between the ridge and L1 penalties) and the lambda parameter (the weight on the penalty) to see which values minimise cross-validation test error.  Pick the winning model and print out its weights to see which variables are most significant in predicting air-conditioner usage.
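
The alpha/lambda survey can be sketched without glmnet itself.  The Python below is a stand-in, not glmnet: it is a small hand-written coordinate-descent fit of the same elastic-net objective, (1/2n)*||y - b0 - Xb||^2 + lambda*((1-alpha)/2*||b||_2^2 + alpha*||b||_1), scored with contiguous cross-validation folds.  The function names, the grids and the synthetic data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, g):
    """Shrink z toward zero by g (the L1 part of the update)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def fit_elastic_net(X, y, alpha, lam, n_iter=100):
    """Coordinate descent for the elastic-net objective; returns (intercept, weights)."""
    n, p = len(X), len(X[0])
    b0, b = sum(y) / n, [0.0] * p
    resid = [yi - b0 for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        shift = sum(resid) / n          # unpenalised intercept update
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:        # all-zero column carries no information
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new
    return b0, b


def cv_error(X, y, alpha, lam, k=5):
    """Mean squared held-out error over k contiguous folds."""
    n, total = len(X), 0.0
    for f in range(k):
        held = range(f * n // k, (f + 1) * n // k)
        train = [i for i in range(n) if i not in held]
        b0, b = fit_elastic_net([X[i] for i in train], [y[i] for i in train], alpha, lam)
        for i in held:
            pred = b0 + sum(bj * xj for bj, xj in zip(b, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


# Synthetic stand-in data: the target depends on attribute 0 only.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(60)]
y = [2.0 * row[0] + 0.1 * random.gauss(0, 1) for row in X]
names = ["attr0", "attr1"]              # hypothetical attribute names

alphas = [0.0, 0.5, 1.0]                # 0 = ridge, 1 = pure L1
lambdas = [0.001, 0.01, 0.1, 1.0]
best_err, best_alpha, best_lam = min(
    (cv_error(X, y, a, l), a, l) for a in alphas for l in lambdas
)
intercept, weights = fit_elastic_net(X, y, best_alpha, best_lam)

print("best alpha:", best_alpha, "best lambda:", best_lam, "cv mse:", best_err)
for name, w in sorted(zip(names, weights), key=lambda nw: -abs(nw[1])):
    print(name, w)
```

Ranking attributes by weight is only meaningful when the attributes are on comparable scales, so standardise them (or compare standardised weights) before reading significance off the printout.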



Comments (1)

stephen.oconnell said

at 6:43 pm on Feb 15, 2011

Was Charles's code posted anywhere?
