

Page history last edited by mike@mbowles.com 11 years, 3 months ago



Follow-on to the analysis that Charles showed us last night. 


Charles's analysis showed a strong correlation between the current fraction of air conditioner usage and the fraction of usage 24 hours earlier.  It also suggested some ways to use the data to answer the question Joe brought us: "what variables have the most significant effect on the fraction of time that air conditioners are on?"


Formulate the problem as follows.  Use fractional air conditioner usage as the target variable that we're trying to predict.  For the attribute set, include the past 24 hours of fractional usage and the past 24 hours of each of the other variables in the data set (temperature, humidity, etc.).  This results in a long attribute list (24 past usage values + 24 past temperatures + 24 past humidities, etc.).
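
As a rough illustration of this formulation, here is a minimal Python sketch that turns hourly series into lagged attribute rows. The variable names (`usage`, `temperature`, `humidity`), the helper `build_lagged_rows`, and the toy numbers are all hypothetical; the real data set's column names and layout may differ.

```python
def build_lagged_rows(series, target, n_lags=24):
    """Build a lagged design matrix from equal-length hourly series.

    series: dict mapping variable name -> list of hourly values (hypothetical layout).
    target: name of the variable to predict at the current hour.
    Returns (names, X, y), where each row of X holds the previous n_lags
    values of every variable and y holds the current target value.
    """
    n = len(series[target])
    names = [f"{var}_lag{k}" for var in series for k in range(1, n_lags + 1)]
    X, y = [], []
    # The first n_lags hours have no complete history, so start after them.
    for t in range(n_lags, n):
        row = [series[var][t - k] for var in series for k in range(1, n_lags + 1)]
        X.append(row)
        y.append(series[target][t])
    return names, X, y


# Toy data: 30 hours of made-up readings, purely for illustration.
series = {
    "usage": [i / 100 for i in range(30)],
    "temperature": [60 + i for i in range(30)],
    "humidity": [40 + i for i in range(30)],
}
names, X, y = build_lagged_rows(series, target="usage", n_lags=24)
print(len(X), "rows,", len(X[0]), "attributes per row")
```

With three variables and 24 lags, each row carries 72 attributes, which is why a penalised regression is a natural fit for the next step.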


Use glmnet to regress current fractional usage on the past values of these variables.  Survey both the alpha parameter (the balance between the ridge and L1 penalties; in glmnet, alpha = 0 is pure ridge and alpha = 1 is pure lasso) and the lambda parameter (the overall weight on the penalty) to see which values minimise cross-validation test error.  Pick the winning model and print out its weights to see which variables are most significant in predicting air-conditioner usage.
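
The survey over alpha and lambda might look roughly like the following. This is a plain-Python stand-in, not glmnet itself: it fits the same elastic-net objective by simple coordinate descent, without glmnet's standardisation, warm starts, or lambda path. The function names, the grid values, the contiguous-block folds, and the synthetic data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, g):
    """Soft-thresholding operator that handles the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def fit_elastic_net(X, y, alpha, lam, n_sweeps=100):
    """Coordinate descent for the objective
    (1/2n) * sum((y - b0 - X b)^2) + lam * ((1 - alpha)/2 * |b|_2^2 + alpha * |b|_1).
    Assumes the columns of X are on comparable scales (no standardisation here)."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    col_sq = [sum(v * v for v in c) / n for c in cols]
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    for _ in range(n_sweeps):
        for j in range(p):
            c = cols[j]
            # Correlation of column j with the residual that excludes its own fit.
            rho = sum(c[i] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            denom = col_sq[j] + lam * (1.0 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            step = new - beta[j]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= c[i] * step
                beta[j] = new
        # The intercept is unpenalised: absorb the mean residual into it.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta


def cv_error(X, y, alpha, lam, n_folds=5):
    """Mean squared held-out error over contiguous blocks of rows."""
    n = len(X)
    total = 0.0
    for f in range(n_folds):
        lo, hi = f * n // n_folds, (f + 1) * n // n_folds
        X_train, y_train = X[:lo] + X[hi:], y[:lo] + y[hi:]
        b0, beta = fit_elastic_net(X_train, y_train, alpha, lam)
        for i in range(lo, hi):
            pred = b0 + sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


def survey(X, y, alphas, lambdas):
    """Return (cv error, alpha, lambda) for the best point on the grid."""
    return min((cv_error(X, y, a, l), a, l) for a in alphas for l in lambdas)


# Synthetic data: x0 and x1 drive the target, x2 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [0.5 + 2.0 * r[0] - 1.0 * r[1] + rng.gauss(0, 0.1) for r in X]

best_err, best_alpha, best_lam = survey(X, y, [0.0, 0.5, 1.0], [0.001, 0.01, 0.1, 1.0])
b0, beta = fit_elastic_net(X, y, best_alpha, best_lam)
ranked = sorted(zip(["x0", "x1", "x2"], beta), key=lambda t: -abs(t[1]))
print("best alpha:", best_alpha, "best lambda:", best_lam)
for name, weight in ranked:
    print(name, round(weight, 3))
```

Ranking weights by magnitude is only meaningful when the attributes share a scale, so with real temperature, humidity, and usage columns the inputs would need standardising first. Contiguous folds are used here because neighbouring hours are strongly correlated; whether that is the right split for the actual data is a judgment call.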



Comments (1)

stephen.oconnell said

at 6:43 pm on Feb 15, 2011

Was Charles's code posted anywhere?
