Hi all! About 9 months ago I decided to start an experiment with my ST setup to see if I could do some data mining and learn anything interesting about our household habits. After experimenting with a few different platforms for data logging, I decided the best bet was to simply use IFTTT to store events into my Google Drive. From there, I was able to pull out the spreadsheet entries and import them into MATLAB, a scientific data analysis program. Once in MATLAB, it was literally one line of code to convert the spreadsheet data into a data matrix with all entries in an easy-to-analyze format.
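For anyone who wants to try the import step without MATLAB, here is a rough Python sketch of the same idea. This is not the MATLAB one-liner mentioned above. The column layout (timestamp, device, state), the sample rows, and the timestamp format are all assumptions for illustration, so adjust them to whatever your own spreadsheet actually contains.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout, e.g. "August 23, 2015 at 09:14AM".
# Change this if your logged rows look different.
TIMESTAMP_FORMAT = "%B %d, %Y at %I:%M%p"


def load_events(csv_text):
    """Parse rows of (timestamp, device, state) into a list of dicts."""
    events = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        when = datetime.strptime(row[0].strip(), TIMESTAMP_FORMAT)
        events.append(
            {"time": when, "device": row[1].strip(), "state": row[2].strip()}
        )
    return events


# Hypothetical sample rows standing in for the exported spreadsheet.
SAMPLE = '''"August 23, 2015 at 09:14AM",Front Door,open
"August 23, 2015 at 09:15AM",Front Door,closed
"August 23, 2015 at 06:42PM",Front Door,open
'''

events = load_events(SAMPLE)
for event in events:
    print(event["time"].isoformat(), event["device"], event["state"])
```

Once every row is a real `datetime`, pulling out the hour, minute, or day of the week is a one-attribute lookup.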
For my first test case, I decided to look at the open/close events coming from our front door. I figured it might give some insight into how regular my schedule is going to and from work, or maybe shed some light on daily activity around here. Let me throw up the plots, then I'll explain below what they are and what I've learned (spoiler alert: don't expect to be surprised).
The first plot shows all 5,266 events logged over the course of 9 months (corresponding to about 20 open/close events per day). I set the y-axis to be the day of the month on which the event occurred, so it visually breaks up the individual months in a clean fashion. The first bit all the way at the beginning of the plot is back in August, and you'll notice that in September (the next line) there's a gap of a couple of weeks. At that point, I was still fiddling with my data logging options and had the IFTTT channel deactivated. Once I came back to IFTTT as my main tool for this project, the data goes unbroken all the way up until early May (the last little line all the way on the right). If you look closely at each of the months (each diagonal line is a month), you'll notice that they're a little jagged. That's because on each day the door opened and closed some pseudo-random number of times. On the days we opened the door more frequently, there's a longer horizontal dash. On the days we opened the door less, it's a bit shorter. One thing you can see is that in November and December the trend becomes less steep toward the end of the month, meaning the door was opened and closed more times on a given day. Since this lines up with the holiday season, I suspect these trends come from the fact that we had company over, were home more instead of at work, and as a consequence were more active in our comings and goings. In January, February, and March, the data is much more linear, with an essentially constant slope that is generally steeper than during the holiday season. Since it gets very cold here in Boston, I take this as empirical confirmation that we went out the bare minimum during those months!
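The length of each horizontal dash is just the number of events on that day, so the same information can be tabulated directly. Here is a small Python sketch (not the MATLAB used for the plot). The sample timestamps are made up, and the 270-day figure is only a rough stand-in for "9 months".

```python
from collections import Counter
from datetime import date, datetime


def events_per_day(timestamps):
    """Count how many events fall on each calendar day."""
    return Counter(t.date() for t in timestamps)


# Made-up timestamps: a busy day followed by a quiet one.
sample = [
    datetime(2015, 11, 26, 9, 0),
    datetime(2015, 11, 26, 12, 30),
    datetime(2015, 11, 26, 19, 45),
    datetime(2015, 11, 27, 8, 15),
]
daily_counts = events_per_day(sample)

# Sanity check on the overall rate, assuming roughly 270 days in 9 months.
average_per_day = 5266 / 270
print(daily_counts[date(2015, 11, 26)], round(average_per_day, 1))
```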
The next two plots are histograms of the time at which the door was opened or closed. The lower left plot is hours and the lower right is minutes. The histograms have been normalized and multiplied by 100 so we can read them as probabilities in percent. So for example, looking over the entire 9-month period, about 7% of all events fell in the 9:00 hour, 7.5% in the 10:00 hour, 5.5% in the 11:00 hour, and so on. Here, I'm using a 24-hour scale to avoid any AM/PM nonsense. Two things jump out at me from this data. First, we are definitely not coming or going between 1 and 6 am. Second, the dog doesn't get walked as regularly as I thought! Looking at the data, the uptick around 8 and 9 is typically when I leave for work. The peak at 10 is likely my wife walking the dog and going about her business. The second burst of activity between 18:00 and 21:00 (6:00 pm and 9:00 pm) is when we're both home from work, going for walks with the dog, taking out the garbage, and so on. My expectation was that these morning and evening peaks would be more pronounced, with the afternoons having less activity, but I guess not!
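The normalization is simple enough to sketch in a few lines. This is a Python approximation of the idea, not the MATLAB behind the plots, and the sample timestamps are invented: count events per hour of the day, divide by the total, and multiply by 100.

```python
from collections import Counter
from datetime import datetime


def hourly_percentages(timestamps):
    """Return 24 values: the percentage of events falling in each hour."""
    counts = Counter(t.hour for t in timestamps)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24  # no data, so nothing to normalize
    return [100.0 * counts.get(hour, 0) / total for hour in range(24)]


# Invented sample: two events at 9:xx, one at 10:xx, one at 18:xx.
sample = [
    datetime(2015, 10, 5, 9, 2),
    datetime(2015, 10, 6, 9, 40),
    datetime(2015, 10, 6, 10, 15),
    datetime(2015, 10, 7, 18, 30),
]
percentages = hourly_percentages(sample)
print(percentages[9], percentages[10], percentages[18])
```

Because every event lands in exactly one hour, the 24 values always add up to 100.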
For the third plot in the lower right, I decided to do a "control": instead of looking at what hour the door was opened, I looked at what minute it was opened. For this, opening the door at 6:06 and at 10:06 are equivalent, since the value of the minutes is identical. My first thought was that I might see peaks around minutes 45 to 50, thinking that maybe we leave 10 to 15 minutes early when we have to be somewhere at X o'clock sharp. Then I thought a bit harder, realized that our life just isn't that regular, and instead suspected the plot should be flatter. Looking at the data, we see that indeed, it's basically flat. Sure, there are blips and bumps, but in general everything hovers between 1.5 and 2% (note that 100% / 60 ≈ 1.67%, which is right in that range!). There's a chance I could convince myself that activity steadily decreases for the first 35 minutes of the hour and then increases as we get closer to the end of the hour, but the data's pretty noisy and I'm a bit skeptical. So, what's the takeaway? Our comings and goings are almost uniformly distributed throughout the hour!
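If you want something firmer than eyeballing the flatness, one option is a chi-square goodness-of-fit statistic against a perfectly uniform spread over the 60 minutes. I did not run this on the door data. It is only a Python sketch of how such a check could look, with synthetic inputs.

```python
from collections import Counter

# Expected share per minute if events were perfectly uniform.
UNIFORM_PERCENT = 100.0 / 60  # about 1.67


def minute_chi_square(minutes):
    """Chi-square statistic of observed minute counts vs. a uniform spread."""
    counts = Counter(minutes)
    expected = len(minutes) / 60  # same expected count for every minute
    return sum((counts.get(m, 0) - expected) ** 2 / expected for m in range(60))


# Synthetic extremes: one event in every minute vs. everything at minute 0.
perfectly_flat = list(range(60))
all_on_the_hour = [0] * 60

flat_statistic = minute_chi_square(perfectly_flat)
skewed_statistic = minute_chi_square(all_on_the_hour)
print(round(UNIFORM_PERCENT, 2), flat_statistic, skewed_statistic)
```

A statistic near zero means the minutes are close to uniform. To judge whether a larger value is just noise, compare it against a chi-square table for 59 degrees of freedom.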
…
Well, I was writing this up and then I realized that my analysis of the hours at which the door is opened is flawed: weekdays and weekends are intrinsically different for our lifestyle! Duh! So, I went back and separated the data into two datasets, and here's the breakdown:
Now things make more sense! On weekdays (blue curve) we have the morning activity around 8 to 10 when we're heading out for the day, and another large peak for the evening activity between 6 and 9. On weekends, however, there's a much steadier pace of in-and-out throughout the day (orange curve). In the original plot of hourly activity, lumping the two together mushed all the data into one curve and hid this difference. I think my big surprise is that we're not more active later on weekend evenings, but given that we're parents of a young kid, I guess those days are behind us... And there's evidence to prove it!
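The split itself is a small step once the timestamps are parsed. Here is a Python sketch of it (again, not the MATLAB behind the plot, and the sample timestamps are invented). Each group is normalized on its own, so the two curves are comparable even though there are far more weekdays than weekend days.

```python
from collections import Counter
from datetime import datetime


def hourly_percentages(timestamps):
    """Return 24 values: the percentage of events falling in each hour."""
    counts = Counter(t.hour for t in timestamps)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24
    return [100.0 * counts.get(hour, 0) / total for hour in range(24)]


def split_weekday_weekend(timestamps):
    """Build separate hourly histograms for Mon-Fri and Sat-Sun."""
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    weekdays = [t for t in timestamps if t.weekday() < 5]
    weekends = [t for t in timestamps if t.weekday() >= 5]
    return hourly_percentages(weekdays), hourly_percentages(weekends)


# Invented sample: 2016-01-04 is a Monday, 2016-01-02 is a Saturday.
sample = [
    datetime(2016, 1, 4, 8, 30),
    datetime(2016, 1, 4, 18, 5),
    datetime(2016, 1, 2, 13, 0),
]
weekday_pct, weekend_pct = split_weekday_weekend(sample)
print(weekday_pct[8], weekday_pct[18], weekend_pct[13])
```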
So what's next? I've started logging more devices in our ST setup. While this little project was fun and cute, I think there are more serious analyses and applications for this type of data. For example, I think the next project will look for correlations between pairs of motion sensors, or between motion sensors and open/close sensors. The idea is to ask questions like "If X door is opened at such-and-such time of day, can I predict if I'll be going in or out of the room?", "How active are the cats at night?", and "How does household activity depend on who's at home?" Needless to say, I think there's a lot of fun to be had! Any suggestions, questions, or comments are more than welcome. This project is a work in progress!