I appreciate your comment, @speteor (Zach)…
Let’s analyze the processing complexity of the problem/solution in theory:
Note: Pardon the lack of mathematical simplicity in this verbose analysis. Assumptions are strewn throughout; alerts to mistakes or challenges of my assumptions or logic are very welcome…
If we run a “record activity” SmartApp that does an explicit “subscribe()” to the on/off events of a list of Things that the user selects as the “set of lights that represent home activity”, the demand is linear: n subscriptions, where n is the number of Things selected.
I am not sure how the SmartThings event handler is written; let’s assume that after every event occurs (e.g., a switch is turned on), the event handler goes through various steps: send the physical command, acknowledge it, send a message to update the UI icon (I doubt the icons “poll” … unless you hit “refresh”), log the event, and then look up the registered event subscription table to see if any SmartApp methods need to be called.
It is this last item that introduces the scaling problem: looking up the subscriptions for the Thing that raised the event. As this table gets large, lookup time could increase. However, if the table is efficiently hashed or indexed, lookup time grows slowly if at all (roughly constant for a hash, logarithmic for a sorted index), and certainly not exponentially. From a global system-wide standpoint, we have to presume that the subscription table read/write functionality is decently handled, because there are thousands (… millions) of SmartApps that are “watching” (subscribed to) 10x or 100x as many Things for events. In other words, adding a “record activity” SmartApp that subscribes to a list of Things will not, by itself, put any unusual load on the subscription table. Plenty of other SmartApps use subscriptions in ways that place an equally large load (e.g., monitoring a set of motion sensors or contact sensors).
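To make the hashed-lookup argument concrete, here is a minimal sketch in Python (not SmartThings Groovy). The `SubscriptionTable` class and its key layout are purely my assumptions for illustration; I have no knowledge of how SmartThings actually stores subscriptions.

```python
from collections import defaultdict


class SubscriptionTable:
    """Hypothetical dispatcher: maps (device_id, event_name) to handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, device_id, event_name, handler):
        self._handlers[(device_id, event_name)].append(handler)

    def dispatch(self, device_id, event_name, value):
        # One hashed lookup: the cost depends on how many handlers are
        # registered for this key, not on the total size of the table.
        handlers = self._handlers.get((device_id, event_name), [])
        for handler in handlers:
            handler(device_id, event_name, value)
        return len(handlers)


table = SubscriptionTable()
activity = []


def record_activity(device_id, event_name, value):
    activity.append((device_id, event_name, value))


# The "record activity" app subscribes to a handful of lights.
for light in ["kitchen", "hall", "porch"]:
    table.subscribe(light, "switch", record_activity)

# Thousands of unrelated subscriptions do not slow down the lookup above.
for i in range(10000):
    table.subscribe(f"sensor-{i}", "motion", lambda d, e, v: None)

called = table.dispatch("hall", "switch", "on")
```

With this layout, an event on “hall” touches only the one handler registered for it, no matter how many other subscriptions exist.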
Therefore, the only unusual load that this “record activity” SmartApp adds is the work performed by the method called from the subscription. Again, a method call is not an unusual load: thousands of SmartApps have subscriptions which call methods that run some activity (e.g., contact sensor opens … calls a method within a SmartApp that turns on a light, sets off an alarm, sends a push notification, etc.).
The only unique action of the “record activity” SmartApp is to expand and write to the “state.*” object array (i.e., “persistent data storage”) of the SmartApp instance. Once again, though, we can presume this must (or should) be handled efficiently by SmartThings… why? Because writing a piece of data upon an event call is “the same” as what the “event logging” function does; and since event logging is a fundamentally prevalent global activity, SmartThings would already have a major scalability issue if this were not scalable.
Consider the similar characteristics: Event entries belong to specific Things (aside… should I be using “Devices” or “Things” in this terminology, ick – hate using wrong jargon…). Therefore, the Event logs can be split across data stores easily. Searching the event log of a particular Device is single threaded (there does not seem to be any index besides date-time), but the logs of thousands of users’ devices can be searched in parallel across multiple processors and data stores. Similarly, the “state.*” persistent store belongs to one and only one particular SmartApp instance; as far as I know, there is no way for one SmartApp to access the “state” of another SmartApp.
The difference: “state.*” is a base (and extensible) type, unlike “event.*”. This means that the “state” table is more complex and has to be hashed to find specific attributes within state, and within arrays of attributes, etc…
Conclusion: Even if there is an order of magnitude of difference between the read/write requirements of the device event data and those of the SmartApp state data, the system remains scalable.
In other words: I absolutely admire and approve of your code (for this use case), which reads the devices’ event logs by date range in order to find a list of prior activity. It is definitely efficient, since there is no need to create additional subscriptions or the write workload for history (state) data. The event log is surely optimized for WRITE activity, however, so there is some risk that READs were not efficiently implemented. Indeed, the code you provided uses a LINEAR scan of the log (within the date range) to search for specific types of events (switch.on; switch.off). There is no hash or index provided to efficiently extract these records.
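As a rough picture of that read pattern, here is a Python sketch (again, not SmartThings code). The log contents, the tuple layout, and the function name `switch_events_between` are made up for illustration, and I am assuming the only available index is the date-time ordering.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# Hypothetical per-device event log, already ordered by timestamp.
event_log = [
    (datetime(2014, 3, 1, 7, 0), "switch", "on"),
    (datetime(2014, 3, 1, 7, 5), "level", "80"),
    (datetime(2014, 3, 1, 8, 30), "switch", "off"),
    (datetime(2014, 3, 2, 19, 0), "switch", "on"),
    (datetime(2014, 3, 3, 6, 45), "switch", "off"),
]


def switch_events_between(log, start, end):
    # The date-time ordering lets us jump to the range boundaries.
    # (A real store would keep this index; here it is rebuilt per call.)
    timestamps = [entry[0] for entry in log]
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    # Within the range there is no index on event type, so every entry
    # has to be inspected: this is the linear part of the read.
    return [entry for entry in log[lo:hi] if entry[1] == "switch"]


found = switch_events_between(
    event_log, datetime(2014, 3, 1), datetime(2014, 3, 2, 23, 59)
)
values = [entry[2] for entry in found]
```

The cost of the read is proportional to everything logged in the date range, including events we do not care about (the “level” entry here).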
Conversely: Having a SmartApp create its own event log for specific events on specific devices is additional WRITE activity; however, it is limited to the specific event types we subscribe to (switch.on, switch.off, setLevel), and the “state” data structure is likely hashed, thus providing efficient READs.
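The alternative might look something like the following Python sketch. The `state` dictionary here only stands in for a SmartApp’s persistent “state” object, and the handler and helper names are hypothetical; how SmartThings actually stores and serializes state is an assumption on my part.

```python
from collections import defaultdict

# Stand-in for one SmartApp instance's persistent state.
state = {"history": defaultdict(list)}


def on_switch_event(device_id, value, timestamp):
    # One small extra WRITE per subscribed event, and nothing else.
    state["history"][device_id].append((timestamp, value))


def history_for(device_id):
    # READ is a direct keyed lookup; no unrelated events to skip over.
    return list(state["history"].get(device_id, []))


on_switch_event("kitchen", "on", 100)
on_switch_event("hall", "on", 105)
on_switch_event("kitchen", "off", 160)

kitchen = history_for("kitchen")
```

The trade is visible even at this size: every subscribed event costs a write, but reading back one light’s history never touches the others.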
Net Result: Theoretically, neither method is more or less scalable than the other. Hopefully this is true, because there are countless situations, unlike this particular SmartApp, where both subscribe() and state.* read/write are useful and scanning the event table won’t suffice.
Rebuttals or discussions are encouraged!