I have been following with great interest the reports of poltergeists, slowness, network issues, and other problems. I try to stay out of rants and help where I can. Over time I have sensed that I do not seem to have as many issues as others do, and I’m going to try to document here why I think that is, in the hopes that it will help someone, even one person. (Note: I have had three actual ST problems, all of them routines not running, which I knew about right away, plus a fourth where I had prior notification that a routine would not run. Call that scheduled maintenance. I have not rebooted my V2 hub since I installed it.) Am I lucky? Maybe, but here is why I think I’m not.
Things that this post is not:
- A short post. I overthink a lot. It’s a function of being an automation engineer early in my career. Happy reading.
- A place to add your ranting and raving. Try to find a post that matches yours already, and post to it to identify the common cause OR create a new one if you don’t find it. Hopefully with the other thread participants you can find a common issue and fix there. Please post reliability hints that I have missed, or poke holes in any of my specific ideas. Don’t tell me about returning your hub, I will only wish you luck with your next experience. I love my hub.
- A guaranteed fix for your problems. These are my ideas, my experience, and it may differ from yours by a lot. Some of these are just theories. And my use case is not yours.
- An indictment of any method, app, devicetype or platform. I work hard to be open minded.
- A claim that your problems aren’t real, or that you only have them because you haven’t done all these things. There are actual issues in ST land; this is known.
- As always, official support (even if slow lately) is provided by email@example.com. Always put in a ticket first on a big problem, then come here. The community might fix it first, but ST has to know about a problem (in volume too) before they can mobilize to do anything.
Basics: Here is what I do to keep reliability high on my system.
- Local Processing: In my opinion, if you use the V2 hub, use Smart Lighting as much as possible. Its quirks are known, and anything known is easy to work around. Anything that processes faster is more likely to succeed. Read further for more details.
- I use ST device types whenever I can. Say what you want, sure sometimes they have bugs, but they are reliable bugs. In other words, they normally don’t cause damage: when they work they work, and when they don’t, they don’t hurt anything. That being said, I have plenty of custom device types.
- Temperature control is “bump” based, not “on/off”. I don’t believe HVAC should ever be turned off (and you can’t in the Northeast in winter). Set low/high temps for off, that way you have a safety blanket. Same for hot water heaters.
- MOST IMPORTANT: DO NOT try to put everything into one rule, one routine, one large “Lighting Automation”. See “Extra Credit” reading bullet 1 below for my theory why.
- I read about every SmartApp I install, and try not to install “new” and shiny things UNLESS I’m ready for testing and weirdness.
- Do not use as a dedicated Alarm System: I have remarked in the community before, that I will NOT put the cloud in between a blaring siren and my intention to turn it off. I may never install a siren just because I don’t think they are useful (for me). I do think ST can support this functionality better in the future with local only processing for native siren devices with local control of that siren (and I’ve seen recent work to that effect). When we are assured of that, I’ll test it. I don’t trust my ISP, the home network I manage, or ST cloud enough to place all that in between a siren in my home and my family.
- I focus on use cases for notification. I have a hard time believing I can predict when my family will want the garage door open or closed, but I can tell with notifications when it is open when we wouldn’t want it to be, and then do something about it remotely. Kids come home and leave it open, I shut it when the notification tells me it’s been open for 10 minutes. This enhances my life, and doesn’t shut the door when I’m outside plowing, or working in the yard. Notifications include me in the process.
- I automate with backups. Always have a backup. All my lights have switches. I will never use a bulb because I can’t always reach a bulb, and I don’t want my family touching bulbs. Switches only for me in my house. I know some like bulbs, but I don’t.
- Never make anything safety related dependent on something not dedicated to that task. My Nests are independent smoke and CO alarms. They work no matter what happens to any network. Safety MUST be rock solid. Love ST, but its mission to glue different technologies together means you shouldn’t rely on it to keep your pets, yourself, or loved ones alive. Use it to help, but rely on dedicated devices (which can also fail in their own right, but have a different acceptable rate of failure).
- I test my network regularly, and easily. I simply go to speedtest.net and I expect a 20ms ping or less. I think simple latency is the best easy-to-test factor for understanding if your network is the problem. I once found a router problem this way that wasn’t causing ST any issues but was causing low family morale. (Side note: It’s amazing what you can accomplish with teenagers if you shut the wifi off for any amount of time).
- Event based state changes. This is one of my favorite things to do. I’ll try to use a simple example. I have a Rule Machine rule that bumps up my temps if an area is occupied. I set the activity times lower than I might otherwise be tempted to, because I want the rule to fire more than once just in case my thermostat didn’t get the message the first time. It’s self healing by design: the second message may make it when the first does not. I may come back to this post and add other examples…
- I cut out excessive debug in cloud type apps (actually all apps if I think there is an issue). I don’t know that this helps for sure, but see Extra Credit bullet 1 below for why.
- I read through the code briefly on every app I install. It’s more for learning, and I don’t always understand it, but a couple of things I have picked up have helped me understand things (like how the Honeywell Wifi stats are implemented differently than most in ST). For instance, reading the code for Rule Machine tells me that @bravenel is seriously good with logic and creates concise and powerful code. I trust his code and many others by reputation. I trust my own code less than theirs.
- When ST failed me the three times it did, I used IFTTT to double-check my sunset/sunrise. A post on this can be found in search. It is a lot harder for it to miss now.
- Watch your logs. Watch during installation and watch during the early stages of a newly installed application to see what is happening. https://graph.api.smartthings.com/ide/logs.
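To make the "event based, self healing" idea from the list above a bit more concrete, here is a minimal sketch. It is plain Python for illustration only, not SmartThings or Rule Machine code. `FlakyThermostat`, `on_motion`, `OCCUPIED_SETPOINT`, and the drop pattern are all hypothetical names and values I made up for the example.

```python
class FlakyThermostat:
    """Hypothetical thermostat that loses some commands, to mimic a missed message."""

    def __init__(self, setpoint, drop_pattern):
        self.setpoint = setpoint
        # drop_pattern: booleans consumed one per command; True means "this command is lost"
        self._drops = iter(drop_pattern)

    def set_heat(self, target):
        if next(self._drops, False):
            return  # command lost in transit, state unchanged
        self.setpoint = target


OCCUPIED_SETPOINT = 68  # assumed comfort temperature in degrees F, illustration only


def on_motion(thermostat):
    """Runs on every motion event and simply re-asserts the desired state.

    Sending the same setpoint twice is harmless if the first command arrived,
    and corrective if it did not.
    """
    thermostat.set_heat(OCCUPIED_SETPOINT)


stat = FlakyThermostat(setpoint=62, drop_pattern=[True, False])
on_motion(stat)               # first event: the command is dropped
after_first = stat.setpoint   # still the old setpoint
on_motion(stat)               # second event: the command gets through
after_second = stat.setpoint  # now the occupied setpoint
```

The point of the design is that the handler never checks "did I already send this?". It just states what the world should look like each time an event arrives, so a lost message costs you one event's worth of delay instead of a stuck state.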
Extra Credit Reading, why I think some of what I do helps.
- It is well known that ST has a time limit for execution. So if anything takes too long or hangs in your routine, app, device, etc., as I understand it, ST cloud (and maybe local processing too, I’m not sure) will kill it. Summarily. Regardless of whether it has finished what it is doing. Regardless of the state it leaves your system in.
- Even if it’s not the automatic kill but just general bugginess, a long-running routine/app trying to do 20 things will always have a higher failure rate than dedicated single-purpose apps. Prioritize many simple, high-quality rules/apps/routines over a few do-it-all apps.
- Most of the issues that will happen with ST (by design) are things not happening. Notifications help you track those down. I put notifications on important things while debugging, and always on important things that I have to know happened.
- Adding apps/devices one at a time has really helped me understand what devices or apps were buggy and hurting my system. Add a device, give it a day or two. System gets awful? Uninstall and see if it gets better. I’ve been lucky and had only one custom device that gave me trouble, and it was one I wrote. The SmartApps that have given me trouble always just didn’t fire.
- Almost forgot the obvious, the benefit to local processing is that it avoids cloud issues. And there have been a fair bit lately. Other than firmware updates, the hub is fairly well protected from transient cloud issues.
- Dedicated apps/hardware for specific tasks: For instance Life360 for presence. It’s what they do.
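Here is a rough sketch of the time limit theory from the first two bullets above. It is a toy model in Python, not how ST actually schedules anything. The 20 second budget, the 1.5 seconds per action, and the function name `run_with_budget` are assumptions for illustration. It also assumes each small rule is timed separately, which is my understanding rather than something I can confirm.

```python
def run_with_budget(actions, budget_s):
    """Run actions in order until the time budget is spent.

    actions: list of (name, cost_in_seconds) pairs.
    Returns the names that completed. Anything past the cutoff is skipped,
    the way a killed routine would leave its remaining steps undone.
    """
    done, elapsed = [], 0.0
    for name, cost in actions:
        if elapsed + cost > budget_s:
            break  # execution is killed here
        elapsed += cost
        done.append(name)
    return done


BUDGET_S = 20.0  # assumed limit for illustration, not a documented value
actions = [(f"light_{i}", 1.5) for i in range(20)]  # assumed cost per action

# One big routine: all 20 actions share a single budget.
monolith_done = run_with_budget(actions, BUDGET_S)

# Twenty single-purpose rules: each one gets its own budget.
split_done = [name for a in actions for name in run_with_budget([a], BUDGET_S)]
```

With these made-up numbers the single large routine finishes 13 of its 20 actions before the cutoff, while the 20 small rules all finish. The real numbers will differ, but the shape of the problem is the same: the more you pack into one execution, the more there is to lose when it gets cut off.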
I will be back to update this post with ideas added and more of my own, for I am sure that I have forgotten some. I kept telling myself I’d write this to help someone, and my hope is that it helps at least one person.
My configuration
- 40 plus devices, mixture of zigbee, zwave and wifi
- Zwave: Mixture of brands including GE/Jasco, Dragontech, Enerwave, maybe others.
- Zigbee: Includes SmartThings and Iris. Nothing else that I recall.
- Most complicated SmartApps: Motion Zone Manager and Rule Machine
- Sonos: I have it but don’t use it because I’ve heard of issues. Not ready to test that yet.
- Harmony: Do not have. I will not put two hubs in my environment; I’m still annoyed that MyQ has a hubless door opener now.
- Thermostats: Three Honeywell stats, the really nice LCD Redlink ones, I’ll look up the model in a bit.
- My location: Northeast US.
- Presence: I use five fobs (they have always worked perfectly, even when I don’t like the kids yo-yo-ing to and from the house while waiting for the bus), and an instance of Life360 for phones. The Android app is terrible right now, so I won’t trust it with my presence. Life360 is very reliable, albeit slower to respond than fobs. Life360’s sole purpose is notification of arrival/departure, so I trust it more than anything except fobs.
My confirmed issues
- Android App. It dies within 10 actions on my phone (Nexus 6P) and about half as often in Bluestacks. In fact I know it causes issues: when it dies in an app, the app is sometimes left unconfigured, and only an IDE delete can fix it. And who knows if that leaves a trail of destruction.
- Routines occasionally not firing per my first paragraph.
- Some unofficial integrations of network devices like MyQ, Nest, or Honeywell when the companies change their interface. I’ve had to update each of those once in the past year. I consider that normal maintenance at the moment, but more updates would start to be a problem. The community has been awesome in this regard, and is what brought me to ST in the first place.
Updates to this post
- Added my configuration.
- Added a bit about network.
- Added my presence information and Life360 recommendation.
- Added my confirmed issues.