Theories for why my V2 hub is reliable

bridaus · November 22, 2015, 10:15pm

I have been following with great interest the reports of poltergeists, slowness, network issues, and other problems. I try to stay out of rants, and help where I can. Over time I have sensed that I do not seem to have as many issues as others do, and I’m going to try to document why I think that is here, in the hope that it will help someone, even one person. (Note: I have had three actual ST problems, all of them routines not running that I knew about right away, and a fourth where I had prior notification that a routine would not run. Call that scheduled maintenance. I have not rebooted my V2 hub since I installed it.) Am I lucky? Maybe, but here is why I think I’m not.

Things that this post is not:

A short post. I overthink a lot. It’s a function of being an automation engineer early in my career. Happy reading.
A place to add your ranting and raving. Try to find a post that matches yours already, and post to it to identify the common cause OR create a new one if you don’t find it. Hopefully with the other thread participants you can find a common issue and fix there. Please post reliability hints that I have missed, or poke holes in any of my specific ideas. Don’t tell me about returning your hub, I will only wish you luck with your next experience. I love my hub.
A guaranteed fix for your problems. These are my ideas, my experience, and it may differ from yours by a lot. Some of these are just theories. And my use case is not yours.
An indictment of any method, app, devicetype or platform. I work hard to be open minded.
A claim that you are not really having problems, or that you just haven’t done all these things. There are actual issues in ST land; this is known.
As always, official support (even if slow lately) is provided by support@smartthings.com. Always put in a ticket first on a big problem, then come here. The community might fix it first, but ST has to know about a problem (in volume too) before they can mobilize to do anything.

Basics: Here is what I do to keep reliability high on my system.

Local Processing: In my opinion, if you use the V2 hub, use Smart Lighting as much as possible. Its quirks are known, and anything known is easy to work around. Anything that processes faster is more likely to succeed. Read further for more details.
I use ST device types whenever I can. Say what you want, sure sometimes they have bugs, but they are reliable bugs. In other words, they normally don’t cause damage: when they work they work, and when they don’t, they don’t hurt anything. That being said, I have plenty of custom device types.
Temperature control is “bump” based, not “on/off”. I don’t believe HVAC should ever be turned off (and you can’t in the Northeast in winter). Set low/high temps for off, that way you have a safety blanket. Same for hot water heaters.
MOST IMPORTANT: DO NOT try to put everything into one rule, one routine, one large “Lighting Automation”. See “Extra Credit” reading bullet 1 below for my theory why.
I read about every SmartApp I install, and try not to install “new” and shiny things UNLESS I’m ready for testing and weirdness.
Do not use as a dedicated Alarm System: I have remarked in the community before, that I will NOT put the cloud in between a blaring siren and my intention to turn it off. I may never install a siren just because I don’t think they are useful (for me). I do think ST can support this functionality better in the future with local only processing for native siren devices with local control of that siren (and I’ve seen recent work to that effect). When we are assured of that, I’ll test it. I don’t trust my ISP, the home network I manage, or ST cloud enough to place all that in between a siren in my home and my family.
I focus on use cases for notification. I have a hard time believing I can predict when my family will want the garage door open or closed, but I can tell with notifications when it is open when we wouldn’t want it to be, and then do something about it remotely. Kids come home and leave it open, I shut it when the notification tells me it’s been open for 10 minutes. This enhances my life, and doesn’t shut the door when I’m outside plowing, or working in the yard. Notifications include me in the process.
I automate with backups. Always have a backup. All my lights have switches. I will never use a bulb because I can’t always reach a bulb, and I don’t want my family touching bulbs. Switches only for me in my house. I know some like bulbs, but I don’t.
Never make anything safety related dependent on something not dedicated to that task. My Nests are independent smoke and CO alarms. They work no matter what happens to any network. Safety MUST be rock solid. Love ST, but its mission to glue different technologies together means you shouldn’t rely on it to keep your pets, yourself, or loved ones alive. Use it to help, but rely on dedicated devices (which can also fail in their own right, but have a different acceptable rate of failure).
I test my network regularly, and easily. I simply go to speedtest.net and I expect a 20ms ping or less. I think simple latency is the best easy-to-test factor for understanding if your network is the problem. I once found a router problem this way that wasn’t causing ST any issues, but was causing low family morale. (Side note: It’s amazing what you can accomplish with teenagers if you shut the wifi off for any amount of time).
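A rough sketch of the "notify me, don’t act for me" garage door idea from the list above. This is plain Python rather than SmartThings Groovy, and the function names are made up for illustration; only the 10 minute threshold comes from the post. It decides whether a notification is due and leaves closing the door to a person.

```python
from datetime import datetime, timedelta
from typing import Optional

# Threshold taken from the post: notify once the door has been open 10 minutes.
OPEN_ALERT_AFTER = timedelta(minutes=10)


def open_too_long(opened_at: Optional[datetime], now: datetime,
                  threshold: timedelta = OPEN_ALERT_AFTER) -> bool:
    """Return True when the door has been open for at least `threshold`.

    `opened_at` is None while the door is closed (hypothetical convention).
    """
    if opened_at is None:
        return False
    return now - opened_at >= threshold


def notification_text(opened_at: Optional[datetime], now: datetime) -> Optional[str]:
    """Build the message to send, or None when no notification is due.

    Nothing here closes the door; a person decides what to do next.
    """
    if not open_too_long(opened_at, now):
        return None
    minutes = int((now - opened_at).total_seconds() // 60)
    return f"Garage door has been open for {minutes} minutes"
```

Keeping the decision ("is a notification due?") separate from the action means the automation can fail quietly without ever moving the door on its own.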

Advanced Reading

Event-based state changes. This is one of my favorite things to do. I’ll try to use a simple example. I have a Rule Machine rule that bumps up my temps if an area is occupied. I set the activity times lower than I might be tempted to, because I want the rule to fire more than once just in case my thermostat didn’t get the message the first time. It’s self-healing by design: the second message may make it when the first does not. I may come back to this post and add other examples…
I cut out excessive debug in cloud type apps (actually all apps if I think there is an issue). I don’t know that this helps for sure, but see Extra Credit bullet 1 below for why.
I read through the code briefly on every app I install. It’s more for learning, and I don’t always understand it, but a couple of things I have picked up have helped me understand things (like how the Honeywell Wifi stats are implemented differently than most in ST). For instance, reading the code for Rule Machine tells me that @bravenel is seriously good with logic and creates concise and powerful code. I trust his code and many others by reputation. I trust my own code less than theirs.
When ST failed me the three times it did, I used IFTTT to double-check my sunset/sunrise. A post on it can be found in search. A lot harder for it to miss now.
Watch your logs. Watch during installation and watch during the early stages of a newly installed application to see what is happening. https://graph.api.smartthings.com/ide/logs.
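The self-healing idea in the first item above can be sketched like this. It is an illustration in Python, not Rule Machine’s actual logic; the handler name, the `send_setpoint` callback, and the 70°F target are all assumptions. The point is that every occupancy event re-asserts the desired state, so a lost command gets another chance on the next event.

```python
from typing import Callable


def on_occupancy_event(occupied: bool,
                       reported_setpoint_f: float,
                       send_setpoint: Callable[[float], None],
                       occupied_setpoint_f: float = 70.0) -> bool:
    """Re-assert the desired setpoint every time an occupancy event fires.

    Returns True if a command was sent. `send_setpoint` stands in for
    whatever actually talks to the thermostat (hypothetical).
    """
    if not occupied:
        return False
    # Already where we want it: nothing to heal.
    if reported_setpoint_f == occupied_setpoint_f:
        return False
    # Either this is the first attempt or an earlier command was lost.
    send_setpoint(occupied_setpoint_f)
    return True
```

If the first command never reaches the thermostat, its reported setpoint is unchanged, so the next motion event simply sends it again.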

Extra Credit Reading, why I think some of what I do helps.

It is well known that ST has a time limit for execution. So if anything takes too long, or hangs in your routine, app, device, etc., as I understand it, ST cloud (and maybe local processing too, I’m not sure) will kill it. Summarily. Regardless of whether it has finished what it is doing. Regardless of the state it leaves your system in.
Even if it’s not the automatic kill, but just general bugginess, a long-running routine/app trying to do 20 things, as opposed to dedicated single-purpose apps, will always have a higher failure rate. Prioritize many simple, high-quality rules/apps/routines over a few do-it-all apps.
Most of the issues that will happen with ST (by design) are things not happening. Notifications help you track those down, I put notifications on important things while debugging, and always on important things that I have to know happened.
Adding apps/devices one at a time has really helped me understand what devices or apps were buggy and hurting my system. Add a device, give it a day or two. System gets awful, uninstall, see if it gets better. I’ve been lucky and had only one custom device that gave me trouble, and it was one I wrote. SmartApps that have given me trouble always just didn’t fire.
Almost forgot the obvious, the benefit to local processing is that it avoids cloud issues. And there have been a fair bit lately. Other than firmware updates, the hub is fairly well protected from transient cloud issues.
Dedicated apps/hardware for specific tasks: For instance Life360 for presence. It’s what they do.
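A back-of-the-envelope way to look at the "one big app versus many small ones" theory above. The numbers are invented and the model is deliberately crude: it assumes each action succeeds independently with the same probability, and that a failure or kill inside one big app stops everything after it. None of that is measured from SmartThings.

```python
def all_steps_succeed(p_step: float, steps: int) -> float:
    """Probability that one app finishes every step, assuming each step
    succeeds independently with probability p_step (an assumption)."""
    return p_step ** steps


def expected_done_one_big_app(p_step: float, steps: int) -> float:
    """Expected number of actions completed when the first failure
    (or kill) stops all remaining steps."""
    return sum(p_step ** k for k in range(1, steps + 1))


def expected_done_separate_apps(p_step: float, steps: int) -> float:
    """Expected number of actions completed when each action lives in
    its own app, so one failure does not stop the others."""
    return p_step * steps


if __name__ == "__main__":
    p, n = 0.99, 20  # illustrative values only
    print(f"big app finishes everything: {all_steps_succeed(p, n):.3f}")
    print(f"actions done, one big app:   {expected_done_one_big_app(p, n):.2f}")
    print(f"actions done, separate apps: {expected_done_separate_apps(p, n):.2f}")
```

Under these assumptions, a 20-step app with 99% reliable steps completes all of them only about 82% of the time, while separate apps lose just the one action that failed.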

I will be back to update this post with ideas added and more of my own, for I am sure that I have forgotten some. I kept telling myself I’d write this to help someone, and my hope is that it helps at least one person.

My configuration

40 plus devices, mixture of zigbee, zwave and wifi
Zwave: Mixture of brands including GE/Jasco, Dragontech, Enerwave, maybe others.
Zigbee: Includes SmartThings and Iris. Nothing else that I recall.
Most complicated SmartApps: Motion Zone Manager and Rule Machine
Sonos: I have but I don’t use because I’ve heard of issues. Not ready to test that yet.
Harmony: Do not have, I will not put two hubs in my environment, I’m still annoyed that MyQ has a hubless door opener now.
Thermostats: Three Honeywell stats, the really nice LCD Redlink ones, I’ll look up the model in a bit.
My location: Northeast US.
Presence: I use five fobs (they have always worked perfectly, even when I don’t like the kids yo-yo-ing to and from the house while waiting for the bus), and an instance of Life360 for phones. The Android app is terrible right now, so I won’t trust it with my presence. Life360 is very reliable, albeit slower to respond than fobs. Life360’s sole purpose is notification of arrival/departure, so I trust it more than anything except fobs.

My confirmed issues

Android App. It dies within 10 actions on my phone (Nexus 6P) and about half as often in Bluestacks. In fact, I know it causes issues: when it dies in an app, that app is sometimes left unconfigured, and only an IDE delete can fix it. And who knows if that leaves a trail of destruction.
Routines occasionally not firing per my first paragraph.
Some unofficial integrations of network devices like MyQ, Nest, or Honeywell when the companies change their interface. I’ve had to update each of those items once each in the past year for an update. I consider that normal maintenance at the moment, but more updates would start to be a problem. The community has been awesome in this regard, and is what brought me to ST in the first place.

Edit log:

Added my configuration.
Added a bit about network.
Added my presence information and Life360 recommendation.
Added my confirmed issues.

JH1 · November 22, 2015, 11:21pm

How do your habits, such as not turning off HVAC, or not using ST as an alarm, affect the reliability of the system?

BTW, if I recall you posted elsewhere that you don’t use SHM, or Sirens, but you use it to notify you of visitors when you are away and that it has never failed you. How is that not an alarm?

bridaus · November 22, 2015, 11:32pm

Good questions.

Perceived reliability vs required reliability is affected by the criticality of the task. For instance the difference between your dishwasher breaking versus your water main. Think WAF. Pipes breaking versus slightly cool.
It’s not blaring a siren in the middle of the night. It just turns on lights and notifies. I even use SHM. Once it notified me and I wasn’t sure who was there, so I sent a relative over to check it out. Other times I forgot we had someone coming to clean. To me a true alarm blasts a siren and calls the police/fire. Since that costs real money and time that is not mine, the required reliability goes way up. I’ve had dedicated expensive alarms that false alarmed. ST cannot be as good as those.

JH1 · November 22, 2015, 11:38pm

I suppose. One could use the ST hub to draw power at 99.9999999999% reliability or as a paper weight at 100% reliability.

I am a believer in KISS. Certainly, but the system was designed as a home automation and monitoring system. I do want it to do that much.

It does have a siren, but it does not yet have professional monitoring available.

AFAIK, when they do studies of the effectiveness of alarm systems, #1 is the sign, #2 is the siren going off. Monitoring, the response of police, or your relatives checking things out, is all but useless, statistically speaking. It’s too late.

bridaus · November 22, 2015, 11:42pm

Agree. I believe these days cameras are the most effective. And for that use case, cloud is great.

I have to admit that nowadays when I hear a car alarm going off I think false alarm.

JH1 · November 22, 2015, 11:47pm

I think the siren creates FUD in the intruder, more than it attracts an effective response.

Cameras are great to reduce false alarms and have a chance at catching them after the fact, not sure if they are a deterrent or not. I think they can be, especially at the front door / door bell cameras / as most break ins are through the front door in the middle of the day. Knocking/Leaving Material on your door to see if it remains is a prime trick of the trade.

SBDOBRESCU · November 23, 2015, 12:03am

Brian, I don’t want to spoil your enthusiasm, and I hope that I am the only one who experienced this level of unreliability. But the resolution provided by support for my various issues with failed zigbee devices was to disable my local processing, as documented in this thread

I too have been watching this forum closely, to understand why some people report fewer problems than others. My conclusion is that there is no single explanation.

One hypothesis is that people who have a great experience don’t use the most noticeable triggers, like lights on sensors. Another is that some people don’t use certain features of the stock apps, like setting light levels. Others don’t tinker when something doesn’t work as it is supposed to.

The bottom line, though, is that ST is trying hard to correct problems on the fly, and while they fix one thing they break another. All of these bugs will end at some point, but until then, there is not much we can do. Either sit back and do nothing, or chase ghosts and drive yourself crazy.

One precautionary measure I took, as I am in the latter category, was to stick with only official apps and devices while ST engineers fix what they need to fix.

kurtsanders · November 23, 2015, 1:20am

It’s great to finally see someone have a stable V2 hub.

I wish my V2 hub was as reliable as the ST’s marketing literature or their Facebook states for new buyers! I purchased into the SmartThings environment over 18 months ago for dependable execution. Now, I have to settle for it only operating certain low level HA processes (manual off/on lights, open/close/motion detection alerts, etc).

I have two Kwikset Z-wave locks and I have lost confidence that they will reliably lock after the open/close detector from SmartThings reports the door has closed. Several times in the morning, my wife has reported that the front or back door was unlocked, and when I looked at the debug logs for the SmartApp, the runIn() schedule had not executed after being successfully submitted by the SmartApp. I have coded for decades and I have never had a programming environment be so unpredictable.

I have had my OEM ST Motion detectors fire erratically during the day, causing an “Intrusion Detected” by ST. My Sonos and Harmony remotes now cannot be controlled by SmartThings. I have uncoupled using the presence on our iPhones as this was working about 75% of the time and would not open/close our garage doors.

I hope that ST can get their cloud system back to some stable ecosystem or enable local V2 processing, otherwise, I am on the prowl for the next vendor who can (even if it costs a monthly fee).

geko · November 23, 2015, 1:23am

Totally. When you buy a car, it’s probably not just because it has comfortable seats and an air conditioner. You actually expect to drive it.

bridaus · November 23, 2015, 2:07am

I followed your issue. Am I correct that you were the only person I saw with an extensive local processing problem? Everyone else seems to have cloud issues.

SBDOBRESCU · November 23, 2015, 2:55am

Anyone who cared enough to troubleshoot when devices failed? Maybe. There are plenty of stories of ghosts tripping devices for no apparent reason. A possible explanation is trouble between the local and the cloud processing. But I leave that to ST engineers, as they have access to everything and eventually will catch that bug and fix it, if they haven’t already.

Since the last firmware update, I didn’t have any more issues with my local devices not responding to commands. So hopefully the 14.13 update fixed my problems.

JDRoberts · November 23, 2015, 5:59am

I’m glad it’s working for you, and I’m sure the things you’re doing are helpful, but a lot of it is purely luck, including where you live. (We learned from the persistent sunset problems last spring that in some areas there’s just a lot more server traffic, which introduces a lot more possibilities for problems.)

I use almost all plain-vanilla official device types and suggested methods. Almost no custom smart apps or device types.

I do not use SHM.

I have fewer than 40 devices, and use very simple rules.

Consider my two most recent reliability issues. Both began when a new firmware update was pushed, which I had no choice over accepting or not.

The official harmony integration smartapp, which had been working just fine, was replaced by a new one which crashes the mobile app. That’s it. There’s no workaround. It’s just broken. I didn’t ask for it. I didn’t install it. It was pushed to me.
My presence sensor now reports that it’s present all the time. It also does not report battery. Both are symptoms of a problem that has been affecting some UK users, although not all, for about a month. It has now begun to affect some US users.

Note that this is not a case of the sensor reporting that it’s absent when it’s actually at home, which might have something to do with the individual device. Instead, it is being reported present even when it’s absent. That shouldn’t be possible because the way the away indicator works is that the sensor always checks in every 30 seconds and after a certain number of missed check ins, the account logic notes it as being away. So that means this is an account level error. A platform problem.
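To make the argument above concrete, here is a small Python sketch of the timeout logic as the post describes it. The 30 second check-in interval comes from the post; the missed check-in limit of 4 is a made-up value, since the post only says "a certain number", and the function name is hypothetical. This is not SmartThings’ actual implementation.

```python
CHECK_IN_INTERVAL_S = 30   # from the post: the sensor checks in every 30 seconds
MISSED_LIMIT = 4           # hypothetical: the post only says "a certain number"


def is_present(last_check_in_s: float, now_s: float,
               interval_s: int = CHECK_IN_INTERVAL_S,
               missed_limit: int = MISSED_LIMIT) -> bool:
    """Return True while fewer than `missed_limit` check-ins have been missed.

    Presence is decided purely from the time since the last check-in,
    so this runs on the hub/account side, not on the sensor itself.
    """
    missed = int((now_s - last_check_in_s) // interval_s)
    return missed < missed_limit
```

If the logic works roughly this way, a sensor that has left the house stops checking in and must eventually be marked away. A sensor that stays "present" forever therefore points at the side that counts missed check-ins, not at the device.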

Again, this had been working fine. I didn’t change anything. I didn’t download anything new. The change was pushed to me which completely removed the functionality of this device.

The only things specific to these two errors which had anything to do with my set up was that I had previously been using the specific devices successfully. Seriously, that’s all. I use plain official device types for both. No custom code. And very simple logic.

So while I think it’s great that you have detailed all of these ideas, and I hope that they will be helpful to other people, most of the instabilities that I have seen come from the platform side. And affect some people with very simple set ups.

(Of course the reason why most of the instabilities I’ve seen appear to come from the new platform side may be because I’m already doing most of the things that you suggested in your note. And after that I get hit by bad luck or location.)

You may live in a specific location where there is less server traffic. Or I may just be unluckier than you. But there does appear to be at least as much luck as logic involved.

Submitted with respect.

geko · November 23, 2015, 8:41am

Salesforce.com names Automatic Software Updates as one of the advantages of cloud computing:

The beauty of cloud computing is that the servers are off-premise, out of sight and out of your hair. Suppliers take care of them for you and roll out regular software updates – including security updates – so you don’t have to worry about wasting time maintaining the system yourself. Leaving you free to focus on the things that matter, like growing your business.

Unfortunately, this is only in theory. In practice, SmartThings is living proof of how good ideas can become total disasters when implemented poorly.

bridaus · November 23, 2015, 10:22am

If server location is a factor, then it is a fixable problem. I maintain that based on your use cases and described simple setup that you are a perfect candidate for ST to investigate. Especially a presence sticking. I don’t have a Harmony setup and I probably don’t use the same server, but other than that it sounds like my setup is more complex.

BTW I always assumed from your deep and knowledge filled posts that you had quite the setup going on.

I’d rather not be lucky, but I’ll admit it’s a possible factor here too. I still hope this helps someone even if it can’t help all.

JDRoberts · November 23, 2015, 3:05pm

Thanks for the kind words. SmartThings support has been looking at the issues at my house for several months. They are currently looking into both the presence sensor sticking and why the zwave lock randomly unlocks. They’ve verified both behaviors on their side, they just don’t know why it happens.

This project report details the set up I have:

Adding Home Automation in Phases: my limited investment strategy Projects & Stories

I’ve been working on this over the last year and I’m just finally getting around to writing it up. I’m quadriparetic, use a wheelchair with limited hand function. So home automation isn’t a hobby for me. I want to get maximum return on investment, with a pretty small budget, and I want everything to be working and practical from the beginning. At the same time, my personal belief (and this is just a guess) is that while there will be several good candidates for plug-and-play home automation systems that will do most of what I want for under $5000 by the summer of 2017, the reality is that there is no system yet that will fully meet my requirements. (Edited to add: we did finally get this by 2019; more details when I get around to updating this for 2020.) There are ex…

rdelavega · November 23, 2015, 5:44pm

Kudos on your post and reliability, Brian. I have to say that my V2 hub has been reliable (a few isolated quirks) after the pre-Halloween schedule madness. I’m still not fully in love with ST, but that has more to do with the UI side of things and how things work, and that’s another topic.

I want to second some of your thoughts and maybe comment on others:

[quote=“bridaus, post:1, topic:29795”]
I use ST device types whenever I can.
[/quote]I resolved to do this. I have a few device types from the community because I migrated from Wink, but I only buy officially supported devices and add official ST SmartApps. Even though there are many great SmartApps from the community (Rule Machine is just one of them), I found them to require too much troubleshooting.

[quote=“bridaus, post:1, topic:29795”]
MOST IMPORTANT: DO NOT try to put everything into one rule, one routine, one large “Lighting Automation”.
[/quote]IMHO, one large rule is just asking for trouble. Small rules and routines are my way to go. Same as a short and steady stream of passes from a quarterback, instead of going for the 50-yard pass.

[quote=“bridaus, post:1, topic:29795”]
Dedicated apps/hardware for specific tasks: For instance Life360 for presence. It’s what they do.
[/quote]Has worked without issues for me and I use it for my Nest thermostat as well. Only thing I’ve noticed, and this might be just my case, but I believe it works better on Android. We’re Android users at home and when my sister came for a visit, I added her to our circle so the lights and HVAC wouldn’t turn off when we were not home. She has an iPhone and only then, Life360 wasn’t very reliable. Don’t know if it’s just me or someone else has noticed this.

[quote=“bridaus, post:1, topic:29795”]
Harmony: Do not have, I will not put two hubs in my environment, I’m still annoyed that MyQ has a hubless door opener now.
[/quote]Seriously reconsider this. Sure, it is another hub, but it’s not a useless one. It actually blasts IR and that’s how it connects your control to the internet. You can’t start to imagine all the complaints I have saved from my wife after getting the Harmony Home Control. After all the AV equipment is nicely tucked away behind cabinets and there’s only one remote to use, life has been a breeze. I can easily say that of all the HA devices, this is my favorite. That and being able to hear the Nest Protect lady before the alarm, and the ability to turn it off without having to fan the smoke detector.

bridaus · November 23, 2015, 7:34pm

See, the real problem with this community is the encouragement to buy and add more devices. Haven’t we learned anything?

I am building a home theater, soooo maybe a harmony is in my future. I had one once, but it broke very quickly (an old 890). I do try to avoid IR as much as possible, it has always been too laggy for me.

Edit: I use an Xbox One right now to try and consolidate gaming/TV. It uses IR to change channels on my Xfinity X1, and the lag is less than it used to be, but still noticeable. I want more direct control, with apps. The best thing about this setup? The media remotes are $25 apiece and nice. Go ahead and lose it, I can replace it anytime and use a game controller or my smartphone in the meantime. Everyone who comes to the house understands the remote and setup just by mashing buttons.

rdelavega · November 23, 2015, 8:34pm

I don’t see a lag for my setup. I have a TV, BD player and a Roku 3, which all use IR (I believe that is how Harmony controls the Roku).

bridaus · November 23, 2015, 9:32pm

I’ll admit my experience is a few years old on IR, but you are not helping my budget either.

smart · November 23, 2015, 10:12pm

I pretty much do not use any custom device type or SmartApp (if there is an ST native one) except one or two home-brewed ones. More than 99% of the time routines work for me, including those that involve mobile presence. I have seen more failures with Smart Lighting and the mobile iOS app in general. I have several tickets open for annoyances, though. As a rule of thumb, I never automated my smart lock, garage door, or thermostats with ST. They are only there for a quick check on status from a single app.

The biggest failures to date have been the Halloween poltergeist infestation, which everybody faced (but mine was limited to routines and Smart Lighting); Schlage dropping off for no apparent reason; Sonos announcing that the basement door was closed at 1:35 AM though it had been closed at maybe around 8:30 PM; and anything Smart Lighting related happening in slow motion last Friday.

EVERYONE’s mileage is going to vary based on their setup and environment. I stress the YMMV part: just see how many things happen without fail in my goodnight routine, yet Smart Lighting goes nuts at times with the most basic stuff.
