Devices Jumping On & Offline, Crashing Routines & App, Even Hub

I’ve been on and off the phone with Samsung over the course of the past week due to a problem with devices randomly jumping on and off line—often locking up ST routines, WebCoRE pistons, and even preventing control of online devices from the ST app. In one instance, the hub itself crashed completely. The devices may or may not come back online by themselves later, but more often than not I am forced to reboot the hub or physically turn the switches on and off to get them back.

What is most curious to me is that the only devices that ever go offline are my five Z-Wave Plus devices. All of my standard Z-Wave and other devices (doorbells, sprinklers, Sonos, etc.) have no trouble.

Working on my own, and with Samsung’s help, I’ve tried literally dozens and dozens of different things to figure this out. From rebooting the hub, to removing and replacing devices, to disabling Device Health, to physically moving switches—the list goes on and on and on. The customer support representatives I’ve spoken with have been both sympathetic and diligent in their efforts to resolve this, but this issue clearly has them baffled. I’ve been promised the issue will be escalated still further, but I’m not holding my breath on that one.

Their last-gasp Hail Mary is to send me a replacement hub, which is a major PITA because ST has never created a workable Backup & Restore solution. I’ve read through enough of the hub-to-hub migration posts here to know that I’m going to lose most of a weekend replacing a hub that may or may not actually be part of the problem.

So here’s my last-gasp Hail Mary to the community: Anybody have any ideas about this? I’ve talked, googled, and tinkered my brains out on this one and, like Samsung, am officially out of ideas.

What happens in your environment if you completely remove these five Z-Wave Plus devices from the hub? Did you troubleshoot with those taken completely out of the equation to see whether they are causing these stability issues? That’s one 20-yard crossing route you can try before going for that last-second Hail Mary.

So in running that play, if the pass is incomplete, the next play I would run is a quarterback draw: go into webCoRE Settings and select Disable All Pistons.

I would backtrack and take things out of the equation one by one until I can get some points on the board. That’s just my playbook of course :grin:

Just to take one more thing out of the picture: change the devices to a generic DTH and enable insecure rejoin in the IDE.

Since he mentioned Z-Wave Plus and that those were the only devices going offline, I am suggesting that he remove them completely, as Z-Wave Plus is not supported and I don’t know whether these devices in and of themselves are causing his entire system to be unstable. I wanted to rule them out completely first. As part of re-implementing things after a source is identified, he can then add them back one at a time and assign a generic DTH. Need to find the root cause first. Just my way of troubleshooting.

The Samsung SmartThings Hub (Hub v2) is Z-Wave Plus Certified by the Z-Wave Alliance.

This is a known issue and has been plaguing my system since the last firmware came out. Here’s an event log excerpt from this morning when the “Goodbye!” routine ran. The hub crashed. This happens with virtually all routines, almost anytime one is run!

2018-01-02 10:17:11.297 AM EST
5 hours ago	HUB		zwStatus	ready		Z-Wave is ready
2018-01-02 10:17:07.522 AM EST
5 hours ago	HUB		hubStatus	zw_radio_on		
2018-01-02 10:17:07.518 AM EST
5 hours ago	HUB		hubStatus	zb_radio_off		
2018-01-02 10:17:07.515 AM EST
5 hours ago	HUB		hubInfo	hardwareID:000D, version:14, ...		hardwareID:000D, version:14, mac:D0:52:A8:91:0D:6C, localip:192.168.7.30, localSrvPortTCP:39500, localSrvPortUDP:0, zigbeeFWMa...
2018-01-02 10:17:01.758 AM EST
5 hours ago	HUB		hubStatus	zw_radio_on		
2018-01-02 10:17:01.753 AM EST
5 hours ago	HUB		hubStatus	zb_radio_off		
2018-01-02 10:17:01.749 AM EST
5 hours ago	HUB		hubInfo	hardwareID:000D, version:14, ...		hardwareID:000D, version:14, mac:D0:52:A8:91:0D:6C, localip:192.168.7.30, localSrvPortTCP:39500, localSrvPortUDP:0, zigbeeFWMa...
2018-01-02 10:16:48.258 AM EST
5 hours ago	HUB		register	register		register
2018-01-02 10:16:48.255 AM EST
5 hours ago	HUB		ping	ping		ping
2018-01-02 10:16:48.189 AM EST
5 hours ago	HUB		activity	active		Your SmartThings Hub at Home is now active.
2018-01-02 10:16:48.185 AM EST
5 hours ago	HUB		hubStatus	active		Your SmartThings Hub at Home is now active.
2018-01-02 10:16:07.218 AM EST
5 hours ago	HUB		hubStatus	disconnected		Your SmartThings Hub at Home is now disconnected. Please check your internet connection.
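
For anyone who wants to put a number on an outage like this, here is a minimal Python sketch. It assumes the log is pasted the way it appears above (a timestamp line followed by a tab-separated event line, newest first). The `parse_events` and `seconds_between` helpers are made up for illustration and are not part of any SmartThings tool; the sample holds three entries copied from the excerpt.

```python
from datetime import datetime

# Three entries copied from the excerpt above (newest first, as the IDE lists them).
SAMPLE = """\
2018-01-02 10:17:11.297 AM EST
5 hours ago\tHUB\t\tzwStatus\tready\t\tZ-Wave is ready
2018-01-02 10:16:48.185 AM EST
5 hours ago\tHUB\t\thubStatus\tactive\t\tYour SmartThings Hub at Home is now active.
2018-01-02 10:16:07.218 AM EST
5 hours ago\tHUB\t\thubStatus\tdisconnected\t\tYour SmartThings Hub at Home is now disconnected. Please check your internet connection.
"""

def parse_events(text):
    """Pair each timestamp line with the event line that follows it."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    events = []
    for stamp, detail in zip(lines[0::2], lines[1::2]):
        # Drop the trailing zone label ("EST"); every line in the excerpt shares it.
        when = datetime.strptime(stamp.rsplit(" ", 1)[0], "%Y-%m-%d %I:%M:%S.%f %p")
        # Non-empty fields: relative age, source, event name, value, [description]
        fields = [f for f in detail.split("\t") if f]
        events.append((when, fields[2], fields[3]))
    return sorted(events)  # oldest first

def seconds_between(events, start, end):
    """Seconds from the first `start` event to the first later `end` event."""
    t0 = next(t for t, name, value in events if (name, value) == start)
    t1 = next(t for t, name, value in events if (name, value) == end and t > t0)
    return (t1 - t0).total_seconds()

events = parse_events(SAMPLE)
offline = seconds_between(events, ("hubStatus", "disconnected"), ("hubStatus", "active"))
zwave = seconds_between(events, ("hubStatus", "disconnected"), ("zwStatus", "ready"))
print(f"hub offline for {offline:.1f}s, Z-Wave ready after {zwave:.1f}s")
```

On these three entries that works out to roughly 41 seconds between the disconnect and the hub reporting active again, and about 64 seconds before the Z-Wave radio reports ready.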

Ticket 463852. I’ve been working with Kiannish & Nate as well as @tpmanley on this. Hoping there’s a fix soon.

I stand corrected on the Plus.

Steve, are the devices that are plaguing you all Plus? I know from your other threads you have been having a hell of a time as well.

Most of my troublesome devices are either those pesky Iris Smart Plugs (Z-Wave Plus) or GE Outdoor or in-wall switches (non-Plus). I have a theory as to what is happening but will need some engineering assistance from SmartThings. I’m taking @JDRoberts’ advice and ordering an Aeon Z-Stick to attempt to see the routing table and verify whether the Iris devices really work.

My theory, for what little it is probably worth, is that there is a command timing issue in the Z-Wave implementation that is causing messages to be sent too rapidly and overloading the network and/or stack. I also doubt that enough time is being allotted before a command is considered lost. But those are just my relatively weak theories based on observation.

For what it is worth, I have a couple of ZW5 devices that do not seem to update properly. I did notice once, when re-pairing one of them, that it said something about a security key exchange. Removing the device, getting the hub as close as possible, and then re-pairing SEEMED to fix it.

I would do everything you can to not get a new hub…that is going to be a lot of work to re-pair everything.

Agreed. Based on everything I’ve read, not only is the migration complicated and time-consuming, it seems unlikely to resolve the problem. I’ve asked Samsung to hold the hub replacement until I’ve done more investigation. BTW, I did re-pair my devices, but never saw any note about the security exchange. I’ll look again.

@WB70 I had tried removing various combinations of the Z-Wave Plus devices, but hadn’t thought to try removing them all at once because my assumption was that without those devices going offline nothing else bad would happen. That’s probably worth confirming. Also, I did disable any WebCoRE pistons that were created shortly before the problem started, but it’s worth a try to disable them all.

@Navat604 Since these are Z-Wave and not Zigbee devices, how would enabling insecure rejoin make a difference here? Also, fortunately or unfortunately, these devices are already on a generic DTH supplied by ST.

@SteveWhite Thanks for the ticket number. I’ll add that into my communications with Samsung. Your problem looks a little different, but may be close enough that they’re related. Oddly enough, my non-plus GE indoor wall switches have (so far) not given me any trouble.

Thank you everybody for the input!!

Z-Wave Plus does a lot of things differently than regular Z-Wave, in particular explorer frames. So it’s not impossible that there’s something going wrong there. But I would suspect either the hub itself or one of the individual other devices.

Speaking as a former field tech, I would start by removing all five of the Z-Wave Plus devices. Then run a Z-Wave repair and see if you get any error messages.

If not, then just leave things alone for two days and see if you still have problems with the devices jumping on and off. If you do, then, yeah, you probably have to replace the hub. Which is bad, but at least you’ll know it’s required.

If you don’t have any problems, then add in one of the Z-Wave Plus devices but put it close to the hub. Run a Z-Wave repair. If it’s clean, leave the device alone for two days. Then move it at least one hop away and run another Z-Wave repair. If it’s clean and after two days you have no problems, move on and add a second device. Again, start with it close to the hub.

Obviously it’s going to take two weeks to work through all your devices this way, but at least you don’t have to change all your pistons and if you do have a bad device that should isolate it.
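
To get a feel for how long the one-at-a-time approach could run, here is a rough Python sketch of the schedule described above. The two-day soak period comes from the post; the start date, the placeholder device names, and the `build_schedule` helper are hypothetical, and it assumes every repair comes back clean and every step is followed literally.

```python
from datetime import date, timedelta

SOAK_DAYS = 2  # observation window after each Z-Wave repair, per the post above

def build_schedule(devices, start):
    """Return ([(day, step), ...], finish_day) for the isolate-and-re-add procedure."""
    steps = ["Remove all Z-Wave Plus devices, run Z-Wave repair"]
    for name in devices:
        steps.append(f"Add {name} close to the hub, run Z-Wave repair")
        steps.append(f"Move {name} at least one hop away, run Z-Wave repair")
    # Each step starts once the previous step's soak period has passed.
    schedule = [(start + timedelta(days=i * SOAK_DAYS), step) for i, step in enumerate(steps)]
    # The last soak period ends SOAK_DAYS after the final step begins.
    finish = start + timedelta(days=len(steps) * SOAK_DAYS)
    return schedule, finish

start = date(2018, 1, 3)  # arbitrary example start date
devices = [f"device {n}" for n in range(1, 6)]  # placeholders for the five devices
schedule, finish = build_schedule(devices, start)
for day, step in schedule:
    print(day.isoformat(), step)
print("Finished on", finish.isoformat(), f"({(finish - start).days} days)")
```

Taken literally (one baseline soak plus a near-hub soak and a one-hop soak for each of five devices), this comes to 22 days, so "two weeks" is a ballpark. Skipping the one-hop step for devices that cannot be moved shortens it.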

But there are a lot of different ways to approach this problem, that would just be my first approach.

@JDRoberts Thank you for the suggestions. It does appear to be a long process and I just hope I can figure it out before my wife shoots me. :grinning: Since I can’t move some of the devices, I guess I’ll have to get a long ethernet cable and move the hub.

In the meantime, unfortunately, support has made my life even more difficult. In an email yesterday, they said “We’ve taken a look on our end and we noticed some oddities on the Location portion. We’ve gone ahead and corrected them for you.”

I didn’t notice immediately, because I had disabled all my pistons as a test, but now WebCoRE is a mess because the timezone and zip code are missing from my hub location. The time that shows up in the IDE is now UTC, and pistons are either reporting the wrong time or unable to find weather details and alerts because of the lack of location information.

I tried moving, saving, then resetting the location in the ST App. But that didn’t help. I’m contacting support (again) to see if I can at least get that fixed.

Wow, everything has been great for so long and now I’m herding cats …

I wouldn’t move the hub at this point, that adds too many variables all at once.

Instead, if there’s a device that can’t be moved, like a light switch, then just leave it off the network until last. If it’s a Z-Wave Plus device, you should be able to add it in place anyway. But we’re looking for oddities so let’s take the easier stuff first. :sunglasses:

I’m seeing some similar behaviors. I have a first-gen hub and quite a few devices keep falling off the network. Rebooting the SmartThings hub (or rebooting its internet connection) seems to bring them back online temporarily, but once it gets into the state where a bunch of devices aren’t available, the routines don’t run correctly. Some of the device commands are ignored, some aren’t, and it doesn’t appear to be related to the device availability state. For example, my thermostat says it’s online, but it’s not getting the temp changes when a routine is run. The biggest issue is that one of the door sensors likes to report the door as open when the hub gets in this state, setting off the alarm. Hubby is not amused.

Looking at the list of unavailable devices, they appear to be all of my GE in-wall switches and two GE plug-in lamp modules.

Suggestion to better troubleshoot?

One step at a time. Troubleshooting 101 in my world.

Difficult to find the source or root cause when trying to perform three things at a time. :grinning:

Sorry, the comment about the long ethernet cable sounded facetious in my head but clearly should have been accompanied by a second emoji or just left unsaid. One of the devices in play is around several corners and up a long flight of stairs (although it ends up being almost directly above the hub) and a long cable was never really a practical idea.

Just a curiosity (and pardon my ignorance since I’ve never really made a study of Z-wave technology): If a device goes offline, why would rebooting the hub restore it?

Because SmartThings is primarily a cloud-based system. It’s not just a Z-Wave controller. In fact, “device health” is entirely a cloud-based feature, as the “offline” status is maintained in the cloud account based on the last time device status was reported. And of course all rules run in the cloud except some (but not all) of the official Smart Lighting feature and a few bits of Smart Home Monitor.

So sometimes, but not always, rebooting the hub can resync the hub and the cloud account and clean up some issues that way. But it’s a SmartThings idiosyncrasy, not because of Z-Wave itself.
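
As a rough illustration of the idea above (an "offline" flag derived purely from the time of the last report), here is a small Python sketch. It is not SmartThings' actual Device Health logic: the `is_offline` function, the 30-minute threshold, and the device names are all assumptions made up for the example.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the real check-in window is not stated in this thread.
OFFLINE_AFTER = timedelta(minutes=30)

def is_offline(last_report, now, threshold=OFFLINE_AFTER):
    """Treat a device as offline if it has not reported within `threshold`."""
    return now - last_report > threshold

now = datetime(2018, 1, 2, 10, 17)
# Last report times as a cloud account might record them (made-up values).
last_seen = {
    "hall switch": datetime(2018, 1, 2, 10, 10),  # reported 7 minutes ago
    "porch plug": datetime(2018, 1, 2, 9, 0),     # silent for 77 minutes
}
status = {
    name: ("OFFLINE" if is_offline(ts, now) else "ONLINE")
    for name, ts in last_seen.items()
}
print(status)
```

Under a model like this, a device that is working fine on the mesh can still show as offline if its reports never reach the cloud record, which would fit with a reboot and resync sometimes clearing the flag.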

I wouldn’t think this would affect you since it seems to be mostly the newest accounts, but there is a major platform outage that started yesterday and is continuing. It’s affecting many users, but not all, with delayed or lost switch commands among other things. So we should mention that one troubleshooting step is to check the first bug reports page in the community-created wiki (for early notification and links to community discussion):

http://thingsthataresmart.wiki/index.php?title=Bug:_First_Reports

And the official status page, which doesn’t list everything and often doesn’t get updated until the solution is found, but can confirm problems existed:

https://status.smartthings.com

Wish I’d known about that page sooner. I would have realized the timezone issues I was experiencing yesterday weren’t just my problem. :grinning:

First rule of SmartThings: if something that used to work stops working, it’s probably not just your house. :wink: