Slow Z-Wave Device Response to Commands

I have sent this to Support but my guess is I’ll get nothing constructive back. Posting here just to see if maybe others have the same issue. I waited several weeks to see if this was just a passing problem - I have seen slow responses to commands before but hub reboots or simply waiting usually saw things clear up. But this problem has been going for some time and is fairly consistent so I decided it was time to report.


For several weeks my Z-Wave devices have been erratic in their response times to commands from within the SmartThings Classic app and via Alexa.

If I issue a command via the ST app or via Alexa, the applicable device can take up to several minutes to respond (turn on, off, etc.). If I then issue another command to that same device fairly soon after, the device reacts almost immediately. Over time the problem will return for that device. This appears to be consistent across many devices, although I have not checked every one (I have many Z-Wave devices).

I see a similar issue when devices are being sent commands by the Smart Lighting app. A command will get issued and it may be a minute later before the device responds.

I ran a Z-Wave repair today to see if that would help but it has not. The Z-Wave repair had just 4 failures, none of which indicate a bad device:

* Plant Shelf: Failed to Update Mesh Info
* Game Room Overhead: Failed to Update Mesh Info
* Beverage Fridge Outlet: Failed to Update Mesh Info
* Family Room Fan: Failed to Update Mesh Info

Can you look at my set-up and see what may be causing the very slow response to commands?

My Zigbee devices do not have the same issue.


If anyone has any troubleshooting steps they can think of I’d appreciate it.

Does this include devices running locally? And if so, does disconnecting from the Internet change the observed behavior?

Do you have any energy reporting zwave devices? If so, how often do they report?

Do you have two Aeotec Energy reporting devices within 1 hop of each other?

Do you have anything doing polling, refreshes, or battery level checks for any of your zwave devices?

What are the last 3 zwave devices you added to your set up and when did you add them?

  • Yes, many running locally. I will try the disconnect idea when I get the opportunity.
  • Yes. They report at varying intervals but I have not changed any intervals in over a year. The whole home HEM is every 15 seconds. I have about 5 other Z-Wave devices that report power no more frequently than every minute.
  • I have 3 Aeotec devices that do energy reporting, yes. Not sure how I would know about the hop.
  • I poll two devices every minute and always have. These are LAN devices representing my receiver zones. I have a webCoRE piston that checks battery status every 12 hours and is also triggered if a battery level changes on a device.
  • GE Outdoor Outlet on 11/28/18. Stock DTH.
  • GE Switch on 10/18/18 (it had dropped off my Device list so this was added after doing an exclude first*). Stock DTH.
  • GE Switch on 09/28/18 (it had dropped off my Device list so this was added after doing an exclude first). Stock DTH.

* More on this … I have had about 4 occurrences of this over a year. It’s not the issue of the device still showing but becoming unresponsive (the failing GE device thing), rather the device simply disappears from my SmartThings. Support have never been able to explain why.

@Nezmo

Are you able to run zwave repair to the point where there are no errors? If you run zwave repair 4 times in a row, can you get a clean zwave repair?

I have to run zwave repair 4 times in a row to get a clean zwave repair.

I will give Z-Wave repair more attempts shortly. For me it takes anywhere from 30 to 45 minutes to complete so I won’t have any quick answers.

Thanks.

@Nezmo
Do you have a V1 hub or v2 hub?

I have a V2 hub with around 95 devices and a repair takes 8 minutes.

V2. I have about 90 Z-wave devices.

Second Z-Wave repair run just now.

Started: 2019-01-31 3:56:25 PM
Finished: 2019-01-31 4:29:34 PM

Four different devices failed to update mesh info this time. No other errors.

I do recall that the repair is not recommended when you have many devices so who knows.
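For reference, the elapsed time between the two timestamps above can be worked out with a few lines of Python. This is only a convenience sketch; the function name `repair_duration` and the timestamp format string are my assumptions, not anything the hub or app provides.

```python
from datetime import datetime, timedelta

# Assumed format, matching the timestamps as posted: "2019-01-31 3:56:25 PM"
TIMESTAMP_FORMAT = "%Y-%m-%d %I:%M:%S %p"


def repair_duration(started: str, finished: str) -> timedelta:
    """Return the elapsed time between two posted repair timestamps."""
    start = datetime.strptime(started, TIMESTAMP_FORMAT)
    end = datetime.strptime(finished, TIMESTAMP_FORMAT)
    return end - start


duration = repair_duration("2019-01-31 3:56:25 PM", "2019-01-31 4:29:34 PM")
print(duration)  # 0:33:09
```

That is a little over 33 minutes, within the 30 to 45 minute range mentioned earlier.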

I had this issue for 2 weeks at the end of December; ST support was really no help as all they wanted to do was reset my hub and start again! All 100+ devices.
Magically it started to respond in a timely fashion the first week in Jan and worked as usual until 2 weeks ago, when every Z-Wave device went to a 1-10 minute response time again.
These are running on a V2 hub locally. I have tried excluding and adding, changing device drivers, Z-wave repair, Z-wave repeater etc. but no change.
All other devices work fine. Motion detectors register immediately and ActionTiles updates with the light turning on…5 minutes later the light really comes on.
I have noticed if you try to turn off a Z-wave light in the ST app sometimes the blue button will go out and other times it just stays blue no matter how many times you press it.
This has to be a software issue with the hub and the z-wave firmware but ST seems unable to dig deeper. They must have software tools to trace command interactions. I work for a telecom and we can trace everything to the byte.
The really strange thing is that after I put the house to sleep at night (and wait for all the lights to go out), the next command is in the morning, when I trigger the bathroom light to come on and change mode to “morning”. This piston makes the bathroom light come on every time without any delay; 10 minutes later, after a shower, I walk down the hall and everything is back to dead.
It’s like the hub cache can only handle 1 command every 5 minutes and then things back up.

PS. The other strange thing is all the Z-wave devices periodically all go offline in device health and return later on.

Much of what you describe daven is what I am seeing too.

I feel this has to be something screwed-up at the DB level in the cloud. But what I cannot understand is how my truly local stuff also has the problem. As JD suggested, I will try pulling the Interwebs plug and testing. I just need to pick a good time on that test as if I do it at the wrong time my wife will beat me up :wink:.

I can honestly say confusion reigns.
I got up this morning and all the Z-Wave devices are working perfectly.
It’s a great bug, 2 weeks on and 2 weeks off.

I wish I could say the same.

And of course … zero response from Support so far. When it does come I am sure I’ll be asked to reset all of my 215 devices. Not going to happen.

I would most definitely tone down the HEM reporting. Try as big a delay as you can live with, but not every 15 seconds! I have very strong suspicions that my energy reporting devices are causing Z-Wave mesh network issues.
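To put rough numbers on it, here is a back-of-the-envelope sketch of how many reports per hour the intervals mentioned earlier in the thread would produce (one whole-home HEM every 15 seconds, about five other devices reporting at most once a minute). The function names are made up for illustration, and I am assuming one message per report; a meter that sends several values per interval would generate more traffic than this.

```python
from typing import Iterable


def reports_per_hour(interval_seconds: float) -> float:
    """Reports one device sends per hour at a fixed reporting interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return 3600 / interval_seconds


def total_reports_per_hour(intervals_seconds: Iterable[float]) -> float:
    """Sum of hourly reports across all devices."""
    return sum(reports_per_hour(i) for i in intervals_seconds)


# Intervals as described in the thread (upper bound for the five 60 s devices)
current = [15] + [60] * 5
# Hypothetical change: slow the HEM down to once a minute as well
relaxed = [60] + [60] * 5

print(total_reports_per_hour(current))  # 540.0
print(total_reports_per_hour(relaxed))  # 360.0
```

Under those assumptions the HEM alone accounts for 240 of the 540 reports per hour, so it is the obvious first thing to dial back.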

Also, try to turn off Device Health. I haven’t used it in a long time because when it was on, I frequently had devices becoming unavailable even though they were fine. Maybe a coincidence but life has been way better without. Once it’s solid I will try it again.

Another thing is to review your webCoRE pistons. It is very possible that you are overstressing the mesh network. Start with any pistons that do things recursively or that involve large numbers of devices.

I hear you but I have changed nothing on this front in over a year. I can’t see how this could be the issue unless something on the back end at ST has changed.

I don’t have an issue with things showing offline. My problem is response times to commands for devices that show online.

This makes me wonder. My webCoRE dashboard takes FOREVER to load. I put it down to issues just with the webCoRE server because response times vary by time of day and for some time many of us could not get our webCoRE dashboards to even load. Mine loads now but it’s terribly slow much of the time. I’ve only added one or two new pistons in the last month.

But I think the dashboard loading and any load pistons might be putting on my network are likely two different issues. Or are they? Maybe I’ve reached some threshold where I need to break up my webCoRE into different instances? This could be tricky to figure out.

Then again, my webCoRE dashboard taking ages to load could be a symptom of issues with my Z-Wave network rather than the cause …

How much has your network grown this past year? Think number of Z-Wave devices, energy reporting devices, and any automation you introduced. The frequency of reporting might have been fine at first but may be causing problems now.

My network grew to a couple hundred devices and lots more automation. Mesh network congestion has become a big problem for me.

As for webcore dashboard performance… Strange. Mine is typically fast to load. Performance of loading the dashboard and running the pistons should be unrelated. Try to pause suspect pistons to see if things improve.

Hard to quantify but maybe 10-15% as a guess.

I do get your point and I will do some digging around this.

We think alike. I’ve already started trying this.


I really appreciate all the suggestions and I will keep looking. I do think, though, that there is something wrong at the cloud level. Time will tell.

The symptom of devices disappearing altogether is really weird, and not likely Z-Wave. So I don’t know what to say about that.

It is common that residential setups have a lot more devices going on and off during winter’s cold and dark weather, which also means a lot more energy reporting if it is set to report any changes.

If two Aeotec energy reporting devices attempt to use each other as repeaters, there have been a number of reports in the forum that they basically flood each other and choke. I haven’t tested it myself, but some people have solved a lot of Z-Wave problems, particularly local lag, just by removing those devices until only one was left.

@tgauchat has some more data on what energy reporting devices do to network traffic.

But none of that would explain the devices disappearing from the account altogether.

Yes, I think that the issue of them disappearing is another issue altogether and not related to the very slow response to commands I’m trying to resolve here.

My Aeotecs (3 smart switches) have never been great. I have never pinned down if it’s the devices or DTHs I’m using but they have never been real responsive to any commands. I may try using stock DTHs to eliminate that but if the routing is really an issue I may just have to trash them. Peanuts (Zigbee) are more reliable, a quarter of the price and quite frankly the power reporting of the Aeotecs and DTHs I’m using is unreliable so I would miss nothing with the Peanuts anyway.

Thanks for the info JD. I had not read about the routing issues with Aeotec energy reporting devices.

Have you tried restarting the hub (a power cycle and not just a “reboot”)? That often helps. If you’re using cloud DTHs/SAs, rebooting the router/modem also helps reduce latency. If the platform itself is running slow (or your shard is running slow), then it may be worth reporting it to support.

Not recently. I’ll give that a try, thanks.

I did report to Support this week so we’ll see. No response yet.
