Z-Wave Network Repair crashes

As reported earlier by @Kristopher, Z-Wave network repair crashes every time I attempt to run a repair. This is new behavior (for me), apparently since one of the recent platform updates. Is anyone else seeing this? My hub reboots during every repair, and never finishes the repair.

Are you trying from the mobile app or IDE? I just ran one from the IDE, no problems.

In the IDE.

Are you on a V1 or V2 hub? I’m on V1.

I’ve been able to do a few repairs now since the last platform upgrade. Did one this morning after adding a couple of new light switches. The repair went fine without an issue.

I’m on a V2 hub.

Hey @bravenel

In the IDE can you list all of the Hub Events:

https://graph.api.smartthings.com/hub//events

And paste the full list of events from the repair through the restart? I originally suspected it was a bad device on the network that was causing issues, but I have not confirmed it.

I routinely watch that feed during a repair. I haven’t seen anything about a radio error, or anything untoward at all. Just, bam, Hub had disconnected.

If it’s OK - I’d still be curious to see what happens up until the restart. Do you see Err 101? Also, when the hub restarts, do you see anything like a failure to route to a node? Lastly, it’s probably worth trying the repair while you have the Z-wave Debug on. Just open this URL:

https://graph.api.smartthings.com/hub/zwaveDebugOn/

My repair keeps crashing, too and actually worse… It disconnects some of my devices. I had support check and they said I have some “ghost” devices in my network. These are not associated to any real devices. These ghosts caused the repair to crash. They did not find a way to remove them. One solution was to reset and rebuild the network.

Hi @pizzinini – I thought the same about mine as well. I think that may have started the issue, but it seems that removing those devices (tons of attempted repairs, excludes, etc eventually worked) does not cure the issue.

Hi @bravenel,

I had the same issue on hub v1, and I noticed in your other posts on this thread that you are as well. Way back this summer when my issue came up, I opened a support ticket with SmartThings. It never ever got resolved, and I could not do a repair ever again. The issue first started when a zwave repair took almost 3 hours, and then finally it just stopped working.

I think the problem is rooted in all the ghost devices I ended up having, and how I would nuke devices when I first started with ST instead of properly excluding them (my theory). Over time, I think all that activity just built up, and with no Zwave utilities from SmartThings to help diagnose, map, and debug/tweak mesh issues, there was no way of preventing this from happening in my opinion.

Since moving to hub v2, I’ve been very careful in how I manage zwave devices, and so far so good. I have just a handful more zwave devices on v2 than I had on v1, and those are pretty much all battery powered. All the “routing capable” devices that existed on v1 are the same as on v2, and now my repair process only takes 25-30 minutes on v2.

Moving to hub v2 may not be an option for you, but it may be your only option aside from factory resetting hub v1 and starting all over again. I do not believe ST will come up with a resolution for you and the others with this issue, or ever come up with more zwave utilities to help us manage our zwave mesh environment.

That link doesn’t work.

I just had a z-wave network repair finish successfully. Whatever was causing it to crash before didn’t this time. I unplugged the hub for a couple of minutes before doing the repair.

The persistence of Ghost nodes, also called Phantom nodes, is a problem in Zwave. Nobody likes to talk about it, but every controller has the issue. It comes from the fact that the Z wave controller assigns the network IDs, so it keeps track of the devices for itself, and these records can be difficult to get rid of, even if you do have good utilities.

The field engineer trick is to pound the ghost node with requests over a daily cycle. Polling, battery status, whatever. But you need at least two different commands. This doesn’t mean a command every minute, but maybe send the same command repeated 5 times every 15 minutes. What you’re trying to do here is to get the controller and any nodes previously recorded as neighbors to recognize that this node is dead. Not just asleep. Dead.
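To make that rhythm concrete, here is a rough Python sketch that only builds the timetable: one burst every 15 minutes for 24 hours, 5 repeats per burst. Everything in it is an assumption for illustration. The command names `"poll"` and `"battery"` are placeholders, `build_ping_schedule` is a made-up helper, and alternating the command from one burst to the next is just one way to read "at least two different commands". It does not send anything; how you actually issue the commands depends on your controller.

```python
from datetime import datetime, timedelta


def build_ping_schedule(start, commands, hours=24, interval_minutes=15, repeats=5):
    """Return a list of (time, command) pairs for pinging a suspected ghost node.

    Hypothetical helper: it only plans the bursts, it sends nothing.
    """
    if len(commands) < 2:
        raise ValueError("use at least two different commands")
    schedule = []
    bursts = (hours * 60) // interval_minutes
    for i in range(bursts):
        when = start + timedelta(minutes=i * interval_minutes)
        # Assumption: alternate between the commands from burst to burst.
        command = commands[i % len(commands)]
        # Each burst repeats the same command several times.
        schedule.extend((when, command) for _ in range(repeats))
    return schedule


# Placeholder start time and command names.
schedule = build_ping_schedule(datetime(2016, 1, 1), ["poll", "battery"])
```

With the defaults that works out to 96 bursts, so the traffic stays light while still spanning the whole day.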

Then if you have good utilities, the next day, you do whatever the “remove dead node” utility is (it varies from controller to controller), then run the Z wave repair, then reboot the controller, then run the Zwave repair again.

But a lot of times you can’t get rid of the dead node until the controller (and all its various internal back ups) is convinced that it’s really dead.

And even then, it may take a utility to do it.

The alternative, as John mentioned, is to reset the controller itself to factory and then rebuild everything, which is a pain.

But a lot of times, if you can just pump some message traffic to the dead node, that will clear out whatever internal backups the controller has been holding onto that cause it to keep repopulating. It takes both traffic and hours passing to get there, though.

So sometimes, yeah, you’ve been seeing one for a couple of weeks and it suddenly just clears.

Very annoying while it lasts, though.

Some people believe that if you run a regular network repair every night or every week it will reduce the number of persistent ghosts. But I’ve never seen anybody formally study that.

FWIW…


@bravenel The link needs your HubID at the end to work.
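If it helps, this is all the fix amounts to, sketched in Python. I'm assuming the link in question is the zwaveDebugOn one posted above; `zwave_debug_url` is a made-up helper and `abc123` is a placeholder, not a real hub ID.

```python
# Base link as posted earlier in the thread; the hub ID goes on the end.
BASE = "https://graph.api.smartthings.com/hub/zwaveDebugOn/"


def zwave_debug_url(hub_id):
    """Append the hub ID to the debug link (hypothetical helper)."""
    hub_id = hub_id.strip()
    if not hub_id:
        raise ValueError("hub ID is required")
    return BASE + hub_id


# "abc123" is a placeholder; use your own hub's ID here.
url = zwave_debug_url("abc123")
```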
