Many zigbee devices randomly go offline, network very unstable

I am using approximately 50 Zigbee devices in my home. Most of them are smart lamps that only switch on/off, but there are quite a few other devices as well.
Out of these devices:

  • The smart bulbs/lamps randomly go offline. Every few days, one or two lamps go offline, and not always the same ones. I need to cut the power and restore it to bring them back online.
  • Often the whole network crashes. No device responds anymore, and sometimes a command I spammed to try to make it work is suddenly executed many times, as if the hub v3 is working through a backlog of commands.
  • Sometimes I have to restart devices to make them work (they don't show as offline like the smart lamps do), and sometimes I have to restart the hub v3.

Do more people experience these issues? I am getting really fed up with the stability of my smart home lately, so I would love some help understanding why this is happening, and perhaps some tips to solve it.
The Zigbee network should not be too busy: I have 3 solar-panel plugs that stop spam-reporting when it's dark, and the issues still persist when it's dark. But I have no idea what the issue is.

Thanks!

Firmware 48.x is being pushed out this week. It may help your Zigbee devices if you have a v2, v3 or Aeotec hub.

I have V2 hubs in two different locations, one on production firmware 47.11 and the other on beta 48.03. Neither has the Zigbee issues you describe: nothing has dropped off, and no crashes or reboots have been required. The Zigbee network seems responsive and stable on both.

Strange…
I'm getting really frustrated with the “ghost in the house”.
Today, the frient motion sensors no longer register motion according to SmartThings. After walking past 5 times, the fifth pass is detected, but it does not trigger the lights to turn on, even though the second condition (maximum lux level) is met.

My Philips Hue outdoor sensors with the custom Edge driver always go offline within a few hours of being re-added.

Issues non-stop. It is very tiring, as we are using a smart home for convenience, not for constantly fixing this sh**…

Lastly, I am still having the issue where some power-metering outlet plugs, after days or weeks, suddenly read 0.1 W instead of 100 W (so reporting itself works; all reported values are simply scaled up or down by a 100x factor).
I can resolve this last problem by unplugging the plug, waiting, and plugging it back in. Switching it off/on does not fix it.

I'm on 48.3 and have started to have Zigbee issues. My Zigbee devices were rock solid for quite a while, but I now always seem to have one Sonoff relay offline, and all of my Aqara FP1s are unresponsive, as are some of my water sensors. I also recently received an out-of-memory message (V2), which had not occurred for quite a while. Except for the Sonoff, all my issues are with end devices.

What can I do to increase stability or find out which device or driver is causing this? Any other tips?

I am going crazy. My network was working the whole day, but in the evening, around when the sun went down, nothing was responsive. One motion sensor is still stuck on “motion detected” even though I left the room 3 minutes ago. Lamps and devices were not responding for around 30 minutes, and are now slowly becoming responsive again.

Also, every day a random Zigbee light goes offline, and it's annoying that I have to power cycle it to get it working again.

Help would be really appreciated.

I would leave a logcat window running (via the CLI) on your hub for all drivers. It will be noisy/chatty, but you may be able to figure out what's going on. Perhaps drivers are being killed/restarted due to low memory. Perhaps there is some other communication issue or timeouts. Perhaps you have a device (or devices) spamming the crap out of the Zigbee or Z-Wave networks.

Is there a way to store all the logcat output in a text file or something similar?
Can you please guide me on how to log into the hub and use the logcat command? It's been a while.

Thanks!

smartthings edge:drivers:logcat --hub-address=x.x.x.x --all

See the SmartThingsCommunity/smartthings-cli repository on GitHub (Command-line Interface for the SmartThings APIs) for more information.

If you specify everything on the command line, the CLI won't need any input and you can redirect all output to a file with the “>” operator.

smartthings edge:drivers:logcat --hub-address=x.x.x.x --all > log.txt
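Once the output is in log.txt, a short script can show which minutes were the busiest, which may help line the log up with the moments the network stopped responding. This is only a rough sketch in Python: it assumes each log line starts with an ISO 8601 timestamp (for example 2023-06-01T19:42:07...), which may not match your CLI version, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumption: every log line begins with an ISO 8601 timestamp such as
# 2023-06-01T19:42:07.123+00:00. Lines that do not match are skipped.
MINUTE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2})")


def lines_per_minute(lines):
    """Count log lines per minute, keyed by 'YYYY-MM-DDTHH:MM'."""
    counts = Counter()
    for line in lines:
        match = MINUTE_PREFIX.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def busiest_minutes(lines, top=10):
    """Return the `top` busiest minutes as (minute, line_count) pairs."""
    return lines_per_minute(lines).most_common(top)
```

To use it on the saved file, open it and pass the file object in, for example `busiest_minutes(open("log.txt", encoding="utf-8", errors="replace"))`. If the busiest minutes coincide with the outages, that points towards traffic volume rather than a single broken device.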

Use the API Browser to view your Zigbee device details. Many of mine showed the provisioning status as Non functional or Not Provisioned. I removed them and onboarded the devices again. Some offline end devices came back online while I was doing this. I have also noticed that I am seeing fewer “this device has not updated” messages, and performance appears snappier.
It's only been an hour since I finished, but things have settled down a little bit.

The provisioning status isn’t reflective of much. If you change drivers, it may not get set correctly. Lots of the community drivers don’t set it correctly if they override certain functions.

I had reported this issue before, in the thread “Provisioning State Provisioned VS Nonfunctional” (post #7 by mlchelp).

I think it's a red herring; rejoining may simply have solved other, unrelated issues.

Have you checked channel interference?

It may be worth checking for Zigbee/Wi-Fi channel overlap, and also that multiple Zigbee hubs (Hue and ST, for example) are not running on the same channel(s). Ensure that all Zigbee and Wi-Fi network channels in and around your home are spaced as far apart as possible to minimise interference.

I have found this can contribute to the symptoms you are describing, as can having Zigbee devices close to active Wi-Fi devices (streamers streaming, for example), regardless of channels.

A quick Google will give information on which channels correspond across protocols.
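For anyone who would rather calculate it than look it up, here is a small Python sketch of the overlap check. The centre frequencies follow the standard 2.4 GHz channel plans (Zigbee channels 11 to 26, Wi-Fi channels 1 to 13). The 22 MHz Wi-Fi width and 2 MHz Zigbee width are simplifying assumptions, and the function names are only illustrative. It says nothing about signal strength, so treat it as a first filter rather than a site survey.

```python
ZIGBEE_CHANNELS = range(11, 27)   # 2.4 GHz Zigbee channels 11..26
ZIGBEE_WIDTH_MHZ = 2.0            # assumed occupied bandwidth per Zigbee channel
WIFI_WIDTH_MHZ = 22.0             # assumed (conservative) Wi-Fi channel width


def zigbee_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Zigbee channel (11..26)."""
    if channel not in ZIGBEE_CHANNELS:
        raise ValueError("Zigbee channel must be 11..26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Wi-Fi channel (1..13)."""
    if not 1 <= channel <= 13:
        raise ValueError("Wi-Fi channel must be 1..13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel, wifi_channel):
    """True if the two channels' assumed frequency ranges intersect."""
    gap = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return gap < (ZIGBEE_WIDTH_MHZ + WIFI_WIDTH_MHZ) / 2


def clear_zigbee_channels(wifi_channels):
    """Zigbee channels that overlap none of the given Wi-Fi channels."""
    return [z for z in ZIGBEE_CHANNELS
            if not any(overlaps(z, w) for w in wifi_channels)]
```

Under these assumptions, `clear_zigbee_channels([1, 6, 11])` leaves Zigbee channels 15, 20, 25 and 26 free of overlap, so a hub sitting on, say, channel 13 next to a busy Wi-Fi channel 1 would be worth a closer look.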

It appears that you are probably correct: 7 devices are offline this morning. Many of the devices that showed the provisioning status were converted to Edge without being removed, so the status was accurate, and the devices did all work. I believe it is a firmware issue, as it all started with the hub reboot after 48.3 was installed.

So I have to wait for a firmware update to solve the issues?
It's sad that they are not testing these updates on homes with lots of devices.

This evening, my presence sensors kept getting stuck on “no movement/presence detected” even though I was there. So as I was relaxing after an intense day, all E27 Zigbee bulbs, around 6 Zigbee outlet plugs and the TV turned off.
It's super annoying and weird that these presence sensors were stuck while SmartThings executed the no-presence routine just fine… Yet in another room, a presence sensor correctly reports no motion, but SmartThings does not execute the routine to turn the lights off.

I'm thinking of creating a YouTube video about the ghost in the house: how routines are executed correctly and then, at a random time, are suddenly not triggered. From an outside perspective it's comical how poorly my smart home is functioning right now. Depressing and very frustrating as well, though.

It has become completely unreliable, at completely random times, but at least a few times per day.
Devices also randomly go offline, usually until I power cycle them, and some simply stay offline until I re-pair them.

Edit: Is it possible to downgrade the firmware of the hub?

Nope, once it's there, it's there.

Do you have any Sonoff 01MINIZB relays? I have some and they seem to be the only Zigbee repeater devices that are going offline, all the others are end devices.

Nope. I have many Zigbee bulbs of different brands, and lots of outlet plugs. The outlet plugs never go offline, but sometimes they simply don't respond. When they don't respond, the rest of the Zigbee devices in the home are unresponsive as well.
I am going to start looking for errors with logcat soon. I hope it's possible to find the problem that way.

EDIT: Never mind, I got it working; it's now logging everything.
The files are huge, though, with gigantic output: 1 minute is ~400 KB of text data.

What in the heck type of devices are logging so much activity? Do you have power-reporting devices reporting too frequently? That will make your mesh unresponsive.
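One way to answer that from a saved log is to count lines per device and look at the top of the list. A rough Python sketch follows. It assumes the driver log lines carry a device tag shaped like `<ZigbeeDevice: id [0x1A2B] (Device Label)>`; that pattern is an assumption about the log format, and so is the helper name, so adjust the regular expression to whatever your log actually contains.

```python
import re
from collections import Counter

# Assumption: device-related lines contain a tag ending in "(Device Label)>",
# for example "<ZigbeeDevice: abc-123 [0x1A2B] (Solar Plug)>".
DEVICE_TAG = re.compile(r"<\w*Device: [^>]*?\(([^)]*)\)>")

NO_TAG = "(no device tag)"


def top_talkers(lines, top=10):
    """Return the `top` device labels ranked by number of log lines."""
    counts = Counter()
    for line in lines:
        match = DEVICE_TAG.search(line)
        counts[match.group(1) if match else NO_TAG] += 1
    return counts.most_common(top)
```

Pass it the open log file, for example `top_talkers(open("log.txt", encoding="utf-8", errors="replace"))`. A power-reporting plug that accounts for a large share of the lines would be a natural first suspect.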

With about 100 devices my logging is pretty quiet.