[ST Edge] Devices stop reporting values after firmware update

I’m having random driver crashes after multiple days of running flawlessly, and I’m really scratching my head trying to debug these. I first noticed it on a LAN driver, where after running for anywhere from 2 to 7 or more days it would just stop and become non-responsive. When I set up a Pi to monitor logs for multiple days I found that it stopped in the middle of emitting a series of events (all identical to events generated a thousand times before by the driver) with nothing in the logs indicating a crash. It just stopped. The only way I was able to recover was with a hub reboot. This has happened a few times with that driver, though I only caught it once with the Pi running logcat.

I’m also running 3 Zooz 4-in-1 and 6 Q sensors on a lightly-modified version of the stock ST z-wave sensor driver. Sometime around 3am today they all stopped reporting, which I noticed around 1:30pm. I wasn’t actively logging that driver, so no clue if logs would have shown anything, but a hub reboot this afternoon immediately got them all reporting again. I’m pretty certain that I had the same problem with that driver a week or so ago, but at the time I only noticed it on the Qs and blamed it on them being newly added devices instead of a driver issue.

This is really painful to troubleshoot since it takes a random number of days before it happens. Seeing it on two drivers now, though, makes me think the cause is something outside of my control.

If your LAN driver is using HTTPS, there is a known issue where the driver “eats” sockets (file descriptors). You can read about it here: Edge HTTPS - Maximum number of sockets reached. BUG? - #2 by erickv

Did the 40.6 firmware not fix the problem then?

No, that one uses TCP.

Hi @philh30 -

Any chance that your hub rebooted? That would stop logging in its tracks, making it look like the driver hung. And if it did reboot, you wouldn’t catch the driver restart in the logs, since there’s no way to get the logger restarted in time. Unfortunately, the known cosock bug can wreak havoc on LAN driver restarts or updates, and the new firmware (000.041.00004) did not fix it.

This is weird indeed.

If you have already checked what the others mentioned, I can create a report for this issue. To do so, I need you to send me, by DM, the driver and device IDs that show the issue.

Note: Both values can be found in the Devices list

Also, whenever you see the error again, please note the time and share it with me so the engineering team can check the logs.
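
Since the failures in this thread showed up as log output that simply stopped, one way to capture the time automatically is to pipe the log stream through a small filter that remembers when the last line arrived. The following is only a rough sketch in Python, not anything provided by SmartThings: the `StallWatcher` class, the 15-minute threshold, and the `--watch` flag are all made up for illustration, and it assumes the logs arrive on standard input (for example from the CLI's logcat command running on the Pi).

```python
import sys
import threading
import time
from datetime import datetime, timezone


class StallWatcher:
    """Hypothetical helper: remembers the last log line and flags long silences."""

    def __init__(self, stall_seconds=900.0, clock=time.time):
        self.stall_seconds = stall_seconds  # assumed threshold: 15 minutes
        self._clock = clock                 # injectable so it can be tested
        self.last_seen = clock()
        self.last_line = ""
        self.reported = False

    def feed(self, line):
        """Record that a log line just arrived."""
        self.last_seen = self._clock()
        self.last_line = line.rstrip("\n")
        self.reported = False

    def check(self):
        """Return a message once per stall, otherwise None."""
        silent_for = self._clock() - self.last_seen
        if silent_for < self.stall_seconds or self.reported:
            return None
        self.reported = True
        stamp = datetime.fromtimestamp(self.last_seen, tz=timezone.utc).isoformat()
        return f"no log output for {silent_for:.0f}s; last line at {stamp}: {self.last_line}"


def main():
    watcher = StallWatcher()

    def poll():
        # Check for silence every 30 seconds in the background.
        while True:
            time.sleep(30)
            message = watcher.check()
            if message:
                print(message, file=sys.stderr, flush=True)

    threading.Thread(target=poll, daemon=True).start()
    for line in sys.stdin:          # pass every log line through unchanged
        watcher.feed(line)
        sys.stdout.write(line)
    # Reaching this point means the input stream itself was closed.
    ended = datetime.now(timezone.utc).isoformat()
    print(f"log stream closed at {ended}", file=sys.stderr, flush=True)


if __name__ == "__main__" and sys.argv[1:] == ["--watch"]:
    main()
```

Saved as, say, `stall_watch.py`, it could be used by piping the logcat output into `python3 stall_watch.py --watch` and redirecting stderr to a file. A "no log output" message would point to a driver that went quiet while the connection stayed open, whereas a "log stream closed" message would suggest the connection dropped, which may be what a hub reboot looks like. Either timestamp is the sort of detail that could go into the report.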