Hub Backup woes

Following on from:

For the third day running a supposedly offline hub has triggered an automatic backup without telling me, let alone asking me first, leaving all my local VIRTUAL devices in an incorrect state and turning off monitoring on Todd’s LAN Monitor devices. And having spent hours last night reinstalling my Matter devices, every single one attached to my hub group (Thread or Wi-Fi) now shows offline again, while all are still available to Google and to a standalone hub.

WTAF?

Oh and I’ve just been told that two lights turned on by themselves, only one of which could be turned off again by a button and automation.

Three Meross Matter over Wi-Fi plugs have now recovered by themselves after about fifty minutes. Correction: They were marked online but had the wrong state and couldn’t be controlled.

I’ll turn off the automatic failover to backup for the moment as it seems to be more trouble than it is worth.


Looking back at my notifications I see that my hubs are being reported as ‘disconnected’ and ‘connected’ rather than ‘offline’ and ‘online’, which might mean they are perceived as not communicating with each other. However, all three of my hubs seem to have been considered disconnected at various times while acting as the primary, triggering a failover every time, so I am wondering what the algorithm is.

Still can’t understand why only the SmartThings hub group is struggling with its own Thread network when other hubs and platforms are fine using it.

I suppose I ought to complete my monologue.

My ISP has been using ‘Wi-Fi Optimization’ for years, meaning if you use their supplied routers (in the UK it is common to be supplied with an integrated Wi-Fi/router/modem) they twiddle with the settings remotely from time to time, supposedly to give the best performance for their customers. The reality is that the last thing many customers want is their carefully thought-out manual settings being splatted, especially when they have Zigbee and Thread networks to consider. Fortunately it was possible to request they desist with their well-meant but ill-informed and damaging tweaks. Unfortunately it seems that they no longer allow this service to be switched off. Critically, it is pretty clear they have now gone further and turned it back on again. I believe this is the root cause of my problems. In my typical UK home signals don’t carry very far (a 5 GHz Wi-Fi signal, for example, has a range of the order of eight yards/metres) and there are plenty of competing signals from neighbours. The last thing you need is a sudden change to your Wi-Fi channels being thrown into the mix. So I think what I was experiencing is a loss of connectivity between the three V3 hubs in my hub group.

I have disabled Wi-Fi on the ISP router so they can’t screw with it again and installed a new router in access point mode. I probably should have done this a long time ago.

A curious thing to me is that looking through my history reveals each of the three hubs having a spell as the Primary hub, and each having been automatically replaced because they were considered to be disconnected. I am unclear how this is identified. How is it determined that it is the Primary that is disconnected and not the Secondaries?

I was largely using three hubs in the group to compensate for the loss of Zigbee routing capacity as I replaced mains-powered Zigbee devices with Matter over Thread. However, I have also added a number of Sonoff ZBMicro USB smart plugs and I have seen those repeating for fourteen devices each, so I probably don’t need to be quite so concerned about that. In the absence of an obvious replacement location for one of the hubs I have removed one hub from the group.

I thought that there was supposed to be a ten minute delay before an automatic hub backup took place, during which there was a notification about what was about to happen. Well I’ve never encountered that. All I have seen is a notification that the old Primary is disconnected at the same time as a new one has quietly taken over.

After a number of automatic backups the disconnected original Primary has been reported as connected again and it has been confidently stated that it would be reinstated in minutes. Only once have I seen this actually happen.

After my most recent automatic hub backups I have discovered that my local VIRTUAL devices no longer have the same states and also my Security mode, which depends on those states, has changed. This suggests that actual events may have been generated. My Edge drivers do generate synthetic events but only during the ‘added’ lifecycle and I don’t expect that to occur. I also use Todd’s LAN Device Monitor driver and I notice those devices have their monitoring attribute set to off after the backup. This isn’t good.
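
For anyone wanting to check for the same drift, here is a minimal sketch of comparing device states noted down before and after a backup. It assumes nothing about the SmartThings API: the snapshots are plain dictionaries typed in by hand, and the device names and the `diff_states` helper are made up purely for illustration.

```python
def diff_states(before, after):
    """Return {device: (old, new)} for every device whose state differs.

    A device missing from one snapshot shows up with None on that side.
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name)
        new = after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Hypothetical snapshots recorded by hand; names and values are invented.
before = {"Away switch": "on", "Guest mode": "off", "LAN monitor: NAS": "on"}
after = {"Away switch": "off", "Guest mode": "off", "LAN monitor: NAS": "off"}

for name, (old, new) in diff_states(before, after).items():
    print(f"{name}: {old} -> {new}")
```

This only shows which states differ, not why, so it can’t tell a real event from a state that was restored incorrectly.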

Far more damaging is that after my most recent hub backups I have found that all my Matter devices are offline, though only as far as the hub group itself is concerned. Standalone hubs in the same Location with the same devices installed by multi-admin still see the devices as being online and can control them (except once they lost them after that hub was rebooted). The Matter devices were all absolutely fine via Google Home. These devices are all on the hub group’s Thread network so it means Google is happily using the TBRs on the hubs even though SmartThings isn’t.

Ideally it would be possible to add SmartThings hubs to existing Thread networks without creating a hub group. As things stand, the ability to have two or more hubs acting as TBRs on the network is by far the strongest selling point of hub groups and it is the sole reason I still have one.

The automatic hub backup (or failover) is a good selling point for hub groups but only if it works absolutely flawlessly. As things stand it appears to me that it is far more trouble than it is worth and I have completely disabled it.
