FAQ: Diagnosing Indoor Network Reliability Problems

Continuing the discussion from Google IoT Platform: Project "Brillo", "Weave", Nest … Google I/O Conference:

I feel your pain, seriously. That sounds awful.

Several thoughts (I worked as a network engineer).

  1. If it were me and none of the following helped, then yes, I would return everything I could. That much failure makes the system worse than useless to me.

  2. Let’s start by seeing what does work. "All home automation is local." Sometimes it’s about the wallpaper on the walls, or the insulation inside the walls, and you just can’t get a signal from one room to the next.

So: how’s the Wi-Fi? Put your smartphone on Wi-Fi and just walk around the house and see what the bars do. If you are using Wi-Fi range extenders, you probably already know there are architectural problems. But let’s just begin with something simple, a Wi-Fi survey of the house.

If, without range extenders, the Wi-Fi is patchy throughout the house, then it’s pretty likely that zwave/zigbee will be also. The protocols use different radio frequencies, and different things interfere with each, but when it comes to drywall, foil backed insulation, water pipes, tinted glass, and other physical barriers, they all have similar drop offs, except WiFi is higher power than zigbee/zwave. So where the WiFi is dimmed, the zwave may be crippled.

If you do find a lot of dead or dim zones, the only option is repeaters in almost every room. It won’t matter what hub brand you choose, you’ll run into the same thing.
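
If you can read signal strength in dBm from a Wi-Fi analyzer app rather than just watching bars, a rough way to keep notes on the survey looks like this. The room names, the readings, and the -67 and -80 dBm cutoffs are my own assumptions for illustration, not official values from anyone, so adjust them for your house.

```python
# Sketch: classify Wi-Fi survey readings per room.
# The thresholds below are illustrative assumptions, not official values.
GOOD_DBM = -67   # assumed: at or above this, the signal is fine
DEAD_DBM = -80   # assumed: below this, treat the room as a dead zone


def classify(rssi_dbm):
    """Return 'good', 'dim', or 'dead' for one RSSI reading in dBm."""
    if rssi_dbm >= GOOD_DBM:
        return "good"
    if rssi_dbm >= DEAD_DBM:
        return "dim"
    return "dead"


def survey_report(readings):
    """Map each room to its classification; readings is {room: rssi_dbm}."""
    return {room: classify(rssi) for room, rssi in readings.items()}


def problem_rooms(readings):
    """Rooms that are dim or dead, weakest signal first."""
    bad = [(rssi, room) for room, rssi in readings.items()
           if classify(rssi) != "good"]
    return [room for rssi, room in sorted(bad)]


if __name__ == "__main__":
    # Hypothetical readings taken with the range extenders unplugged.
    readings = {"living room": -48, "kitchen": -66,
                "bedroom": -74, "garage": -88}
    print(survey_report(readings))
    print(problem_rooms(readings))
```

Any room that shows up in `problem_rooms` is a place to expect trouble from zigbee and zwave as well.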

  3. For the next steps, begin by shutting off any smartapp that is doing polling or refresh. We want minimal traffic on the network.

  4. If the Wi-Fi (without extenders) is fine, then we have to look at the device network that you’re using for the home automation.

Let’s start with the Cree bulbs.

Put one bulb in a lamp one room over from where the SmartThings hub is.

Does it turn on and off with 100% reliability from the smartthings mobile app?

If not, but again there was no Wi-Fi issue in that room, then either the hub antenna is defective or there’s a local interference issue.

I don’t want to go through all the things that might cause local interference in this post (see the repeater FAQ for more on that). Let’s just note the issue for now.

But if you can’t get a single simple zigbee device to work one room over from the hub (not counting the presence fob, which has a separate set of problems), then you’ve got a lot of detective work to do. You’ll probably need to try changing your Wi-Fi channels, look for local interference like fluorescent lights, even borrow a Philips hub from someone and see if you can get one of its bulbs to work in that room.

If zigbee is reliably controllable from the Philips bridge, but your Cree bulb is not reliably controllable from SmartThings, first try a different Cree bulb. If that fails in the same way in the same location, you may need to exchange the ST hub for a different one and try that.

  5. If the Wi-Fi signal was okay, and you could get a single zigbee device to work reliably from the SmartThings app when that device was one room over, it’s time to consider the network layout.

For this test, for technical reasons, let’s switch to a motion detector. I’m assuming it’s Zigbee. Put the motion detector one room over from the hub. Then unplug the hub for 15 minutes, reconnect it, and wait another 15 minutes. This rebuilds all the address tables. Tedious but easy.

Now it’s a matter of walking that device away until it fails. Try 4 or 5 tests in each location.

As soon as you find the place where reliability drops off, leave the motion detector there. Unplug the hub for 15 minutes, reconnect it, and wait another 15 minutes. Then retry.

If the detector works reliably now, fine, it just needed the new neighbors list. Continue the walk away procedure.

Once you get to where it fails and a heal doesn’t fix the problem you’ve reached a zigbee dead zone.
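
For anyone who wants to log the walk-away test instead of trusting memory, here is a minimal sketch. The location names and trial results are made up, and the 100% pass requirement is just the same bar I used for the bulb test above. None of this talks to the hub; you type in what you saw.

```python
# Sketch: log the walk-away test and find where reliability drops off.
# Location names and trial results are hypothetical.

def success_rate(trials):
    """Fraction of trials (a list of booleans) that succeeded."""
    return sum(trials) / len(trials) if trials else 0.0


def first_unreliable(walk, required=1.0):
    """Return the first location whose success rate is below `required`.

    `walk` is a list of (location, trials) in the order you walked,
    nearest to the hub first. Returns None if every location passed.
    """
    for location, trials in walk:
        if success_rate(trials) < required:
            return location
    return None


def is_dead_zone(trials_after_heal, required=1.0):
    """A spot counts as a dead zone if it still fails after the heal."""
    return success_rate(trials_after_heal) < required


if __name__ == "__main__":
    walk = [
        ("hallway", [True, True, True, True, True]),
        ("den", [True, True, False, True, True]),
        ("garage", [False, False, False, False, False]),
    ]
    print(first_unreliable(walk))
    # Retest the den after unplugging and reconnecting the hub.
    print(is_dead_zone([True, True, True, True, True]))
```

If `is_dead_zone` comes back False after the heal, keep walking; if it comes back True, that spot is where the repeater check starts.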

Check within 30 feet line of sight from where you are plus 15 feet through walls and ceilings. What non light bulb zigbee devices do you see? Any that plug in (not battery powered)? If not, put a device that repeats nearby, unplug the hub for 15 minutes, reconnect it, wait another 15 minutes, and retry.
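
The check above can be sketched as a small filter over a device list. I am reading the rule of thumb as 30 feet with a clear line of sight, or 15 feet when a wall or ceiling is in the way; that reading, the `Device` fields, and the sample devices are all assumptions for illustration.

```python
# Sketch: which nearby devices could act as a zigbee repeater for this spot.
from dataclasses import dataclass

LINE_OF_SIGHT_FT = 30  # rule-of-thumb range with a clear line of sight
THROUGH_WALL_FT = 15   # assumed reading: range through a wall or ceiling


@dataclass
class Device:
    name: str
    mains_powered: bool   # plugs in, not battery powered
    is_bulb: bool         # light bulbs are excluded from this check
    distance_ft: float    # measured from the failing sensor
    through_wall: bool    # True if a wall or ceiling is in the way


def is_usable_repeater(dev):
    """Mains powered, not a bulb, and inside the rule-of-thumb range."""
    if not dev.mains_powered or dev.is_bulb:
        return False
    limit = THROUGH_WALL_FT if dev.through_wall else LINE_OF_SIGHT_FT
    return dev.distance_ft <= limit


def nearby_repeaters(devices):
    """Names of the devices that pass the check, in input order."""
    return [d.name for d in devices if is_usable_repeater(d)]


if __name__ == "__main__":
    devices = [
        Device("hall outlet", True, False, 12, True),
        Device("lamp bulb", True, True, 5, False),
        Device("door sensor", False, False, 3, False),
    ]
    print(nearby_repeaters(devices))
```

An empty list means there is nothing to relay for the sensor at that spot, which is the case where adding a plug-in device and healing is worth trying.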

If that fixes the reliability problem for that motion sensor, OK, you probably need to increase the ratio of repeaters to battery devices per room in your network. It might mean new devices, it might mean rearranging the ones you have. It might mean more effort/money than you’re willing to budget for a solution, in which case I’d go back to returning stuff.

If you can get a reproducible distance-related failure but adding a repeater and doing a heal doesn’t fix it, then you’ve likely got a local interference problem. Way harder to track down and fix.

(I know the OP is already familiar with what repeaters are, but for later readers who aren’t, also see the following repeater FAQ:

http://blog.smartthings.com/iot101/a-guide-to-wireless-range-repeaters/ )

  6. If everything worked perfectly in the device tests so far, turn back on the smartapps that did polling/refresh. If the motion sensor becomes unreliable now, it’s likely a traffic jam problem: you’ve overburdened the network with too much polling. Easy to do, easy to fix by reducing the polling requests.
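
To get a feel for how much polling you are asking for, a back-of-the-envelope tally helps. This is only arithmetic on numbers you supply; the device counts and intervals below are hypothetical, and I am not claiming any particular rate is the point where the network chokes.

```python
# Sketch: rough polling load, in requests per minute.
# Device counts and intervals are hypothetical inputs.

def polls_per_minute(jobs):
    """jobs: list of (device_count, interval_seconds), one per polling smartapp."""
    return sum(count * 60 / interval for count, interval in jobs)


if __name__ == "__main__":
    before = [(10, 60), (5, 30)]    # two polling smartapps as first set up
    after = [(10, 300), (5, 300)]   # same devices, polled every 5 minutes
    print(polls_per_minute(before), polls_per_minute(after))
```

Comparing the total before and after you cut back gives you something concrete to match against whether the motion sensor becomes reliable again.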

All of that was just to check the zigbee HA devices. You need to repeat the whole process for ZLL light bulbs and Zwave, as these three things don’t repeat for the others, only themselves. Also zwave doorlocks that use beaming need the last device before them to also support beaming.
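
Since each protocol only repeats for itself, it can help to count repeaters per mesh rather than per house. A minimal sketch, with a made-up device list where you mark each device as repeating or not based on its own documentation:

```python
# Sketch: count repeaters separately for each mesh, since zigbee HA,
# ZLL, and zwave devices only repeat for their own kind.
from collections import defaultdict


def mesh_summary(devices):
    """devices: list of (name, protocol, repeats) tuples.

    Returns {protocol: {"repeaters": n, "end_devices": m}}.
    """
    summary = defaultdict(lambda: {"repeaters": 0, "end_devices": 0})
    for name, protocol, repeats in devices:
        key = "repeaters" if repeats else "end_devices"
        summary[protocol][key] += 1
    return dict(summary)


def meshes_without_repeaters(devices):
    """Protocols that have devices but nothing to relay for them."""
    summary = mesh_summary(devices)
    return sorted(p for p, counts in summary.items()
                  if counts["repeaters"] == 0)


if __name__ == "__main__":
    # Hypothetical inventory.
    devices = [
        ("hall motion", "zigbee HA", False),
        ("smart outlet", "zigbee HA", True),
        ("lamp bulb", "ZLL", True),
        ("door lock", "zwave", False),
    ]
    print(mesh_summary(devices))
    print(meshes_without_repeaters(devices))
```

A house can look full of plug-in devices and still have one mesh with no repeaters at all, which is what the second function is meant to catch.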

But those are the typical diagnostic steps. In each case you only go to the next step when you have connection/control success with the previous one:

A) general RF issues as demonstrated by wifi without extenders
B) one room over connection using two different devices
C) controller substitution (philips bridge vs ST hub) for one room over test
D) distance to failure after network heal, followed by repeater addition and network heal
E) traffic check
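
The A to E ladder can be sketched as a simple ordered checklist that stops at the first step you have not passed yet. The step descriptions are paraphrased from the list above; the function name and the pass/fail dictionary are just my illustration.

```python
# Sketch: walk the diagnostic steps in order and report where to focus.
STEPS = [
    ("A", "general RF check: Wi-Fi survey without extenders"),
    ("B", "one-room-over connection using two different devices"),
    ("C", "controller substitution for the one-room-over test"),
    ("D", "distance to failure after heal, then repeater plus heal"),
    ("E", "traffic check with polling turned back on"),
]


def next_step_to_fix(results):
    """results: {step_letter: True/False} for the steps run so far.

    Returns (letter, description) of the first step that failed or has
    not been run yet, or None if every step passed.
    """
    for letter, description in STEPS:
        if not results.get(letter, False):
            return letter, description
    return None


if __name__ == "__main__":
    # Hypothetical progress: A and B passed, C failed.
    print(next_step_to_fix({"A": True, "B": True, "C": False}))
```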

Like I said, tedious, but pretty easy. We start with zigbee HA just because the heals are the easiest to do.

What next steps you take if you pinpoint a failure are up to you.

If you need an alternative network protocol, insteon over powerline gets around some common architectural issues, and WeMo over WiFi has possibilities although I’ve personally run into a high percentage of WeMo device failures, including one I bought this month.

Or you can wait for a viable bluetooth system.

Good luck, let us know what you decide, even if it’s just to stick with what you have.


Great write up JD!!!

Thanks for all the tips!

WiFi signal is pretty great in my house and even outside my property. Using the Netgear Nighthawk and the house isn’t too terribly large, so it manages to reach all corners of our land with a decent signal.

Maybe it is just a defective hub, or maybe it’s that it fails to communicate with SmartThings servers. I’m holding out on Hub V2 (and fingers crossed we don’t all get screwed on price) before I decide to just start selling all my ST specific sensors and try something else when I can afford it again.

In the Things menu, each bulb will work individually just fine… however when changing modes or turning on/off groups of lights is when 1 or 2 bulbs get left behind.

There’s a ST ZigBee motion sensor in every room of the house, and while those work they sometimes will fail… at least a couple times a day I’ll have to wave my hands around in a room like a crazy person. Sometimes the app says it detected motion (just no bulbs seem to care) and other times it won’t even report motion.

Really the most troubling things though are the presence sensors and the smart door lock. I’m convinced ST shouldn’t even be allowed to sell the presence sensors in their current state. Sitting in the same room as the Hub sometimes we’ll both mysteriously “leave” home and the door will lock, and seconds later we’ll arrive so the door unlocks. Sometimes in the middle of the night the front door decides it just wants to unlock and let the whole world in.

I’d say out of everything the lock functions the worst which is really scary. Automations sometimes work for the lock but what really only works <5% of the time is using the app to manually lock/unlock the door. It’ll just say “Locking…” for forever and never actually lock, or sometimes it’ll say “Locking…” and then “Locked” but never really did anything.

Really though, I’m hoping the V2 fixes this or I’ll just end up switching to something else. A smart security system should never make you more worried about your home’s security than if you didn’t have it.

My Nest and any non-ST controlled “smart device” I own all function without any problems at all. I’m pretty sure all my problems are either from this hub being junk or ST infrastructure not being capable of functioning. I know way way way more people with ST issues than people who have a functioning ST setup.

If this stuff could still be returned to ST I’d send it all back in a heartbeat.

OK, it’s very likely that your Nighthawk is killing your zigbee signals.

Yeah, I know: zigbee says wifi interference is no longer an issue with zigbee.

That’s low power WiFi. As soon as you amplify your WiFi, all the old interference issues come back in the 2.4 GHz range. If you have a 5 GHz signal only, no problem, but most residential devices are dual band, and you get amped 2.4 even if you aren’t using it.

I have a Netgear WiFi extender to push signal into a bricked basement. Works great. But it interferes with my ZLL light bulbs and they lose connection. We had to spend two days testing exactly where to position the WiFi booster to keep it from tromping on the zigbee mesh.

ST doesn’t let you change the hub’s zigbee channel so you’re stuck with whatever it came with, which reduces some options. You can try changing the WiFi channel to see if that helps.
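
If you can find out which zigbee channel your hub landed on (I am assuming you can), you can work out which Wi-Fi channels sit on top of it. Zigbee channels 11 to 26 are centered at 2405 MHz plus 5 MHz per channel, and 2.4 GHz Wi-Fi channels 1 to 13 are centered at 2407 MHz plus 5 MHz per channel. The 22 MHz Wi-Fi width and roughly 2 MHz zigbee width used below are nominal figures, and an amplified router can splatter beyond its nominal channel, so treat this as a first cut only.

```python
# Sketch: which 2.4 GHz Wi-Fi channels overlap a given zigbee channel.
ZIGBEE_WIDTH_MHZ = 2   # approximate width of a zigbee (802.15.4) channel
WIFI_WIDTH_MHZ = 22    # assumed nominal width of a 2.4 GHz Wi-Fi channel


def zigbee_center_mhz(channel):
    """Center frequency of zigbee channels 11-26."""
    if not 11 <= channel <= 26:
        raise ValueError("zigbee 2.4 GHz channels are 11-26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Center frequency of 2.4 GHz Wi-Fi channels 1-13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1-13 only")
    return 2407 + 5 * channel


def overlaps(wifi_channel, zigbee_channel):
    """True if the two nominal channel bands overlap."""
    gap = abs(wifi_center_mhz(wifi_channel) - zigbee_center_mhz(zigbee_channel))
    return gap < (WIFI_WIDTH_MHZ + ZIGBEE_WIDTH_MHZ) / 2


def clear_wifi_channels(zigbee_channel, candidates=(1, 6, 11)):
    """Wi-Fi channels from `candidates` that do not overlap the zigbee channel."""
    return [w for w in candidates if not overlaps(w, zigbee_channel)]


if __name__ == "__main__":
    # Hypothetical: the hub turned out to be on zigbee channel 14.
    print(clear_wifi_channels(14))
```

Since the hub’s zigbee channel is fixed, the Wi-Fi side is the only knob, and this narrows down which Wi-Fi channels are worth trying first.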

And make sure the WiFi router is at least 10 feet from the ST hub.

But presence sensor aside, your other zigbee devices might well run into the same issue regardless of the hub you’re using. Unfortunately.

I have a Nighthawk router and I don’t have any of these issues.

How far from the hub is your lock?

“All home automation is local.” If you’re not getting WiFi/zigbee interference, that’s excellent. Sometimes it’s just a channel issue, sometimes it’s a reflection point. Sometimes it’s really hard to fix in a particular building.

At my house, it’s super obvious. I have 7 bulbs on the Hue bridge, scattered through 5 rooms. Plug the WiFi extender in on the East wall, three bulbs lose connection. Move the extender to the West wall, those bulbs come back, two different bulbs lose connection. Moved the extender one room over on the South wall, all bulbs connect fine. Reproducible every time.

Good point on the lock. I had similar issues with a zwave lock until I moved a beaming repeater into a right angle position to help get the signal around a corner to the lock. Status reliability instantly went from “rarely right” to “almost always right.”

Door locks are a pain. If you read the fine print, most z-wave repeaters will not repeat the signal for the door lock because it is encrypted.

And a second issue is “beaming,” as well as encryption. Beaming is a power saving protocol. It requires the repeater to hold the message and try again really soon when the lock wakes up. Many zwave repeaters don’t hold the message in this way, which is why you need to check the fine print for both beaming and encryption to reach a door lock.
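
The fine print check boils down to two flags per repeater. A minimal sketch, where the device names and flag values are hypothetical and you fill them in from each product’s own spec sheet:

```python
# Sketch: which zwave repeaters can relay to a door lock, based on
# spec-sheet flags you enter yourself. Sample devices are hypothetical.

def can_relay_to_lock(device):
    """True only if the spec sheet lists both beaming and encryption.

    A missing flag is treated as unsupported.
    """
    return bool(device.get("beaming")) and bool(device.get("encryption"))


def lock_capable_repeaters(devices):
    """Names of devices that pass both checks, in input order."""
    return [d["name"] for d in devices if can_relay_to_lock(d)]


if __name__ == "__main__":
    devices = [
        {"name": "range extender", "beaming": True, "encryption": False},
        {"name": "smart switch", "beaming": True, "encryption": True},
        {"name": "old plug", "beaming": False},
    ]
    print(lock_capable_repeaters(devices))
```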

Edited to include @jodyalbritton’s correction that the fine print issue could be beaming or encryption.

Maybe, but here is the actual fine print for the aeotec repeater.

NOTE: The Aeotec Z-Wave range extender WILL NOT repeat the signal for encrypted devices such as door locks. To improve communication between your hub and a door lock, use a Gen5 smart switch or Z-Wave indoor siren instead.

Good point, it’s actually both beaming and encryption that are required. Thanks!

About 10 feet but on the other side of a wall…

Walls are really tricky. Water pipes inside the wall will greatly reduce signal strength, as will foil backed insulation. Some wallpapers block signal. Some drywall has mineral components that block signal. So what looks like a wood barrier in fact is a composite of many materials.

If it’s any of those physical issues, WiFi strength will also be reduced, so you can often diagnose it. But zigbee and zwave are very low power compared to WiFi, so the WiFi signal might be reduced but still get through while the zigbee signal dies inside the wall.

I would like it if SmartThings changed the software to display A LOT more information about the zwave and zigbee networks, like the routing tables, if devices are using encryption, generation of device and sdk protocol of devices.

Some other hubs do this.

I think it is very important to know if a device is using encryption or not, like for alarm sensor and barrier devices. It could be you just didn’t include the device correctly.
