Z-Wave Secure Join Failures and Other Misadventures

After a rather disastrous weekend with SmartThings, I wanted to share my experience with Z-Wave Secure Join. I’m posting this here because, after a completely fruitless discussion with support, I’ve begun to believe that the ST staff doesn’t think there’s a problem with their Z-Wave routines, despite all evidence to the contrary.

I received a new Schlage Connect deadbolt direct from the manufacturer, firmware version 7.1. From the beginning, SmartThings’s inclusion routines were problematic.

First, one thing that I learned the hard way is that the ST hub will lie to you. It will act as if an inclusion was successful even if secure join failed. For these deadbolts, the functionality is extremely limited if the secure join does not complete; you really can’t control the lock. A non-secure join is not a viable workaround. I’ve only found two ways to determine if secure join was successful:

  1. Look for the Security Class code listed in the device’s list of capabilities.
  2. Within the IDE select the hub, then select “Events” and look for the inclusion events. This log appears to go back only 200 entries, so be sure to check immediately after adding a device. Why these log entries aren’t visible in Live Logging I don’t know, but at least I now know that I have to check multiple places to figure out just what happened.
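
For anyone who wants to script the first check, here is a rough sketch. The Security command class ID is 0x98 in the Z-Wave spec, but the `cc:` field format and the sample strings in the usage note are my assumption about what a raw description looks like, so adjust the parsing to whatever your device page actually shows. The function names are made up for illustration.

```python
import re
from typing import Set

SECURITY_CC = 0x98  # Z-Wave Security command class ID


def command_classes(raw_description: str) -> Set[int]:
    """Pull command class IDs out of a 'cc:5E,72,98' style field.

    The field name and comma-separated hex layout are assumptions.
    """
    match = re.search(r"\bcc:([0-9A-Fa-f,]+)", raw_description)
    if match is None:
        return set()
    return {int(token, 16) for token in match.group(1).split(",") if token}


def lists_security_class(raw_description: str) -> bool:
    """True if the Security class code appears in the device's class list."""
    return SECURITY_CC in command_classes(raw_description)
```

With a made-up description such as `"type:4003 cc:5E,72,98,86"` this returns `True`, and with the `98` removed it returns `False`. It only tells you what the device page lists, so I would still check the hub's Events log as well.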

Instead of going through a narrative, let me just list all the problems I experienced. I was able to get past most of them by power cycling the hub between attempts. Crude, but of everything I tried this was the most consistent way to get a successful join.

  1. Brought the hub to within 10 feet of the new lock. Power cycled the hub and ensured it had mains power and a good network connection.
  2. Put the hub into inclusion mode. Sometimes this would work (meaning the green light would flash and there would be entries in the Events log) and sometimes it wouldn’t (nothing would happen).
  3. When the hub did go into inclusion mode, put the lock into inclusion mode. The first time I tried this, the hub said the device had been added, but the Events log showed the secure inclusion as failed.
  4. Power cycle the hub.
  5. Exclude the lock. Sometimes this would work and the lock would be removed. Sometimes it would not and I would have to try again.
  6. Power cycle the hub.
  7. After repeating steps 2-6 multiple times, I managed to get the lock included in the network with secure join showing as successful.
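
Written out as a loop, the routine above looks roughly like this. It is only a sketch of the manual process: the four callables are hypothetical stand-ins for things I did by hand (there is no hub API behind them that I know of), and the retry limits are arbitrary.

```python
from typing import Callable


def join_until_secure(
    power_cycle: Callable[[], None],
    include: Callable[[], bool],
    secure_join_ok: Callable[[], bool],
    exclude: Callable[[], bool],
    max_attempts: int = 5,
) -> int:
    """Repeat include / verify / exclude until the secure join checks out.

    Returns the attempt number that succeeded, or raises RuntimeError.
    All four callables are hypothetical placeholders for manual steps.
    """
    power_cycle()  # step 1: start from a freshly booted hub
    for attempt in range(1, max_attempts + 1):
        # Steps 2-3: the hub's "device added" message is not trusted,
        # so the secure join is verified separately (Events log).
        if include() and secure_join_ok():
            return attempt
        power_cycle()  # step 4
        for _ in range(3):  # step 5: exclusion may need several tries
            if exclude():
                break
        power_cycle()  # step 6
    raise RuntimeError(f"secure join still failing after {max_attempts} attempts")
```

The point of separating `include` from `secure_join_ok` is that the hub reporting success and the secure join actually succeeding are two different things.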

This was where I was at Saturday night. The lock was present, the lock manager smartapp was managing it, SmartThings had it in the room, I added it to some routines I have, things were good. Or so I thought.

I left Live Logging open, and sometime around 5am on Sunday morning the lock just disappeared. No errors. No log entries. Nothing. It was just gone. It was missing from the system entirely, as if it never existed. SmartThings support knew absolutely nothing and could provide no useful information.

Of course the lock still thought it was part of a z-wave network, so I had to do a general device exclusion to get it out. Then I had the honor of repeating the above steps AGAIN, running into all the same problems along the way. It remains to be seen whether the lock will actually stay, but if there’s a way to mess it up I’m sure the hub will find a way.

It gets even more bizarre. I have a secondary controller from another vendor just so I can get useful z-wave information. For a lot of the attempts that the hub said had failed, it allocated a z-wave device ID, and there were a bunch of these phantom devices in the system. So whatever ST is doing to the z-wave network seems to have some deficiencies in the routine that cause it to break in incoherent ways.
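
If you have a second controller, spotting the phantoms is just a set difference between the node IDs it sees and the ones the hub lists. A minimal sketch, assuming you copy both lists out by hand and that the hub shows its device network IDs as hex strings (that is an assumption on my part, and the function name is made up):

```python
from typing import Iterable, List


def phantom_nodes(controller_ids: Iterable[int], hub_ids_hex: Iterable[str]) -> List[int]:
    """Node IDs the secondary controller sees that the hub does not list."""
    hub_ids = {int(value, 16) for value in hub_ids_hex}
    return sorted(set(controller_ids) - hub_ids)
```

For example, `phantom_nodes([1, 2, 12, 13, 14], ["01", "02", "0E"])` leaves nodes 12 and 13 as the ones the hub allocated but never showed.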

It’s this instability that is moving me further away from SmartThings. I’ve had the hub coming up on one year in September and while there was a brief period of stability in the Spring it’s been getting progressively worse again. I can’t count on it handling a secure join correctly without double checking it, which makes me think I need to go back through and verify the other secure devices I have are showing up correctly.

Even worse is that now I can’t even count on a device that IS successfully added to remain connected to the system.

Support is terrible when they do respond, and when they don’t the excuse is “well Mondays is a busy day” as if that explains everything.

I do like the platform’s UI overall but after dealing with stability and z-wave join issues for this long I’m not sure I can continue any longer with this hub and maintain the peace in the household.

If anyone has any ideas on what I could be trying differently I’m all ears.

Does this sound familiar to you?

No, I haven’t seen this on my Aeon outlets; so far they seem to be working fine. I had one that would not report status or voltage and that I could not control manually through the app, but it worked just fine as part of a Smart Lighting routine.

About a month ago I tore down my entire ST and z-wave setup and rebuilt everything from scratch, and added the Aeon outlets with a secure join. That seemed to take care of the device that wasn’t reporting and wasn’t controllable. Maybe the z-wave join process is so bad in ST that you really must verify it manually before assuming that it was successful. (Or maybe it’s bad in general; if I’m being honest, I can’t say for sure that the z-wave standard itself isn’t the problem.)

You may have rebuilt before it broke?! I have sensors that work just fine, but they were joined a long time ago. I have thought about removing one of them and attempting to join it again now, just to prove my point, but I don’t feel that adventurous while they work. I will continue to monitor. But from what I see, more and more people are having issues with secure join.

Maybe @slagle can look into this deeper, if he finds some time…

It’s certainly possible. My original setup ran from September until July so I would have thought I’d see it but perhaps not. Yes it was a fun process running around rejoining everything, especially when you’d have a device fail and the hub would get stupid, so I would start all over from the beginning.

Even better was when I deleted my location: ST created the new one on a different shard but left the older devicetypes and smartapps on the first shard. So sometimes I would log in and see my old setup but without a hub or location, and other times I would see the new setup. I had to have them manually clear out the old stuff.

One thing that has helped me is that my secondary controller has the ability to fully optimize the network, which appears to be a more robust version of the ST’s “Repair Z-Wave Network” that takes a lot longer to run but seems to have better results. Being able to use that has helped with reliability when sending commands to devices, but of course doesn’t help at all with Secure Join.

I agree that Secure Join seems to be a terrible implementation in its current incarnation. It’s just not consistent - same conditions, same devices, same distance between hub and device, and you get different results doing the exact same task. That’s a big problem.

Just for clarification on the link @SBDOBRESCU posted: my Aeon outlets would work fine. It was just the devices ‘downstream’ of them that would randomly fail, including my Schlage door locks.

Ever since the 8/15 update, I’ve been having all kinds of problems. First it was a bunch of Zigbee devices and now my Z-wave devices are acting up. I have a Home Energy Monitor (HEM) that suddenly stopped reporting. I went through the usual process of resetting and repairing and it would start working, but then, inexplicably, it would stop working again after a few hours. I left it alone and then this morning it came ‘alive’ again and gave me a reading, but then immediately stopped again after just one report. When I try to do a Z-wave exclude, the hub never reports anything; it just keeps spinning and looking. I still have similar problems with a z-wave switch that refuses to connect, a Zigbee contact sensor that seems to work but doesn’t report anything to the hub, and motion sensors that suddenly stop working or get stuck on motion. All very frustrating.

Then, on the flipside, I have other devices that work just fine with no problems at all.

This worked for me.

How to fix a Schlage lock that will not pair. The lock will pair if it is too far away from the hub, but it will not work. It will say that there is a security violation. The following procedure must be done exactly to work. It took me more than a few hours to find what worked. The reset lock command will not reset the Z-Wave part of the lock; an exclude must be done. Sometimes removing the lock from the app will work, but not always.

The lock needs to be installed in the door. Do a reset by removing the battery, then holding down the Schlage button on the outside of the door while you reconnect the battery. The lock will blink twice. Enter a default unlock code as read from the back of the lock or the book. The lock will go through a calibration. Extend the bolt by pressing the Schlage button on the outside of the lock. Bring the hub within 1 foot of the lock (I used a long network cable). Go to the Z-Wave utility in the app. Issue a general exclude. Enter the programming code on the lock, then go to exclude mode by entering 0 on the lock. The light on the lock will blink faster. Magic will occur. Make sure that the green check mark appears. The app will say one device excluded.

It will pair now if the hub is within a foot of the lock. Mine have worked for a year with no problems.

I appreciate the point of view, but there should be little difference with distance at that scale, particularly when it comes to radio wave propagation. I measured, and I was actually closer to 5ft away, but if the difference between 1ft and 5ft with direct, unobstructed line-of-sight matters, then we’re talking about an extremely weak transceiver.

It might be my fault for burying the lede, but the real issue for me is not why secure join sucks so bad with SmartThings, although that is important. When you get down to it I was able to get it to successfully join after many attempts.

What really bothered me was the device disappearing 12 hours later at 5 in the morning when everyone was asleep. That shouldn’t happen ever.

The Z-Wave specification allows for what is called “whisper distance” for the initial exchange of the encryption key with a security device. It’s up to each manufacturer to implement this for their specific devices, but I think for almost all of them 5 feet would be treated very differently than 2 feet. That’s an intentional security measure.

A number of people have reported that the Schlage locks treat whisper distance a little bit differently than Yale and Kwikset.

If you attempt to pair a Yale lock outside of its whisper distance, the join will fail altogether and you won’t see the lock on your network.

If you attempt to pair a Schlage lock outside of its whisper distance, the secure join will fail, but sometimes the lock will go ahead and pair insecurely. That leads to the really frustrating situation where it shows up as a device on your network but won’t respond to any commands because it doesn’t have the encryption key. So you have to exclude it and try again.

With any Z-Wave lock there can be yet a different problem, which is that the secure join goes fine but there is no beaming repeater close enough to the lock to get the commands through when the lock is sleeping. But that’s an entirely separate issue.

Most Schlage lock problems at the time of initial pairing appear to be due to the whisper distance fail where the lock joins insecurely but then doesn’t have the encryption key. That’s why people say these locks are fussy about pairing. And why it’s almost always worth trying to get within 2 feet of the lock to do the pairing.

I agree absolutely that a device should not disappear after it has been successfully paired. I can imagine a situation where that might happen under the SmartThings architecture because of the fact that there are dual copies of the device lists, one in the cloud and one on the hub. But just because it might happen doesn’t mean it should. FWIW.

This is a good point, and very well could be a problem. I’ve seen insecure joins with other devices fail at five feet or less, so I’m not quite ready to chalk it up to a design feature.

Still doesn’t explain why a properly joined device disappeared the next morning.
