Open/Close sensor doesn't register open/close in correct order


#1

Hello Everyone,

This is my first post, and I saw a similar discussion on this issue but no solution. Thought I’d provide a bit more background on my issue. I have quite a few open/close sensors, and every once in a while when I open my door, it will obviously close again, and the server will receive the door closed signal before it receives the door open signal, so the history will show like this:

12:59 PM - Door Opened
12:58 PM - Door Closed
12:01 PM - Door Closed
12:01 PM - Door Opened

So now it will remain showing open…even though it is closed. We use our system for security purposes as well as home automation, so it’s disturbing that this system would not have simple checks like this to ensure the signals are not received in the incorrect order. Is there anything I can do to help this situation and prevent it from happening in the future?

Thanks for the help!

Ryan


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #2

Briefly, though we can discuss more details and workarounds, SmartThings has no way to control the order that messages are received. Devices have no internal synchronized clocks.

This is a good FAQ to start…

Also: Could you link / reference the discussion you already read? Thanks!

… Terry.


#3

Hi Terry, thanks for replying. Ironically I can’t find that thread now…but it literally didn’t provide any answers at all…the poster just said their open/close sensor wasn’t registering status updates.

I understand that the devices have no internal clocks, but I would expect a simple check to take place: if a sensor last registered closed, it is logical that the next status update would be open…not closed again. Also, my network is issue free, and yet SmartThings receives the status updates in a different order. I’d assume the hub received them in the correct order, and the network is hardwired at that point, so I just don’t understand the complication.
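
To make the "simple check" concrete, here is a minimal sketch in Python (not SmartThings Groovy, and not anything ST actually runs). The class name `ContactTracker`, the `late_window_s` setting and the 120-second default are all assumptions for illustration. It flags a report that repeats the current state, and treats the opposite report arriving shortly afterwards as the straggler rather than as a new event. It is a heuristic: it can only guess, and a message that is truly lost never shows up at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContactTracker:
    """Hypothetical tracker for one contact sensor's open/closed reports."""
    state: str = "closed"
    late_window_s: float = 120.0          # assumed wait time for a late report
    duplicate_at: Optional[float] = None  # arrival time of the last repeated state

    def report(self, new_state: str, arrived_s: float) -> str:
        if new_state not in ("open", "closed"):
            raise ValueError(f"unknown state: {new_state!r}")
        if new_state == self.state:
            # A contact sensor alternates, so a repeat means the report
            # in between is missing or still on its way.
            self.duplicate_at = arrived_s
            return "duplicate"
        if (self.duplicate_at is not None
                and arrived_s - self.duplicate_at <= self.late_window_s):
            # Opposite state right after a repeat: assume it is the
            # delayed report and keep the current state.
            self.duplicate_at = None
            return "late"
        self.duplicate_at = None
        self.state = new_state
        return "applied"


# The history from the first post, oldest first (arrival times in seconds).
tracker = ContactTracker()
for state, arrived_s in [("open", 0), ("closed", 10), ("closed", 3420), ("open", 3480)]:
    print(state, "->", tracker.report(state, arrived_s))
print("final state:", tracker.state)
```

With that history the door ends up shown as closed instead of stuck open, but only because the late "open" arrived inside the assumed window.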

I’ve seen this a lot with Smart Things since I’ve gotten it…sometimes the open/close status NEVER registers. Just two nights ago, we had someone arrive home, opened the door, closed it, and literally it never registered that door being opened. Touting this system as a candidate for home security does very little in giving me faith that it would work when I need it to when it literally doesn’t register when a door is opened. For paying $600 for a relatively simple setup I expect it to work properly. Of course, it’s too late to return everything now.

Any suggestions? Surely I’m not the only person that has this issue?


(Brian Smith) #4

What sensors do you have? Are the ones having the issue all the same model?


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #5

Nope… you cannot assume that, unfortunately.
To the best of my knowledge, as well as described in the referenced other Topic, the protocols in use do not have any mechanism of guaranteeing order of delivery (@JDRoberts describes it better in A Brief Note About Scheduling).

Of course, in a simple configuration, we could hope the behavior is deterministic, though from a computer science perspective, computers simply do not run on “hope”.

I would think what you are observing from a single sensor is improbable if everything is functioning optimally. So there is value in debugging your particular case – but keep in mind that the problem is never fully gone.


#6

Terry linked to my note on scheduling, which explains the general problem, but when the issue is two messages from the same sender, which sounds like your issue, then it pretty much has to be that the messages are taking two different paths to the hub. Or that you’re losing a message in transit and it has to be resent.

(In a mesh network, messages are “issue once, try many,” so the device will try different paths or a total resend if needed. This can cause messages to arrive out of sequence.)

The two easy things you can do are to increase the signal strength and route options by adding a repeater device near the door, and to “heal” the network to make sure the routing table is up to date.

Battery powered devices can’t act as repeaters, so you need to plug in the repeater. But pretty much any plugged in device will work as long as it matches the protocol of the one you want to improve. Zwave for zwave, zigbee for zigbee.

Once you’ve plugged in the repeater, it’s best to heal the network to repair the routing table.

REBUILDING THE ROUTING TABLE

Ideally, unplug the hub for at least 15 minutes to repair the zigbee mesh. Then restart the hub and use the zwave utilities to heal the zwave.

If you have over 50 devices and/or a lot of battery powered devices, some may be asleep when you do the repair. This has nothing to do with ST, it’s just how mesh protocols work.

So a lot of field techs will do a heal, wait 15 minutes for the table updates to propagate, then repeat the heal (you don’t need to reboot), wait another 15 minutes after the log shows the heal finished, then repeat the heal a third time, wait another 15 minutes. That should mean the entire routing table gets rebuilt including all the battery operated devices.

(Some field techs then unplug the hub and wait another 15 minutes and reboot, but I personally think that last step is just superstition. Can’t hurt, but I don’t think it helps either.)

So include the repeater, unplug hub for 15 minutes, reboot, heal, wait at least 15 minutes after log says the heal finished, heal again, again wait 15 minutes after log says heal completed, heal a third time.

You may never know exactly what was wrong, but this can greatly improve sequencing from a single device. The repeater improves signal strength and adds a new routing path at the same time.

(There’s another way to approach this that involves intentional bottlenecks to try to force the messages to at least jam up on the same route, but I can’t recommend that. If you’re that desperate, better to just hardwire a direct connection.)

There’s also the possibility that there’s intermittent local interference causing some messages to get lost and have to be resent. But adding a repeater to increase the signal strength often solves that as well before you even get around to testing for it.

So if you can live with typical mesh issues, I’d add a repeater, rebuild the routing table, and reassess.

If you absolutely have to have almost perfect sequencing, I’d consider adding some hardwired devices.


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #7

JD: Do most (any?) sensors allow you to request a “refresh” (and retransmit) of the most current status? This could be used in situations where timing issues have been frequent to increase the confidence level in the current status (i.e., if it says “open” twice in a row, then it is “probably”, but still not 100%, really “open”).


#8

I need a new “headache” icon for “polling”! Aaaaaaagh! :scream:

Battery operated sensors are intended to be dumb, low power draw, low network traffic, and pretty cheap.

Also…we assume all battery powered devices will be temporarily unavailable some of the time, if only to have the batteries changed. So the network intentionally doesn’t panic if some messages get lost.

Mesh networks like zigbee and zwave were designed to take advantage of those factors. Software utilizing those networks needs to run lean and terse to keep from undoing the dumb, low power, low traffic, resiliency advantages of those devices.

So the short answer is No. If you need excellent synchronization, certainty about which device is or isn’t online, and longer packet size, you use a hardwired star or tree network and avoid any battery powered devices. And pay a lot more in dollars and energy use. Or you can get most of that for about half the cost but with less reliability with a WiFi star network (and very few battery powered devices). The WiFi network will still be much more expensive than the mesh.

But mesh is mesh. Not intended for forced sequencing.

Edited to add:

What you can do quite cheaply and retain the advantages of mesh topology is “zone” sensing, where you aim multiple sensors at the same area and let the “big brain” device that does real processing make decisions about what’s going on based on input from multiple sensors. This is how most professional security companies deploy outdoor motion sensors, for example. Remember, the whole premise is that the sensors are relatively cheap. But that takes some serious tactical deployment planning to get just what you want.
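
As a rough illustration of the “zone” idea, here is a small Python sketch of the decision the “big brain” device might make. The function name `zone_state`, the sensor names and the quorum of two are made up for the example, and real security panels use more elaborate rules. The point is only that no single cheap sensor gets to decide on its own.

```python
from collections import Counter
from typing import Mapping


def zone_state(reports: Mapping[str, str], quorum: int = 2) -> str:
    """Combine the latest report from each sensor in a zone.

    Returns the state that at least `quorum` sensors agree on, or
    "uncertain" when there is no clear winner.
    """
    ranked = Counter(reports.values()).most_common(2)
    if not ranked:
        return "uncertain"          # no sensor has reported yet
    top_state, top_votes = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    if top_votes < quorum or tied:
        return "uncertain"
    return top_state


# Three hypothetical motion sensors aimed at the same driveway.
print(zone_state({"motion_a": "active", "motion_b": "active", "motion_c": "inactive"}))
print(zone_state({"motion_a": "active"}))
```

One late or lost report from a single sensor then changes the zone result only if the other sensors disagree too.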


(Gary D) #9

I can’t help but read this and think that the issue is being over complicated. I’ve seen the exact same issue myself with every single one of my contact sensors, while pairing/testing them about 4 feet (open air) from the hub. Sometimes, ST just “posts” the reports in the wrong order. I can see the contact sensor (via its LED indicator) registers the “open” and “close” events for itself, and I have no doubt that it’s sending the events as they happen.

With all the slowdowns and delays that ST has been experiencing in the past few weeks, I think this is more likely to be a ST issue and not an issue of zigbee/z-wave routing, etc.

Consider this: ST most likely has multiple servers handling incoming z-wave messages from hubs. (If they don’t, it would explain all kinds of other issues.) They probably load balance or round-robin the packets. The hub sees an “open” event from the contact sensor. It sends it to ST. ST routes it to server A. Then the hub sees a “close” event… and sends it to ST. ST routes that one to server B. If server A is going a bit slow, it’s VERY possible that the events are being handled in the wrong order.
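
A toy Python model of that scenario, purely to show the mechanism. The server names, the delays and the function `processing_order` are invented; nothing here is known about how ST's back end is actually built.

```python
from typing import Dict, List, Tuple


def processing_order(events: List[Tuple[float, str, str]],
                     server_delay_s: Dict[str, float]) -> List[str]:
    """Return event names in the order the servers finish handling them.

    events: (time the hub sent it, event name, server it was routed to)
    server_delay_s: assumed processing delay of each server, in seconds
    """
    finished = [(sent_s + server_delay_s[server], name)
                for sent_s, name, server in events]
    return [name for _, name in sorted(finished)]


# The hub sends "open", then "closed" two seconds later, to different servers.
events = [(0.0, "open", "A"), (2.0, "closed", "B")]

print(processing_order(events, {"A": 0.1, "B": 0.1}))  # both servers healthy
print(processing_order(events, {"A": 5.0, "B": 0.1}))  # server A running slow
```

When server A takes longer than the gap between the two events, "closed" is handled first even though the hub sent the events in the right order.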

Perhaps @Ben or someone can chime in on this possibility. I’ve seen it myself quite frequently, and when the sensor is just a few feet from the hub (both paired from that distance and tested from that distance.)

Gary


#10

Gary,

It’s always possible. Same issue, just happening in the ST cloud rather than the house.

Other ST cloud issues make it very hard to test anything right now.


#11

Thanks everyone for your responses. I really appreciate the feedback. I will try JD’s suggestion and move the router/hub to be closer (They are within “Excellent” range as it is, so not sure if this is the issue)…I will do so to rule it out naturally, but I was wondering if something like Gary said is in fact the problem I’m running into.

I’ll post back after I move the router and update the routing table. Thanks so much!


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #12

I agree that there is definitely a possibility this is happening at the back-end.

Two questions arise:

  1. Regardless of whether or not the behavior is undesirable, is it or is it not “to spec”? I thought I read in the Docs that sequence / timing is not guaranteed, but can’t find that specific reference. It is hinted at, however:

    http://docs.smartthings.com/en/latest/introduction/smartthings-concepts.html

    So this is “barely” relevant (or, mostly irrelevant).

If you needed to know if a command was executed, you can subscribe to an event triggered by the command you executed and check its timestamp to ensure it fired after you told it to. You will, however, still have latency issues to take into consideration, so it’s impossible to know the exact current status at any given time.

  2. Since the behavior is undesirable, can it be fixed by timestamping messages at the Hub before they are sent to the load balancer?
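
A sketch of what I mean, in Python, using a per-hub sequence number instead of a clock time (a counter avoids clock-sync questions). The classes `Hub` and `CloudState` and the message layout are hypothetical, not ST's API. Note the limit: this only preserves the order in which the hub saw the messages. It says nothing about the order in which the sensor sent them.

```python
import itertools
from typing import Dict, Optional, Tuple


class Hub:
    """Hypothetical hub that stamps each message as it arrives."""

    def __init__(self) -> None:
        self._seq = itertools.count(1)

    def stamp(self, device: str, state: str) -> dict:
        return {"seq": next(self._seq), "device": device, "state": state}


class CloudState:
    """Hypothetical back end that ignores messages older than what it has."""

    def __init__(self) -> None:
        self.latest: Dict[str, Tuple[int, Optional[str]]] = {}

    def apply(self, msg: dict) -> bool:
        seen_seq, _ = self.latest.get(msg["device"], (0, None))
        if msg["seq"] <= seen_seq:
            return False  # stale: a newer report was already applied
        self.latest[msg["device"]] = (msg["seq"], msg["state"])
        return True


hub = Hub()
opened = hub.stamp("front door", "open")
closed = hub.stamp("front door", "closed")

cloud = CloudState()
print(cloud.apply(closed))  # arrives first because the other server was slow
print(cloud.apply(opened))  # arrives late and is ignored
print(cloud.latest["front door"][1])
```

The late "open" would still belong in the history log, but it would no longer overwrite the current status.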

#13

You might be thinking of the Scheduling discussion at the bottom of this page in the docs:

http://docs.smartthings.com/en/latest/smartapp-developers-guide/scheduling.html

Short answer: probably not. The time the message arrives at the hub doesn’t tell you the time the message was sent from the sensor. (Just as lights blinking on the sensor don’t tell you the time it will arrive at the hub.)

The same problem can occur in the house or the cloud.

Longer answer: strict sequencing is not what mesh networks are designed for. If you need that, use WiFi or hardwire.


(Gary D) #14

Perhaps with v2… (you know… the hub… of course you knew that, but I had to type more characters…)


#15

It’s been announced that the v2 hub will have Bluetooth capability, but we don’t know which Bluetooth. If it’s only for Bluetooth mesh, that’s still a mesh network, and all the same sequencing issues apply.

But IF–and I emphasize that this hasn’t been announced, so we don’t know–ST supports regular BLE, that’s a point to point topology. Not mesh. So your phone or remote or hub directly tells the light to go off, it goes off. No passing messages around, nothing gets out of synch.

The main problem with BLE is it has a very limited range, about 30 feet. Bluetooth mesh extends the range, but we go back to passing messages.

Still…there may be some options for secondary controllers specifically for security options, which could enforce sequencing for certain specific devices, even if it takes the hub a little while to catch up.

We’ll just have to see what the future brings. :blush:


(Gary D) #16

Huh? I’m not sure what you’re responding to, but you quoted me suggesting that v2 will timestamp events at the hub before they are sent to a server (with the added implication that it wouldn’t matter as much as the v2 hub should be handling the processing itself.)

I’m not sure how that relates to bluetooth of any flavor.


#17

It relates because you have to get the status messages from the devices to the hub before they can be time stamped. And action requests at the hub, time stamped or not, still have to get back to the individual devices to be performed.

Currently the network topologies that ST mostly uses between the devices and the hub are both mesh–zigbee and zwave. Mesh networks just aren’t designed for situations that require sequencing events in a specific order. So the hub time stamping stuff that’s coming to or from devices via zigbee or zwave doesn’t solve the problem, because the messages in both directions are bouncing around the network before arriving. The Closed status could still arrive at the hub before the Open status.

However, Bluetooth is not usually a mesh network. (Neither is WiFi.) Bluetooth is point to point, two devices talking directly to each other, not passing messages around first.

So IF the v2 hub provides standard Bluetooth support, then it becomes very easy to sequence action requests, including status checks. But only for devices within about 25 to 30 feet of the hub.

The kicker is that Bluetooth mesh has just been announced and approved by the Bluetooth SIG. And the first device available using it will be a smart bulb from Samsung. Bluetooth mesh would let you cover a whole house, just like zigbee or zwave–but at the cost of doing so with a mesh network, not point to point.

So maybe the Bluetooth antenna announced for v2 hub is really meant to support a mesh network of Samsung bulbs and similar devices from other manufacturers.

We just don’t know yet.

But traditional Bluetooth represents a network topology where it’s very easy to sequence events. (Events include status notifications from sensors.) Mesh networks by design make it very difficult to sequence individual events without giving up the benefits which are the reasons you deploy mesh in the first place.

(The features of a mesh network: inexpensive, low energy, pretty dumb devices sending very few, very brief messages and a network that doesn’t panic if any particular node is temporarily unavailable. That leads to some important advantages for home automation: Batteries last a long time, devices are relatively cheap and run cool, the network is resilient and uncluttered.)

It’s easy in a mesh network to make things happen at about the same time, but hard to force the exact order.

Point to point Bluetooth, which may arrive with the V2 hub, makes it easy to force an exact order, but only for devices very close together.

So that’s why the V2 hub’s Bluetooth may be relevant for sequencing.

Hope that helps.


(Bruce) #18

This makes me wonder about what, if anything, we can count on vis a vis sequencing of events or commands over mesh networks.

Example: An app sends a setLevel command to a z-wave dimmer, and then x milliseconds later sends an off command to the same dimmer. Is there a value for x at which we can be certain of the order the commands are delivered to the device? Is there anything we can do to minimize x?

My suspicion is that no matter how large we make x, we can’t count on the sequence not failing (device receives off before setLevel).


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #19

I wonder if there is another “layer” in the networking stack that could help improve the sequencing issue (e.g., like TCP layers over something more broadcast like UDP: TCP provides delivery guarantees, UDP does not; yet both have valid use cases).

There are thousands of papers and patents on mesh networking out there – The problem we are observing has most certainly been analyzed in great detail.

I think, at the most basic, that this is a probability cloud scenario. The larger the value of x that you wait between messages, then the higher the probability is that the messages will arrive in the sent sequence. Indeed then, what minimum value of x will ensure 99% probability; what about 99.999%?
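
To put rough numbers on that, here is a back-of-the-envelope Python sketch under a strong assumption: each message's delivery delay is independent and exponentially distributed with the same mean. That is a textbook simplification, not a measured property of zigbee or z-wave, and it ignores lost messages and retries. Under it, the chance that two messages sent x seconds apart arrive in order is 1 - 0.5 * exp(-x / mean). The function names are made up for the example.

```python
import math


def in_order_probability(x_s: float, mean_delay_s: float) -> float:
    """P(first message arrives before the second) when sent x_s apart.

    Assumes independent exponential delays with mean mean_delay_s.
    """
    return 1.0 - 0.5 * math.exp(-x_s / mean_delay_s)


def gap_for_probability(p: float, mean_delay_s: float) -> float:
    """Smallest gap x that gives in-order probability p under the same model."""
    if not 0.5 <= p < 1.0:
        raise ValueError("p must be in [0.5, 1)")
    return mean_delay_s * math.log(0.5 / (1.0 - p))


mean_delay_s = 0.2  # assumed average delivery delay, in seconds
for p in (0.99, 0.99999):
    print(p, round(gap_for_probability(p, mean_delay_s), 3))
```

In this model 99% needs a gap of about 3.9 times the mean delay and 99.999% about 10.8 times, and no finite x ever reaches 100%.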


#20

It’s not a problem, it’s a feature. :wink:

Mesh networks are not intended for any application which requires strict sequencing. That is not their purpose, and indeed would defeat their primary purpose which is to provide inexpensive low power draw network resiliency in an installation where almost any device may be temporarily unavailable. There are no critical nodes except the hub. That’s the point of mesh. The message will probably get through regardless of which nodes have gone offline. Cheaply.

Critical sequencing requires one of three solutions: a) smarter end nodes (more cost, more power draw), b) bottlenecks (less resiliency), or c) (wait for it…) a different topology. Not mesh.

If you need critical sequencing, don’t use a mesh network. It really is that simple.