Eliminating gratuitous network delays


#1

Short story: To avoid the overhead of opening an outbound TCP connection, a device handler is attempting to send UDP. The statement

socket = new DatagramSocket()
results in an error
java.lang.SecurityException: Creating new class java.net.DatagramSocket is not allowed

Of course, I’m not trying to create a new class, only a new instance of an existing class. The documentation has a general note “SmartThings allows only certain classes to be used in the sandbox. A SecurityException will be thrown if using a class that is not allowed.” but I don’t see why something as innocuous as UDP packets would be prohibited, nor why the error message would be misleading.

How can I eliminate or work around this error?

In detail:

I’m new to ST and am starting with only legacy devices. First app controls room lights with various motion detectors as inputs. Response time was unacceptably slow. Much of the delay was caused by overhead network activity. Here is a Wireshark capture taken on my bridge (an old netbook), before performance improvements:

As you can see, the bridge starts to send the motion alert at :02.110 (pkt. 5242) and doesn’t get the resulting command until :02.894 (pkt 5296), a delay of 784 milliseconds. However, the ST’s actual response time is only from pkt. 5264 to pkt. 5293, about 309 ms (incl. ~100 ms network time), i.e. there is 475 ms overhead! One of my locations (Bangkok) is ~300 ms ping time from Amazon Virginia, so eliminating this overhead is essential.
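For anyone who wants to check the arithmetic, here is a small Python sketch. The variable names are mine; the timestamps and the ~309 ms platform figure are simply the values quoted above from the capture.

```python
# Timestamps (seconds within the minute) quoted from the capture above.
alert_start = 2.110   # pkt 5242: bridge starts sending the motion alert
command_rx = 2.894    # pkt 5296: bridge receives the resulting command

# Platform response time quoted above (pkt 5264 to pkt 5293), in milliseconds.
st_response_ms = 309

# Total delay seen by the bridge, and the part not spent waiting on the platform.
total_ms = round((command_rx - alert_start) * 1000)
overhead_ms = total_ms - st_response_ms

print(f"total delay: {total_ms} ms")
print(f"overhead:    {overhead_ms} ms")
```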

When presenting status changes, I was able to eliminate the DNS/TCP/TLS overhead by having the bridge open an SSL connection to graph.api.smartthings.com at initialization and keeping it alive by sending a “HEAD /” request once per minute. Then, when motion is detected, it can send the application data immediately. The server seems to accept pipelined requests, so this works even if a keep-alive was just sent and not yet acknowledged. I’ll provide details and/or code fragments (in Perl) if anyone is interested.
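A minimal sketch of the keep-alive idea, in Python rather than the Perl actually used on the bridge. The class and function names (`WarmConnection`, `head_request`, `open_tls`) are hypothetical, and the sketch leaves out reading responses, reconnecting after a drop, and authentication, all of which a real bridge would need.

```python
import socket
import ssl
import time

HOST = "graph.api.smartthings.com"   # host named in the post
KEEPALIVE_INTERVAL = 60              # seconds; "once per minute" per the post


def head_request(host):
    """Build the keep-alive request: HEAD / on a persistent connection."""
    return (
        "HEAD / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: keep-alive\r\n"
        "\r\n"
    ).encode("ascii")


def open_tls(host, port=443):
    """Pay the DNS, TCP and TLS costs once, at initialization."""
    raw = socket.create_connection((host, port))
    return ssl.create_default_context().wrap_socket(raw, server_hostname=host)


class WarmConnection:
    """Holds one open connection and refreshes it with periodic HEAD requests."""

    def __init__(self, host, connect=open_tls, clock=time.monotonic):
        self.host = host
        self.clock = clock
        self.sock = connect(host)
        self.last_sent = clock()

    def keepalive_if_due(self):
        """Send HEAD / if nothing has been sent for a full interval."""
        if self.clock() - self.last_sent >= KEEPALIVE_INTERVAL:
            self.send(head_request(self.host))
            return True
        return False

    def send(self, request_bytes):
        """Write a request immediately; no new handshake is needed."""
        self.sock.sendall(request_bytes)
        self.last_sent = self.clock()
```

The bridge would create `WarmConnection(HOST)` at startup, call `keepalive_if_due()` from its main loop, and call `send()` with the application request as soon as motion is detected.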

However, I’m stumped trying to get a command sent without delay. It may be possible to have an API endpoint handler “hang”; when a command needs to be sent the handler would return it as “response” information. However, I don’t know how to do the control flow, or if it’s even possible with ST. I also thought about encoding the command in a DNS request, but it’s not clear how to keep the NS record for the subdomain cached at ST, given its propensity to use many servers for outbound requests.
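To make the “hanging” handler idea concrete, here is a generic long-polling sketch in Python. It is not SmartThings code, and whether an ST endpoint handler is allowed to block like this is exactly the open question; the names `pending`, `hanging_handler` and `send_command` are hypothetical.

```python
import queue

pending = queue.Queue()   # commands waiting to be delivered to the bridge


def hanging_handler(timeout_s=30.0):
    """Block until a command is available, then return it as the response.

    Returns None on timeout so the client can re-issue the request at once.
    """
    try:
        return pending.get(timeout=timeout_s)
    except queue.Empty:
        return None


def send_command(command):
    """Called when the app decides a command must go out to the bridge."""
    pending.put(command)
```

The bridge would always keep one such request outstanding, so a command travels back on an already-open connection and costs only one network propagation delay.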

All suggestions welcome. Thanks.


(sidjohn1) #2

This thread should point you in the right direction. It won’t kill the delay, but it should help you format a socket connection.


#3

[quote=“sidjohn1, post:2, topic:19679”]
This thread should point you in the right direction …
[/quote]Thanks for the link. It’s not directly applicable to my situation, as I don’t presently have a Hub. However, I’m guessing that the Hub continuously maintains an open SSL connection to the ST platform, capable of bidirectional communication. Is that correct? If so, then using sendHubCommand() would likely eliminate the delay; the TCP opening handshake would only be between Hub and device, an insignificant (less than 1 ms) delay.

But that raises the question: Is the protocol between Hub and ST cloud public, and is independent use permitted, i.e. can I write my own “virtual Hub”? I presently own no Z-Wave equipment and my ZigBee devices already have gateways – a Hub seemed superfluous. Adding a physical Hub entails additional expense, power consumption, complexity, etc. And, given the major differences between “LAN” and “Cloud” device programming, it would likely add an additional point of failure – it would be quite a challenge to write everything to use the Hub when available, but remain fully functional (with reduced performance) if the Hub went down.


(sidjohn1) #4

Stewart, I know this may be hard to believe, but sendHubCommand() does actually require a hub to work. If you’re trying to get some cloud-to-cloud stuff working w/o another hub, you may want to look into Xively or IFTTT. If you have multiple ZigBee ecosystems, you may be able to consolidate to one hub with SmartThings if the ZigBee devices are supported; if not, you can write your own device type and contribute to the long list of devices SmartThings supports.


#5

[quote=“sidjohn1, post:4, topic:19679”]… the sendHubCommand() does actually require a hub to work.
[/quote]That may be true, though I’d like to know why. If the Hub’s radios are not being used, it’s just a LAN-connected computer bridging one TCP/IP-based protocol to another. Why can’t I simply include those functions in a computer that is already in the system (and required – the Hub lacks USB, RS-232, X-10, etc.)?

I can think of three reasons: 1. SmartThings prohibits it. I wouldn’t like that, but would respect it and plan accordingly. 2. It’s too hard to find out how. For example, in the absence of documentation, if the Hub verifies the cloud’s certificate (and/or vice-versa) and the device can’t easily be “rooted”, it would be very difficult to capture and analyze the Hub-cloud communication. 3. Implementing the protocol would take more effort than I care to expend.

Regarding cloud-to-cloud stuff, I’m satisfied with the presently available APIs. If it takes an extra few seconds to notify me that the PBX is down, or if remote temperature readings are delayed by a few seconds, that doesn’t matter at all. However, fast responses to local sensors are important. If entering a room should turn the lights on, it’s frustrating to get to the middle of the room before they come on. My first attempt was awful – more than 2.5 seconds. See spreadsheet below:

Guided by this analysis, various improvements have reduced the time to ~1.25 seconds. To get under one second, I somehow have to get a command from the ST cloud to a local computer with only one network propagation delay. Any ideas?

Thanks.


(Scott) #6

Wouldn’t the answer be to process locally and cut out the cloud? Kind of like what’s coming in the V2 hub?


(Geko) #7

Yes, I believe it’s correct.

[quote]
Is the protocol between Hub and ST cloud public, and is independent use permitted, i.e. can I write my own “virtual Hub”?
[/quote]

No, it’s not public, and no, you can’t.

[quote]
Why can’t I simply include those functions in a computer that is already in the system (and required – the Hub lacks USB, RS-232, X-10, etc.)?
[/quote]

I guess because SmartThings is a business and has to make money somehow.

[quote]
To get under one second, I somehow have to get a command from the ST cloud to a local computer with only one network propagation delay. Any ideas?
[/quote]

Use an IoT platform that offers WebSockets or a pub/sub protocol like MQTT. Or, if you run a local server anyway, handle all events locally.


(sidjohn1) #8

I guess w/o the hub you can’t see that I get sub-second responses in my house. I open a door and the light turns on almost as fast as if it were hard-wired. Yes, on occasion there is some lag, but for the most part the responses are very quick and you’d never know it was processed in the cloud.

Seriously, think about this critically: if a 1-2.5 second delay in response to actions in the home were normal, the forum would be lit up like a Las Vegas night with complaints about slow response. To say SmartThings is slow without having a hub is like climbing a hill and claiming the world is flat. You’re not seeing the complete picture.

It sounds like, from the tools you are comfortable with, you’ve already spent way more than $99 of your time trying to develop a solution without a hub. You may find it more cost-effective to buy a hub and try it out for 30 days to see if SmartThings is for you. You can always return it if it doesn’t work out.


(Scott) #9

I for one like having the speed of my home automation determined by how many of my neighbors happen to be watching Netflix :smile:


#10

Many thanks for all the replies.

Sure, local control is best. As soon as V2 is available, I’ll buy one and if it works out, three more. I’d love to be a beta tester. I hope that V2 will connect to a wider variety of devices, without requiring another computer in the path. For starters, it should have a USB port supporting common devices such as serial ports, weather stations, X10 interfaces, IR send/receive, etc. IMO multiple USBs are unnecessary; USB hubs cost almost nothing and provide flexibility of device placement. It should be able to scrape web pages and post forms to almost any LAN device with an embedded web server. Speaking SSH and telnet would allow managing low level devices. Interface with common security panels and sensors would be nice.

Well, I enjoy doing this stuff, and it may help slow the decline of my 70+ year old brain. :smile:
Though I’d be willing to pony up $400 for hubs in four places, if the ‘right’ solution is hubless, I’d rather pay a fair price for a subscription to the cloud-based platform.

I indeed had noticed that. There are a few complaints, but they all lacked quantitative data (other than total response time), so I didn’t take them seriously. I now know that typical ST response time (at the cloud) is only ~200 ms. Of course, there is also some network delay, but I’m confident that ST will soon deploy in other Amazon regions. If they hit US West, EU and Singapore, I’ll have less than 50 ms everywhere. 250 ms response time is acceptable to me for nearly all applications.

One compelling aspect of hubless operation is the ability to implement things remotely. For example, our winter apartment in Bangkok is normally occupied December through February, but we sometimes run the air conditioner in summer, to minimize damage. Present control is via a crude command line script, which I plan to replace with an ST app. If a hub were needed, it could be a logistical nightmare. Asking a neighbor to install it wouldn’t be a big deal – just plug into an available Ethernet switch port and a power outlet. But suppose it gets hung up in customs on regulatory or other bureaucratic nonsense, and she needs to “tip” someone to get it out. That is more hassle than I’d ask someone to put forth.


#11

[quote=“3one5, post:9, topic:19679, full:true”]
I for one like having the speed of my home automation determined by how many of my neighbors happen to be watching Netflix
[/quote]You shouldn’t see such an effect, even if your cable node (or other shared ISP infrastructure) is grossly oversubscribed.

I have IP phones in several locations. I don’t experience any voice quality issues, even during heavy traffic hours. It only takes ~30 ms jitter or a few tenths of 1% packet loss to noticeably degrade IP voice; this would not significantly affect a HA application.

Imagine a cable node serving 300 homes, 75 on each of four groups of four bonded channels. Each channel has a useful data rate of 30 Mbps, 480 Mbps total, which equals 1.6 Mbps per home. Further imagine that they all try to stream Netflix at the same time. Of course, it will re-buffer several times then fall back to standard def. The customers will not be happy. But, this won’t significantly degrade SmartThings. Further suppose that when ST sends you a packet, by Murphy’s law all 74 other users on your channel are ahead of you with maximum length TCP packets in the queue. Well, at 120 Mbps for the bonded group, a 12-kilobit packet takes only 100 microseconds to send. In 7.4 milliseconds, you get your turn. That’s not a noticeable delay.
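The numbers in that scenario can be reproduced with a few lines of Python. This is just a sketch of the arithmetic; the variable names are mine, and the inputs are the hypothetical cable node described above, not measurements.

```python
# Hypothetical cable node from the paragraph above.
homes = 300
groups = 4                 # groups of bonded channels
channels_per_group = 4
channel_mbps = 30          # useful data rate per channel

total_mbps = groups * channels_per_group * channel_mbps   # whole node
per_home_mbps = total_mbps / homes                        # fair share per home
group_mbps = channels_per_group * channel_mbps            # one bonded group
users_per_group = homes // groups

# A maximum-length packet of about 1500 bytes is 12 kilobits.
packet_bits = 12_000
packet_time_us = packet_bits / group_mbps   # bits / (Mbit/s) gives microseconds

# Worst case: every other user on the group has one such packet queued first.
worst_case_wait_ms = (users_per_group - 1) * packet_time_us / 1000

print(f"node capacity:   {total_mbps} Mbps")
print(f"per home:        {per_home_mbps} Mbps")
print(f"packet time:     {packet_time_us} microseconds")
print(f"worst-case wait: {worst_case_wait_ms} ms")
```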

Who is your ISP? Do you have issues with VoIP, gaming or other time-sensitive applications?


(Patrick Stuart [@pstuart]) #12

But with even just 1% packet loss and the tiny packets that HA is constantly sending, a lost packet is simply gone, and it could be a device status update or a request to turn something on or off.

Packet loss is a serious enemy to HA and IoT. They never built any type of ECC into it, so it’s send and forget. There is no transmission retry in IoT.

I am convinced that most of the random misfires and instability in cloud-based platforms that relay to local hubs are related to ISP or home networking issues.

Most people don’t know that they are double-NATed, that their MTU size is wrong, that they are using Apple AirPorts, etc.

ST could do a better job of testing for these conditions and warning people that packet loss or fragmentation of small packets can cause major issues with HA.

You might not notice it in Netflix or web surfing.


(Scott) #13

Sorry, I said this in jest. I have not seen any issues with delay in my system, even though my network connection becomes heavily congested in the evenings. Even though I have not seen issues, I look forward to having the local processing capability in case I were to lose Internet connectivity.