Edge Driver HTTP Close when Call HTTP Request

I understand. The team mentioned that fragmentation of the packet shouldn’t cause the issue, and since you could replicate it locally, it is likely something else, as you mentioned. They also suggested another test:

  • Split the HTTP head into two messages manually to see the behavior, just make sure the CRLFs are in place, for example:
local socket = require "cosock.socket"
local s = socket.tcp()
s:connect("0.0.0.0", 8080)
s:send("GET / HTTP/1.1\r\n")
s:send("Host: 0.0.0.0:8080\r\n")
s:send("\r\n")

Thank you for providing more info that can help

Thanks Nayely. If the team says packet fragments are not a problem, then I won’t pursue that path. I believe! I’ve switched back to using socket.http in my driver since trying to do everything with TCP sockets was turning into a big chore!

I’ll have to keep trying different things to see if I can figure out what the problem could be.

Thanks for your patience @TAustin. I think the distinction right now is that how the bytes of the HTTP request are sent should not cause a problem for any server that is not making bad assumptions about how TCP works. It is entirely possible that a server makes the bad assumption that the first read of a chunk of bytes on a socket will contain the start line as well as the full set of headers. That is definitely not guaranteed to be the case, even if a client performs a send/write of all these bytes to the OS at the same time (there are many layers of buffering, parameters impacting the MTU for TCP chunks, etc.).
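To illustrate the point about not assuming the first read contains the whole head, here is a minimal sketch of how a receiver can buffer reads until the blank line that ends the headers. `new_head_reader` is a hypothetical helper written for this example, not part of luasocket or cosock, and the chunks are fed in by hand instead of coming from a real socket.

```lua
-- Accumulate arbitrary TCP reads until the blank line that ends the HTTP head.
-- `new_head_reader` is a hypothetical helper, not a luasocket/cosock API.
local function new_head_reader()
  local buffer = ""
  return function(chunk)
    buffer = buffer .. chunk
    -- The head ends at the first CRLF CRLF, wherever the reads were split.
    local _, head_end = buffer:find("\r\n\r\n", 1, true)
    if head_end then
      -- Return the complete head and any bytes that arrived after it.
      return buffer:sub(1, head_end), buffer:sub(head_end + 1)
    end
    return nil -- head not complete yet; wait for more bytes
  end
end

-- Simulate the three separate sends from the test suggested earlier.
local feed = new_head_reader()
local head, rest = feed("GET / HTTP/1.1\r\n")
print(head)                       -- still incomplete
head, rest = feed("Host: 0.0.0.0:8080\r\n")
print(head)                       -- still incomplete
head, rest = feed("\r\n")
print(head ~= nil, #rest)         -- head is complete once the blank line arrives
```

A server written this way behaves the same whether the head arrives in one segment or several.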

As mentioned by @nayelyz if the server is buggy in this way (and the root cause here isn’t some other issue) the fallback is probably to try to do the socket calls directly to have a better chance of chunking in the way that the server expects.
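As a rough sketch of that fallback, the whole head can be built as one string and handed to a single `send` call, which gives the best chance (though no guarantee) that it leaves in one segment. `build_get_request` and `send_all` are hypothetical helpers; the only socket method assumed is luasocket's `client:send(data, i)`, which returns the index of the last byte sent. A fake socket stands in for a real connection so the example runs anywhere.

```lua
-- Build the start line and Host header as one buffer.
-- Hypothetical helper for illustration only.
local function build_get_request(host, port, path)
  return table.concat({
    "GET " .. path .. " HTTP/1.1",
    "Host: " .. host .. ":" .. port,
    "", "", -- yields the CRLF after the last header plus the blank line
  }, "\r\n")
end

-- Send the whole buffer, resuming after the last byte sent if needed.
local function send_all(sock, data)
  local sent = 0
  while sent < #data do
    local last, err, partial = sock:send(data, sent + 1)
    if not last then
      return nil, err, partial -- partial = index of last byte actually sent
    end
    sent = last
  end
  return sent
end

-- Fake socket that records each write (stand-in for socket.tcp()).
local fake = { writes = {} }
function fake:send(data, i)
  table.insert(self.writes, data:sub(i))
  return #data
end

local request = build_get_request("0.0.0.0", 8080, "/")
print(send_all(fake, request), #fake.writes)
```

With a real connection you would pass the object returned by `socket.tcp()` after `connect` in place of `fake`.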

Hopefully that makes sense; we generally are trying to provide a very consistent implementation of APIs like luasocket for consistency and ease of testing but in this case it seems like it is possible that the timing of adjacent calls to send chunks via TCP through the additional layers involved with the sandboxed execution environment may result in slightly different behavior in terms of the segments emitted at the transport layer.

Hi Paul - Thanks very much for the response on this. I am still struggling with this problem, but here’s what I know so far:

  • I don’t think it’s a ‘chunking’ problem (although let’s be careful with that term since these HTTP requests do not include a body; we’re just talking about the startline and headers). The reason I don’t think it is, is because I have been able to capture tcpdumps of even a browser sending the data in separate TCP messages, and it still worked fine
  • I have bent over backwards to ensure that my headers are IDENTICAL to those sent by a browser. This is no easy task in Lua, as the socket.http module sticks in a ‘TE’ header, and you normally have no control over the ORDER in which the headers are sent (thank-you Lua tables :roll_eyes:). Wondering if the order made any difference, I patched the socket.http module to brute force the headers, including the order. Still no joy.
  • All this testing is done off-hub using Lua 5.3. So it’s not even really an Edge issue, it’s a Lua issue. I’ve also tried sending the same HTTP request from a Python script using the requests module, and IT works just fine. EDIT: I just tried using curl, and it works fine as well!
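On the header-order point: Lua's `pairs` gives no ordering guarantee for string keys, so a plain `headers` table cannot pin the order. When writing the request by hand, one way around this is to keep the headers as an array of name/value pairs. This is only a sketch; `format_headers` is a hypothetical helper and the header values are made up.

```lua
-- Serialize headers in exactly the order given.
-- `ordered` is an array of { name, value } pairs, so ipairs preserves order.
local function format_headers(ordered)
  local lines = {}
  for _, pair in ipairs(ordered) do
    lines[#lines + 1] = pair[1] .. ": " .. pair[2]
  end
  return table.concat(lines, "\r\n") .. "\r\n"
end

-- Example values are placeholders, not the headers from the actual capture.
local text = format_headers({
  { "Host", "192.168.1.50" },
  { "User-Agent", "example-client" },
  { "Accept", "*/*" },
})
io.write(text)
```

The resulting string can be appended to the start line and followed by one more `\r\n` to finish the head.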

So at this point I’m at a loss. I have all the tcpdump logs from each of these scenarios, so perhaps someone with more expertise could look at them and pinpoint the issue. If you have anyone on staff with those abilities, I’ll package them up and send them to the build@smartthings.com account. I haven’t yet because I wanted to keep testing in hopes I’d eventually figure it out; and again, it’s not an Edge issue per se.

Sorry for the late reply.
I’m working on an Awair Omni device.

Out of curiosity, when you get the ‘closed’ responses, what is going on at the other end? Is it receiving and acting on the requests? Is there any difference between response messages sent when you get ‘closed’ and when things look more conventional?

Perhaps most importantly … Do things actually make sense if you don’t consider ‘closed’ as an error?
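To make that question concrete: with luasocket-style sockets, `receive` returns `nil, "closed", partial` when the peer closes, and the partial data may be everything the device meant to send. Below is a minimal sketch that treats `"closed"` as end of stream instead of a failure. `read_until_closed` and `fake_socket` are hypothetical names, and the fake socket only imitates the `receive(n)` return convention.

```lua
-- Read fixed-size blocks until the peer closes, keeping any partial data.
local function read_until_closed(sock, block_size)
  local parts = {}
  while true do
    local data, err, partial = sock:receive(block_size)
    if data then
      parts[#parts + 1] = data
    else
      if partial and #partial > 0 then
        parts[#parts + 1] = partial
      end
      if err == "closed" then
        return table.concat(parts) -- peer finished: treat as end of stream
      end
      -- Any other error (e.g. "timeout") is reported with what was read so far.
      return nil, err, table.concat(parts)
    end
  end
end

-- Fake socket imitating receive(n): full blocks, then nil, "closed", partial.
local function fake_socket(payload)
  local pos = 1
  return {
    receive = function(_, n)
      if #payload - pos + 1 >= n then
        local out = payload:sub(pos, pos + n - 1)
        pos = pos + n
        return out
      end
      local partial = payload:sub(pos)
      pos = #payload + 1
      return nil, "closed", partial
    end,
  }
end

local body = read_until_closed(fake_socket("hello world"), 4)
print(body)
```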

I think the original poster was getting closed errors. I’m getting no return at all from the device at times, so the socket receive times out. I’m beginning to think perhaps my problem is due to a bug in the device firmware.

Ah right. I did see you mention the [string “socket”]:1410: closed errors, but missed that you had quickly moved on from those. There might perhaps be a line of enquiry for closed errors.

There is a separate issue with devices that quickly close the connection after sending data. What happens is that if you have a listening socket server and you get the connection request, by the time you do a getpeername to find out the IP address of the sender, you get an error that the connection is closed. You can proceed to receive the data, but unless it has self-identifying info in it, you won’t even know who it’s from!

Ah right, so a different scenario to the original post. I shall have to read more carefully.

Were either of you able to ever resolve the HTTP error you were getting? It turns out now I have a similar problem as well. Although my error is slightly different:

[string “socket”]:1410: closed

So weirdly, I’m getting the error ‘[string “socket”]:1410: Permission Denied’ when sending ‘http://192.168.15.108/helloWorld’ and I’m at a complete loss. A few more particulars:

  • Confirmed I have 41.09 firmware
  • Tried with multiple GET end points (all open/non authenticated)
  • Works great via browser, Postman, or Fiddler

It seems I’m running up against something on the hub itself but I just don’t know how to simplify any more to find out what’s going on.

Thanks in advance,

Mark

http vs https? Or missing port number?

http and default port 80. Basically I’m just copying what’s around the internet (including your stuff…grin)

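local http = require('socket.http')
local ltn12 = require('ltn12')
local log = require('log')

-- table holding the handlers (assumed here; defined elsewhere in the full driver)
local command_handlers = {}

-- refresh handler
function command_handlers.refresh(driver, device)
  log.debug(string.format("[%s] Calling refresh", device.device_network_id))

  -- success is true only when the device answered with HTTP 200
  local success, data = command_handlers.send_lan_command(
    'http://192.168.15.108',
    'GET',
    'helloWorld')

  device:online()
end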

------------------------
-- Send LAN HTTP Request
function command_handlers.send_lan_command(url, method, path, body)
  local dest_url = url..'/'..path
  --local dest_url = 'https://api.weather.gov/alerts/active?area=WA'
  local res_body = {}
  --local query = neturl.buildQuery(body or {})

  log.debug(string.format("Calling URL: [%s]", dest_url))

  -- HTTP Request
  local _, code = http.request({
    method=method,
    url=dest_url,
    sink=ltn12.sink.table(res_body),
    headers={
      ['Content-Type'] = 'application/x-www-form-urlencoded'
    }
  })

  log.debug(string.format("Returned HTTP Code: [%s]", code))

  -- Handle response
  if code == 200 then
    return true, res_body
  end
  return false, nil
end

It’s probably one of those super silly things, but I can’t track down docs for that error.
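One thing that can help when tracking down errors like this: in luasocket, the table form of `http.request` returns `1, code, headers, status` on success and `nil, err` on failure, so the second return value is either a numeric status or an error string such as `closed`. A small sketch of telling the two apart follows; `describe_result` is a hypothetical helper, and whether the hub reports `Permission Denied` through this return path or raises it is an assumption I have not verified.

```lua
-- Interpret the first two return values of http.request (table form).
-- ok is nil on transport failure; code is then an error string.
local function describe_result(ok, code)
  if ok == nil then
    return "transport error: " .. tostring(code)
  elseif code == 200 then
    return "success"
  end
  return "HTTP status " .. tostring(code)
end

print(describe_result(1, 200))
print(describe_result(1, 404))
print(describe_result(nil, "closed"))
```

Logging the first return value as well as `code` makes it obvious whether the request ever reached the device.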

Cheers

Did you define your driver config yaml file with LAN permissions?

permissions:
    lan: {}

That was it! I knew it was something silly… thanks so much.

I am seeing these errors as well. It seems to be related to the load of requests. If I hit an API that returns a large amount of data, or make a bunch of calls very rapidly, I will see these errors. For now, I am simply retrying the calls when I see this. The retry rarely works, probably because it is adding to the load. This started showing up for me when I integrated the Hue API, which can return thousands of lines of JSON from a single call.
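If immediate retries add to the load, spacing them out may help. Here is a minimal sketch of retrying with a doubling delay. `retry_with_backoff` is a hypothetical helper, the request and sleep functions are injected so the example runs without a hub, and passing something like `socket.sleep` as the sleep function in a real driver is an assumption on my part.

```lua
-- Call request() until it returns a non-nil result, doubling the wait each time.
-- opts.sleep must be a function taking seconds (injected for testability).
local function retry_with_backoff(request, opts)
  local attempts = opts.attempts or 3
  local delay = opts.initial_delay or 1
  local last_err
  for attempt = 1, attempts do
    local result, err = request()
    if result ~= nil then
      return result, attempt
    end
    last_err = err
    if attempt < attempts then
      opts.sleep(delay) -- back off before the next try
      delay = delay * 2
    end
  end
  return nil, last_err
end

-- Demo: a fake request that fails twice with "closed", then succeeds.
local calls, sleeps = 0, {}
local function flaky_request()
  calls = calls + 1
  if calls < 3 then return nil, "closed" end
  return "payload"
end

local result, used = retry_with_backoff(flaky_request, {
  attempts = 4,
  initial_delay = 1,
  sleep = function(seconds) sleeps[#sleeps + 1] = seconds end,
})
print(result, used, #sleeps)
```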

I’ve been able to use wireshark to trace these cases and from my instances it is one of two cases:

  1. I am sending data TO the device and it doesn’t like it for whatever reason and simply ignores the message and closes the socket

  2. The device is sending a message to me and immediately closes the socket as soon as it’s done transmitting, not waiting for any acknowledgement. With all the coroutine stuff going on in the hub, it seems that sometimes by the time my driver gets a chance to accept the connection and receive the data, the socket is already closed. The data is usually still available to read, but there is also a bug where getpeername fails so you can’t get the IP address of the sender (getting a socket doesn’t exist error - probably a separate issue).
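For the second case, a defensive pattern is to treat a failed `getpeername` as non-fatal and still read whatever data is waiting. The sketch below uses a fake client object so it runs anywhere. `handle_connection` is a hypothetical helper, and the only methods assumed are the luasocket-style `getpeername`, `receive("*a")`, and `close`.

```lua
-- Accept-side handler that tolerates a peer which has already closed.
local function handle_connection(client)
  -- pcall guards against getpeername raising; a nil ip means it returned an error.
  local ok, ip, port = pcall(client.getpeername, client)
  if not ok or ip == nil then
    ip, port = "unknown", nil
  end
  -- "*a" reads until the peer closes, which has already happened in this case.
  local data, err, partial = client:receive("*a")
  client:close()
  return { ip = ip, port = port, data = data or partial, err = err }
end

-- Fake client imitating a device that sent data and closed immediately.
local fake_client = {
  getpeername = function() return nil, "closed" end,
  receive = function() return "sensor=42" end,
  close = function() end,
}

local info = handle_connection(fake_client)
print(info.ip, info.data)
```

If the sender cannot be identified this way, the payload itself has to carry some identifying field, as noted above.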

To clarify, I am seeing it with HTTP GET requests on the hub. To see this, I am not doing any socket work on my own. I have not been able to replicate this with curl or any other desktop GET requests against that same API.

Which HTTP library are you using? The stock one is indeed a bit flakey sometimes where a request can fail that, just like you said, works fine from a browser or curl (or python, etc.)

Here’s another library you might try: SmartThingsEdgeDrivers/http.lua at main · FreeMasen/SmartThingsEdgeDrivers · GitHub

Just the socket.http available on the hub. I’ll have to give it a shot.