[ST Edge] - Fatal error from socket.http

Fatal error from socket

This error can occur when using http.request from socket.http. It doesn’t occur at the point of the call, but rather some random time later after an http request fails - most probably due to timeout.

FATAL Web Requestor S  Lua: runtime error: [string "socket"]:205: received data on non-existent socket: tcp_send_ready, 733a34de-e03d-4859-98e7-016e14f5c97b
stack traceback:
	[C]: in function 'error'
	[string "socket"]:205: in upvalue 'socket_receive'
	[string "socket"]:1461: in field 'select'
	[string "cosock"]:303: in field 'run'
	[string "st.driver"]:745: in method 'run'
	[string "init.lua"]:445: in main chunk

If data is received on a nonexistent socket, I would think it should not cause a fatal error, but rather should just be ignored.

Hey, @TAustin

Thanks for highlighting this issue!

To have a reference on the configuration of the request, can you please provide the implementation snippet of the socket.http module?

Hi, @TAustin.
About your comment here:

Did you share with Erick the implementation snippet as he requested in the post above?

I don’t think I ever did! But then again I don’t know what he’s wanting… this?

local cosock = require "cosock"
local socket = require "cosock.socket"

The error happens randomly and no way for me to tie it to a particular line in my code. (as I recall, line 445 is the driver:run() method

Missed the http module; I used to use this:

local http = cosock.asyncify "socket.http"

but have since switched to the more conventional:

local http = require "socket.http"

And I set a 3 second timeout on the socket:

http.TIMEOUT = 3

So, it doesn’t happen all the time? Perhaps you can share the code by sending us an email to build@smartthings.com so we can check it. The engineering team also needs a reference to see if the errors are related.
About the release of the 0.41 version, there’s no ETA yet.

No, this one is quite random and I have no way to consistently recreate it. I’ve seen it happen some time after an http request times out because the receiver isn’t responding. It most likely seems to happen when data is received after a timeout and after the http socket has been closed. If that happens, the data should be ignored rather than throwing a fatal error and stopping the driver. No matter what the cause, the engineering team should look at the code that is throwing this error and simply handle the case better by just throwing away the data and showing a warning message.

This is a good reference to check the issue. Ok, I’ll try to replicate it but I’ll report it either way.
It would be very helpful that you shared with me more details (over DM for security) like driver Id and the last timestamp you saw the error.

OK, next time I see it I’ll provide more info if I can. I have other people testing my Roku driver, and this is one of the reasons why they are finding that it keeps hanging up. (this, and the other socket bug that Eric said would be fixed). Because of these two socket-related bugs, I’m having to suspend work on on my Roku driver until these problems are fixed, because they make LAN drivers too unreliable for general use right now.

Thanks for sharing this, I’ll add it to the report I made and I’ll let you know their feedback :smiley:

I had a LAN driver crash and spontaneously restart in the following scenario:

Following 2 HTTP requests that resulted in timeouts (and concluded normally), a third HTTP request was then sent to an IP that exists and a port that doesn’t exist. As expected, it returned a ‘connection refused’ and my function appeared to complete normally. However about 11 seconds later, I got the following error:

WARN WLED Driver received data on non-existent socket: tcp_send_ready, bb07f167-e9c2-418d-9160-16bde5257fb4

This mesage shows some progress, since this error used to be fatal and would halt the driver, so thank you for addressing that one. Unfortunately this was immediately followed by a new fatal error I haven’t seen before:

FATAL WLED Driver Lua: runtime error: [string “socket”]:1469: socket_receive returned no error and no socket

And this error caused the driver to spontaneously restart. Fortunately the restart was successful and the driver seemed to continue normally.


Firmware version: 000.041.00005
Here is the log showing the sequence:

2021-12-29T22:45:27.886579515+00:00 DEBUG WLED Driver  Sending:	http://192.168.100.5:80/win&T=0&SB=255&HU=62986&SA=87&FX=8&RV=0&SS=0&SV=1
2021-12-29T22:45:30.906887264+00:00 INFO WLED Driver  response code=<[string "socket"]:1407: timeout>, status=<nil>
2021-12-29T22:45:30.918603972+00:00 WARN WLED Driver  HTTP request timed out: 	192.168.100.5:80
2021-12-29T22:45:30.922071431+00:00 ERROR WLED Driver  HTTP Request failed:	Timeout
2021-12-29T22:45:30.925643847+00:00 DEBUG WLED Driver  WLED Driver device thread event handled
2021-12-29T22:46:27.913005854+00:00 TRACE WLED Driver  Received event with handler capability
2021-12-29T22:46:27.936463396+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> received command: {"args":[],"capability":"switch","positional_args":[],"command":"on","component":"main"}
2021-12-29T22:46:27.940122104+00:00 TRACE WLED Driver  Found CapabilityCommandDispatcher handler in thisDriver
2021-12-29T22:46:27.946436396+00:00 INFO WLED Driver  Switch pressed
2021-12-29T22:46:27.968262479+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> emitting event: {"component_id":"main","attribute_id":"level","capability_id":"switchLevel","state":{"value":100}}
2021-12-29T22:46:27.980454146+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> emitting event: {"component_id":"main","attribute_id":"switch","capability_id":"switch","state":{"value":"on"}}
2021-12-29T22:46:28.000097187+00:00 INFO WLED Driver  COMMAND STRING TO SEND:	/win&T=1&SB=255&HU=62986&SA=87&FX=8&RV=0&SS=0&SV=1
2021-12-29T22:46:28.003677187+00:00 DEBUG WLED Driver  Sending:	http://192.168.100.5:80/win&T=1&SB=255&HU=62986&SA=87&FX=8&RV=0&SS=0&SV=1
2021-12-29T22:46:31.033101896+00:00 INFO WLED Driver  response code=<[string "socket"]:1407: timeout>, status=<nil>
2021-12-29T22:46:31.044338521+00:00 WARN WLED Driver  HTTP request timed out: 	192.168.100.5:80
2021-12-29T22:46:31.047928271+00:00 ERROR WLED Driver  HTTP Request failed:	Timeout
2021-12-29T22:46:31.051516729+00:00 DEBUG WLED Driver  WLED Driver device thread event handled
2021-12-29T22:46:58.392212733+00:00 TRACE WLED Driver  Received event with handler device_lifecycle
2021-12-29T22:46:58.399295608+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> received lifecycle event: infoChanged
2021-12-29T22:46:58.419217066+00:00 TRACE WLED Driver  Found DeviceLifecycleDispatcher handler in thisDriver
2021-12-29T22:46:58.425574691+00:00 DEBUG WLED Driver  Info changed handler invoked
2021-12-29T22:46:58.428947649+00:00 DEBUG WLED Driver  Time since last info_changed:	100.65799999237
2021-12-29T22:46:58.432145191+00:00 INFO WLED Driver  Host address changed to: 	192.168.1.5:8099
2021-12-29T22:46:58.438726066+00:00 DEBUG WLED Driver  WLED Driver device thread event handled
2021-12-29T22:47:03.078251150+00:00 TRACE WLED Driver  Received event with handler capability
2021-12-29T22:47:03.087437608+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> received command: {"args":[],"capability":"switch","positional_args":[],"command":"off","component":"main"}
2021-12-29T22:47:03.094667150+00:00 TRACE WLED Driver  Found CapabilityCommandDispatcher handler in thisDriver
2021-12-29T22:47:03.098185525+00:00 INFO WLED Driver  Switch pressed
2021-12-29T22:47:03.107869858+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> emitting event: {"component_id":"main","attribute_id":"switch","capability_id":"switch","state":{"value":"off"}}
2021-12-29T22:47:03.128335942+00:00 INFO WLED Driver  COMMAND STRING TO SEND:	/win&T=0&SB=255&HU=62986&SA=87&FX=8&RV=0&SS=0&SV=1
2021-12-29T22:47:03.131831400+00:00 DEBUG WLED Driver  Sending:	http://192.168.1.5:8099/win&T=0&SB=255&HU=62986&SA=87&FX=8&RV=0&SS=0&SV=1
2021-12-29T22:47:06.166198150+00:00 INFO WLED Driver  response code=<[string "socket"]:1407: timeout>, status=<nil>
2021-12-29T22:47:06.169904484+00:00 WARN WLED Driver  HTTP request timed out: 	192.168.1.5:8099
2021-12-29T22:47:06.173388817+00:00 ERROR WLED Driver  HTTP Request failed:	Timeout
2021-12-29T22:47:06.176894734+00:00 DEBUG WLED Driver  WLED Driver device thread event handled
2021-12-29T22:47:25.174158486+00:00 TRACE WLED Driver  Received event with handler device_lifecycle
2021-12-29T22:47:25.184917236+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> received lifecycle event: infoChanged
2021-12-29T22:47:25.218468361+00:00 TRACE WLED Driver  Found DeviceLifecycleDispatcher handler in thisDriver
2021-12-29T22:47:25.221785778+00:00 DEBUG WLED Driver  Info changed handler invoked
2021-12-29T22:47:25.225088694+00:00 DEBUG WLED Driver  Time since last info_changed:	26.796000003815
2021-12-29T22:47:25.231790861+00:00 INFO WLED Driver  Host address changed to: 	192.168.1.104:8099
2021-12-29T22:47:25.238544778+00:00 DEBUG WLED Driver  WLED Driver device thread event handled
2021-12-29T22:47:28.875946361+00:00 TRACE WLED Driver  Received event with handler capability
2021-12-29T22:47:28.891121778+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> received command: {"args":[],"capability":"switch","positional_args":[],"command":"on","component":"main"}
2021-12-29T22:47:28.894673570+00:00 TRACE WLED Driver  Found CapabilityCommandDispatcher handler in thisDriver
2021-12-29T22:47:28.906070778+00:00 INFO WLED Driver  Switch pressed
2021-12-29T22:47:28.911042528+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> emitting event: {"component_id":"main","attribute_id":"level","capability_id":"switchLevel","state":{"value":100}}
2021-12-29T22:47:28.922092570+00:00 INFO WLED Driver  <Device: 9b05641e-372d-4020-b390-7ac7f48049cd (WLED Driver)> emitting event: {"component_id":"main","attribute_id":"switch","capability_id":"switch","state":{"value":"on"}}
2021-12-29T22:47:28.941473236+00:00 INFO WLED Driver  COMMAND STRING TO SEND:	/win&T=1&SB=255&HU=62986&SA=87&FX=8&RV=0&SS=0&SV=1
2021-12-29T22:47:28.947904778+00:00 DEBUG WLED Driver  Sending:	http://192.168.1.104:8099/win&T=1&SB=255&HU=62986&SA=87&FX=8&RV=0&SS=0&SV=1
2021-12-29T22:47:28.975895861+00:00 INFO WLED Driver  response code=<[string "socket"]:1407: connect failed (192.168.1.104:8099): Os { code: 111, kind: ConnectionRefused, message: "Connection refused" }>, status=<nil>
2021-12-29T22:47:28.988597945+00:00 WARN WLED Driver  Connection refused: 	192.168.1.104:8099
2021-12-29T22:47:28.992287903+00:00 ERROR WLED Driver  HTTP Request failed:	Refused
2021-12-29T22:47:28.995861903+00:00 DEBUG WLED Driver  WLED Driver device thread event handled
2021-12-29T22:47:40.461879196+00:00 WARN WLED Driver  received data on non-existent socket: tcp_send_ready, bb07f167-e9c2-418d-9160-16bde5257fb4
2021-12-29T22:47:40.473654279+00:00 FATAL WLED Driver  Lua: runtime error: [string "socket"]:1469: socket_receive returned no error and no socket
stack traceback:
	[string "socket"]:1469: in field 'select'
	[string "runner/src/packagesearcher.rs:21:10"]:303: in field 'run'
	[string "runner/src/packagesearcher.rs:21:10"]:743: in method 'run'
	[string "init.lua"]:462: in main chunk

An additional thought: I seem to recall that these type of errors often show up after HTTP timeouts (sent request to an address that doesn’t exist). It seems something doesn’t get cleaned up correctly in the socket code and eventually causes this issue.

Hi, Todd.

Thank you for sharing your findings, is this the same driver you shared last time?
In the report I made, the engineering team mentioned there was a fix to show the error message instead of dropping the driver and another for the root cause. I will add your comments there to get some feedback about the progress.

No, different one.

Yes, looks like they handled the error with just a warning now and try to continue, but there are still down-stream problems.