I believe I found a bug in the luncheon library that can cause a hang. Here is the setup:
‘Connection’ header is either ‘keep-alive’ or not defined (implicit keep-alive)
‘Content-Length’ header is missing
'Transfer-Encoding` is not chunked
function HttpMessage:body_type() will output a content type of ‘closed’
When function HttpMessage:fill_body() is called, it will assume the connection is closed and try to read all content. Note the comment in the existing code before it calls fill_closed_body:
function HttpMessage:fill_body()
...
elseif ty.type == "close" then
-- We only want to read until socket closing if the socket
-- will actually close, otherwise it will hang. The lack of
-- a content-length header is not enough of a clue as the
-- socket may be setup for keep-alive.
body, err, partial = self:fill_closed_body()
if not body then
self:_push_body(partial)
return err
end
else
...
end
I think the library needs to be patched with something like this. I included both raw code and a visual diff. I plan to create a platform patch in my driver, but would like confirmation from engineering that they also see this as a problem to hang when talking to a misbehaving device.
function HttpMessage:get_connection()
if not self._parsed_headers then
local err = HttpMessage.fill_headers(self, "headers")
if err then return nil, err end
end
if not self._connection then
local conn = self.headers:get_one("connection")
if not conn then
return
end
self._connection = conn:lower()
end
return self._connection
end
function HttpMessage:body_type()
local len, headers, enc, err
len, err = self:get_content_length()
if not len and err then
return nil, err
end
if len then
return {
type = "length",
length = len
}
end
headers, err = self:get_headers()
if not headers then
return nil, err
end
enc = headers:get_all("transfer_encoding")
local ty = "close"
if not enc then
-- Check for keep-alive without message boundary indication
if self:get_connection() ~= "close" then
return nil, "Protocol violation: keep-alive response without Content-Length or Transfer-Encoding"
end
return {
type = ty,
}
end
for _, v in ipairs(enc) do
if HttpMessage.includes_chunk_encoding(v) then
ty = "chunked"
break
end
end
-- If we have transfer encodings but none are chunked, check for keep-alive violation
if ty == "close" then
if self:get_connection() ~= "close" then
return nil, "Protocol violation: keep-alive response without Content-Length or Transfer-Encoding"
end
end
local trailers = false
if ty == "chunked" then
local trailer = headers:get_all("trailer")
trailers = trailer and #trailer > 0 and #trailer[1] > 0
end
return {
type = ty,
trailers = trailers
}
end
Code above is a proposed fix in the platform lib itself. Going with this for an actual monkey patch in the driver:
--- Compatibility patches for platform libraries
--- This module applies monkey patches to fix platform issues based on API version
local log = require "log"
--- Apply HTTP protocol compliance patches
local function apply_http_patches()
local HttpMessage = require "luncheon.http_message"
-- Add get_connection method if it doesn't exist
if not HttpMessage.get_connection then
function HttpMessage:get_connection()
if not self._parsed_headers then
local err = HttpMessage.fill_headers(self, "headers")
if err then return nil, err end
end
if not self._connection then
local conn = self.headers:get_one("connection")
if not conn then
return
end
self._connection = conn:lower()
end
return self._connection
end
end
-- Save original body_type method
local original_body_type = HttpMessage.body_type
-- Override body_type with protocol violation checks
function HttpMessage:body_type()
-- First call the original method to get the body type
local result, err = original_body_type(self)
if not result or err then
return nil, err
end
-- If the result is "close" type and connection is keep-alive (default in HTTP/1.1), it's a protocol violation
if result.type == "close" and self:get_connection() ~= "close" then
return nil, "Protocol violation: keep-alive response without Content-Length or Transfer-Encoding"
end
-- Otherwise return the original result
return result
end
log.info("Applied HTTP protocol compliance patches")
end
--- Apply all compatibility patches for current API version
local function apply_patches(api_version)
-- For now, we only have HTTP patches, but this can be extended
-- based on API version as needed
apply_http_patches()
end
return {
apply_patches = apply_patches,
apply_http_patches = apply_http_patches
}
Thank you for your well laid out investigation and request.
My conclusion is that the Luncheon library is handling this situation correctly. Rather than change the library, you can protect from it by setting a timeout on the underlying socket, so that when receiving the close delimited response a timeout occurs rather than a hang. Although, feel free to package whatever you want in your drivers, since it seems like your mitigation works for your situation.
Luncheon testing
I did some some testing with the server/client example implementations in the Luncheon repo to see how the client reacts to the situation, and was able to see the hang you mention; it is prevented with a socket timeout.
How its supposed to work
The RFC 9112 for HTTP/1.1 addresses this situation in section 6.3 with rule 8 for determining the HTTP message body length:
“Otherwise, this is a response message without a declared message body length, so the message body length is determined by the number of octets received prior to the server closing the connection.”
And it warns that this is unreliable and should not be done:
“Since there is no way to distinguish a successfully completed, close-delimited response message from a partially received message interrupted by network failure, a server SHOULD generate encoding or length-delimited messages whenever possible. The close-delimiting feature exists primarily for backwards compatibility with HTTP/1.0.”
Thank you for the thorough response. I see that logic as well and a timeout certainly mitigates it on my end. My suggestion was to try and bomb proof luncheon, but the spec does allow for reading the octets as you pointed out. Not sure what you could do with them, so my solution was to fail early. I’ll direct other suggestions directly in the repo as well to prevent needing to copy code back and forth. Thanks.
Looks like the main point of emphasis with luncheon’s handling is this. I assumed the code meant:
When in actually it is not assuming that is IS closed, just that it WILL close at some point.
To go back to my comments above though, unless connection=close, I still think we have an issue. If it is undefined (keep-alive by default) or keep-alive then we know it will not close for this request. A timeout will mitigate it, but we also know a close delimited stream isn’t possible given the connection header. I’m basically advocating for enforcement of this. It could apply to HTTP/1.1 specifically. I’m happy to offer a PR if you are open to it.
RFC 9112, Section 9.6: “A server SHOULD send the ‘close’ connection option in any HTTP/1.1 message that is immediately followed by a close of the connection.”
Implication: If a response is close-delimited (i.e., no Content-Length or Transfer-Encoding, so the body ends when the connection closes), the server should include Connection: close in the response headers. This signals to the client that the connection will close immediately after the message.
If it is undefined (keep-alive by default) or keep-alive then we know it will not close for this request.
This isnt true. The RFC allows the connection used in the request to be terminated with a close delimited response.
Regarding section 9.6, “A server MUST send the ‘close’ connection option in any HTTP/1.1 message that is immediately followed by a close of the connection.” I read and re-read that section, and do not see this. You’re implication is directly contradicted by the first part of that section which says: “A sender SHOULD send a Connection header field (Section 7.6.1 of [HTTP]) containing the “close” connection option when it intends to close a connection.” Not a MUST.
I think the most pertinent part about how keep alive works for this situation is section 9.3, which is explicit in that the client must read the whole message which is delimited by the socket closing:
In order to remain persistent, all messages on a connection need to have a self-defined message length (i.e., one not defined by closure of the connection), as described in Section 6. A server MUST read the entire request message body or close the connection after sending its response; otherwise, the remaining data on a persistent connection would be misinterpreted as the next request. Likewise, a client MUST read the entire response message body if it intends to reuse the same connection for a subsequent request.
Also, now that I have learned about these close delimited responses, I dont see how we can include your fix and still allow for close delimited responses to be received. The assumption that it will close is what the RFC specifies we should assume.
@carter.swedal Thanks for the correction and for taking the time to review. You’re right—I misread the RFC, and close-delimited responses are allowed on keep-alive connections with server closure. Appreciate the spec clarification!