Xpcall not working in Edge

I am attempting to use xpcall inside of an Edge driver. It is not working as expected. pcall seems to work fine, but xpcall produces the message “error in error handling”.

Pcall Code

local returned, data = pcall(function()
    x("This will error")
end)

print(returned, data)

Pcall Local

false    .../smartthings/tempCodeRunnerFile.lua:2: attempt to call a nil value (global 'x')

Pcall Hub

false   [string "test.lua"]:34: attempt to call a nil value (global 'x')

Xpcall Code

local returned, data = xpcall(function()
    x("This will error")
end, function(err)
    print("Error: ", err)
end)

print(returned, data)

Xpcall Local

Error: 	.../smartthings/tempCodeRunnerFile.lua:2: attempt to call a nil value (global 'x')
false	nil

Xpcall Hub

false    error in error handling

Hi, @blueyetisoftware, so sorry for the delay. I’ll ask the team about this and let you know.

Thanks. I appreciate it. I know there are a lot of questions. Feels like I am beta testing the SDK.

Hi, @blueyetisoftware
Following up: there’s a lot of theory behind why this doesn’t work, but the short version is that printing the error as shown in your implementation is causing the issue.

If it’s important for you to print it in the logs, a potential workaround is to use cosock.channel instead of printing the message directly. The “send”/“tx” side of a cosock channel doesn’t yield; only the receiver does. So you could call :send(err) from the xpcall error handler and receive it elsewhere in a coroutine, where you can safely call print.
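As a rough illustration of the “send now, print later” idea, here is a minimal sketch in plain Lua. It uses an ordinary table as a stand-in for the cosock channel so that it runs anywhere; pending_errors, report_error, and drain_errors are hypothetical names, not part of cosock or the Edge SDK. In a driver, the table insert would presumably become a :send(err) on the channel, and the drain loop would live in a coroutine that receives from it.

```lua
-- Stand-in for the channel: the handler only appends, it never prints or yields.
local pending_errors = {}

local function report_error(err)
  pending_errors[#pending_errors + 1] = tostring(err)
  return err
end

-- Call this from normal (yield-safe) code, outside the xpcall handler.
local function drain_errors(log)
  local count = #pending_errors
  for i = 1, count do
    log("Error: " .. pending_errors[i])
    pending_errors[i] = nil
  end
  return count
end

local ok, err = xpcall(function()
  x("This will error") -- 'x' is an undefined global, as in the original test
end, report_error)

-- Logging happens here, after xpcall has returned.
local drained = drain_errors(print)
```

The point of the split is that the error handler itself does nothing that could yield; the logging is deferred to code that runs in an ordinary context.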

Which errors are you trying to catch using xpcall?

I discovered it while testing the EventSource implementation in your Hue driver. It allows you to create callbacks to handle errors and messages (onerror, onmessage, onopen). If the user’s callback code has an issue, it will crash the event source, since the event source is calling the user’s code. This crashes the driver and causes the driver to restart. I have my own EventSource implementation with similar callbacks. In my implementation, I wrapped the callbacks with xpcall so that I could safely call them. In my error handler for xpcall, I log the error.

Edit: can we safely call coxpcall?

Hey Blue

First of all, thank you for reporting this, there appears to be a breakdown in our ergonomics/documentation.

The limitation here is that Lua specifies a few places where it isn’t safe to use coroutine.yield; these include metamethods on tables and the error callback provided to xpcall. At this point the Lua asynchronous model isn’t doing us any favors, since we don’t have a clear indication of which calls are going to yield. As Nayely described, calling print on our platform ends up calling yield, which unfortunately wasn’t clear, as illustrated by your test examples.
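The restriction can be reproduced off the hub. The sketch below, assuming a stock Lua 5.3 or 5.4 interpreter, runs xpcall inside a coroutine and yields from the error handler, which is roughly what print does on the platform according to the explanation above. yielding_handler is a made-up name for illustration.

```lua
-- A handler that yields, standing in for a platform call that yields internally.
local function yielding_handler(err)
  coroutine.yield("log: " .. tostring(err))
  return err
end

local co = coroutine.create(function()
  -- The body fails, so Lua invokes the handler in a context where
  -- yielding is not allowed.
  return xpcall(function()
    error("boom")
  end, yielding_handler)
end)

-- resumed reports whether the coroutine itself ran without raising;
-- ok and msg are the values returned by xpcall.
local resumed, ok, msg = coroutine.resume(co)
print(resumed, ok, msg)
```

On those interpreters xpcall should report failure, and the original “boom” message is lost because the handler itself could not complete.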

can we safely call coxpcall?

The answer here is unfortunately “it depends”. Since coxpcall doesn’t provide any coroutine execution in its error handler (like it does for the first callback argument), it is only safe if you can avoid any calls to yield. We are going to explore internally how we can make this experience better for developers in a future release, but in the meantime it might be easier to define your own implementation of xpcall that executes the handler inside of the cosock context.

local function myxpcall(f, err, ...)
  local results = table.pack(pcall(f, ...))
  -- calling f was successful, return its results (without the status flag)
  if results[1] == true then
    return table.unpack(results, 2, results.n)
  end
  -- calling f was unsuccessful and the handler is invalid
  if type(err) ~= "function" then
    return table.unpack(results, 2, results.n)
  end
  -- calling f was unsuccessful and the handler is valid, call the handler
  return err(table.unpack(results, 2, results.n))
end

Since the execution here fully moves out of Lua’s error handling and back into our coroutine, we should be safe in calling the provided error handling function.
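If call sites rely on the usual xpcall return convention (a leading true or false), the same idea can be wrapped so that the status is preserved. This is only a sketch built on the snippet above; safe_xpcall and on_message are hypothetical names. One trade-off to keep in mind: because the handler runs after pcall has unwound the stack, a traceback taken inside it will not show the original failure point.

```lua
-- Like xpcall, but the handler runs after pcall returns, i.e. outside Lua's
-- error-handling context, so it may call functions that yield.
local function safe_xpcall(f, handler, ...)
  local results = table.pack(pcall(f, ...))
  if results[1] then
    -- success: true followed by f's return values
    return table.unpack(results, 1, results.n)
  end
  if type(handler) ~= "function" then
    -- no usable handler: behave like pcall
    return false, results[2]
  end
  -- failure: false followed by whatever the handler returns
  return false, handler(results[2])
end

-- Guarding a user-supplied callback (hypothetical example).
local function on_message(msg)
  return msg.data:upper() -- fails when msg.data is nil
end

local ok1, value = safe_xpcall(on_message, print, { data = "hello" })
local ok2, handled = safe_xpcall(on_message, function(err)
  print("Error: ", err)
  return "handled"
end, {})
```

Here the first call succeeds and passes the callback’s result through, while the second fails inside the callback and is routed to the handler without taking down the caller.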

To be clear, the xpcall semantics are not the only place where this can crop up. For example, if we try to call print from a metamethod, we end up with a somewhat more helpful error message:

local T = {}
T.__index = T
T.__gc = function(t)
    print("cleaning up T:", t.id)
end
function T.new()
    return setmetatable({id = math.random()}, T)
end


function make_and_drop()
    local t = T.new()
end

for i=1, 10 do
    make_and_drop()
end

print(collectgarbage("collect"))

> lua ./meta-method-problem.lua
lua: error in __gc metamethod (attempt to yield from outside a coroutine)

I hope that has answered your questions.

Thank you for the thorough answer. It also explains why pcall is fine, since an error handler is not executed.