Try_create_device: though device may be "added", not always "init"ed

rossetyler · January 12, 2022, 8:47pm

I am creating a Legrand RFLC edge driver for my SmartThings hub that will mDNS-discover neighboring Legrand LC7001 hubs on its LAN. For each LC7001 discovered, my LC7001 sub_driver handles its lifecycle init. Subsequent device discoveries will find each RF Switch and Dimmer that the LC7001 controls and request a device be created (try_create_device) for each of them. These are handled by my Switch and Dimmer sub_drivers.

Not all of these device creation requests are honored. That is, not all result in a device lifecycle “added” event. For those that are, I will see these devices in the SmartThings app. Unfortunately, even for those that are “added”, not all of them are "init"ed. My lifecycle init handling is where I device:emit_event the current device state. Since this is not done, all these devices in the app say “Checking…” and, since I have not accommodated for them, cannot be controlled.

The current documentation suggests that “added” is called only when first added but that “init” is called afterwards and always during device initialization. So, I am expecting to always handle a lifecycle init event but it does not always come. Why?

Also, in the SmartThings IDE, these don’t show up as “local” devices (Execution location is “cloud”). Why? Making these execute locally on the SmartThings hub is the whole point.

JDRoberts · January 12, 2022, 8:51pm

I can’t answer any of the other questions, but the IDE will be going away when the Groovy cloud goes away and it has not been updated to work with Edge drivers. So Edge drivers just show up as “placeholders” and their method of running locally is not the same as in the previous architecture. So not being marked local in the IDE doesn’t have any meaning as far as Edge drivers go.

philh30 · January 12, 2022, 8:59pm

Are you sure each device you’re trying to create is getting a unique device ID?

Init is also called whenever the driver restarts, including when the hub is rebooted or the driver is updated, so you have other options for querying the device for an initial state. It’s also helpful to include the refresh capability since you can trigger it with a swipe down even when the other device controls are locked.

rossetyler · January 12, 2022, 10:22pm

Yes. I identify each device with a unique device_network_id. For the LC7001 hubs, I use their MAC address. The lights supported by a hub are identified by a unique ordinal number. For these, I concatenate the hub MAC address with the ordinal number.
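
A minimal sketch of the identifier scheme described above. The helper name, the MAC normalization, the ":" separator, and the sample MAC address are all assumptions for illustration; the post only says that the hub MAC address and the ordinal number are concatenated.

```lua
-- Hypothetical helper: derive a light's device_network_id from the MAC
-- address of its LC7001 hub and the light's ordinal number on that hub.
local function light_network_id(hub_mac, ordinal)
    -- Assumption: strip separators and upper-case the MAC so that the same
    -- hub always produces the same prefix, however the MAC was reported.
    local mac = hub_mac:gsub("[:%-]", ""):upper()
    -- Assumption: a ":" keeps the MAC and the ordinal visually distinct.
    return string.format("%s:%d", mac, ordinal)
end

-- Made-up MAC address, for illustration only.
print(light_network_id("00:26:ec:01:02:03", 7))  -- prints 0026EC010203:7
```

Whatever the exact format, the result only needs to be stable across discovery scans and unique across all hubs and lights.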

The device “added” has an id (UUID) that, I assume, is unique (it was just “added”, right?).

My “init” lifecycle handler seems to be the appropriate place to associate my value with the SmartThings device. Unfortunately/inappropriately, it is not being called reliably.

rossetyler · January 15, 2022, 10:42pm

This is still a problem.

Rebooting the hub does seem to “init” all of the previously “added” devices, some of which were not "init"ed immediately after being “added” (which is the problem). A reboot should not be required.

TAustin · January 16, 2022, 7:39am

Take a hard look at your channel handlers or timer routines to make sure they aren’t blocking for any length of time. Sounds like the driver thread may be starved.

Don’t loop on socket receives; process one message and exit and let the cosock select call your channel handler again.

Had exact same problem as this with a UPnP driver that was discovering and creating multiple devices in quick succession. Monitoring busy multicast addresses can do this.
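
The "process one message and exit" advice above can be sketched roughly as follows. The fake socket is a hypothetical stand-in so that the example runs outside a hub; in a real driver the socket would come from cosock and the handler would be invoked whenever the socket becomes readable.

```lua
-- Hypothetical stand-in for a cosock socket: hands out queued messages one
-- at a time and reports "timeout" when none are left.
local function fake_socket(messages)
    return {
        receive = function(self)
            if #messages == 0 then return nil, "timeout" end
            return table.remove(messages, 1)
        end,
    }
end

local handled = {}

-- Handle exactly one message per invocation, then return so that other
-- work (such as lifecycle events) gets a chance to run. Any remaining
-- messages are left for the next invocation.
local function on_readable(sock)
    local message, err = sock:receive()
    if not message then return nil, err end
    handled[#handled + 1] = message
    return true
end

local sock = fake_socket({"first", "second"})
on_readable(sock)
print(#handled)  -- prints 1: "second" waits for the next call
```

The contrast is with a handler that loops on receive until the socket is drained, which on a busy multicast address may never return.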

rossetyler · January 17, 2022, 8:35pm

I am not blocking anywhere in my code (who knows what happens in try_create_device). All of my potentially blocking calls (cosock receives) are guarded with cosock.socket.select to ensure that they are ready first.

There are two problems here:

1. Not all try_create_device attempts succeed (actually create a device).
2. For those that do succeed (are lifecycle “added”), not all are lifecycle "init"ed.

I was, certainly, blasting out many try_create_device attempts at once (over 30) and experienced this bad behavior.

Now, before attempting another try_create_device, I ensure that the last one succeeded (was lifecycle "init"ed). The round-trip time slows down creation attempts substantially but, so far, I am able to discover all of my devices and create them successfully in one scan.

TAustin · January 17, 2022, 11:29pm

I feel your pain. I’ve had both these exact issues at one point or another during rapid discovery of multiple devices. If you’ve got your stuff on github I’d be happy to try and find anything obvious. And yes, I think at one point I tried putting a 1-second sleep between each device creation to try and alleviate the problem.

rossetyler · January 17, 2022, 11:51pm

Thanks.

I tried cosock.socket.sleep in all sorts of places with all sorts of values.
I really hate that kind of kludgy solution.
What I have now works better than anything else that I have tried.

rossetyler · January 18, 2022, 5:59pm

Solution is simpler now.
There does not seem to be a need to run a separate build thread during discovery.
try_create_device is run either

- immediately upon discovery, if the last one has already completed being built, or
- immediately after the last one does complete being built (at the end of its lifecycle init).

TAustin · January 18, 2022, 11:28pm

How are you determining the status of the prior discovered device init completion?

rossetyler · January 19, 2022, 12:38am

For example, my DIMMER sub_driver, on a lifecycle init, creates a Dimmer Adapter for the device and adds it to those already built.

lifecycle_handlers = {init = function(driver, device) built:add(Dimmer(driver, device)) end}

built:add remembers this adapter and emits an indication that this event has occurred

add = function(self, adapter)
    local device_network_id = adapter.device.device_network_id
    self.adapter[device_network_id] = adapter
    self:_emit(device_network_id, adapter)
end,

built:_emit calls any handler registered for this event once (automatically un-registers it)

_emit = function(self, device_network_id, adapter)
    local handler = self._handler[device_network_id]
    if handler then
        handler(adapter)
        self._handler[device_network_id] = nil
    end
end,

A handler for this event might have been registered once in a built:after call …

_once = function(self, device_network_id, handler)
    self._handler[device_network_id] = handler
end,

after = function(self, device_network_id, handler)
    if not device_network_id then
        handler()
    else
        local adapter = self.adapter[device_network_id]
        if adapter then
            handler(adapter)
            return adapter
        end
        self:_once(device_network_id, handler)
    end
end,

… which ensures that the handler is either called immediately or once after this event happens.
built:after is called when the need for a build is discovered.

local last_device_network_id
local function build(device_network_id, model, label, parent_device_id)
    built:after(last_device_network_id, function()
        driver:try_create_device{
            type = "LAN",
            device_network_id = device_network_id,
            label = label,
            profile = model,
            manufacturer = "legrand",
            model = model,
            parent_device_id = parent_device_id,
        }
    end)
    last_device_network_id = device_network_id
end

Thus try_create_device is only called after the last device adapter was built.
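
The pieces above can be exercised together outside a hub. The following is a condensed sketch, not the driver's actual code: the stub driver, simulate_init, and the single-letter device ids are hypothetical stand-ins, and the real try_create_device takes the full metadata table shown earlier.

```lua
-- Hypothetical stub driver: records creation requests instead of creating
-- devices on a hub.
local created = {}
local driver = {
    try_create_device = function(self, metadata)
        created[#created + 1] = metadata.device_network_id
    end,
}

-- Registry of built adapters, condensed from the snippets above.
local built = {
    adapter = {},
    _handler = {},
    add = function(self, adapter)
        local id = adapter.device.device_network_id
        self.adapter[id] = adapter
        local handler = self._handler[id]
        if handler then
            self._handler[id] = nil  -- one-shot: un-register, then call
            handler(adapter)
        end
    end,
    after = function(self, id, handler)
        if not id then return handler() end         -- nothing to wait for
        local adapter = self.adapter[id]
        if adapter then return handler(adapter) end -- already built
        self._handler[id] = handler                 -- call once built
    end,
}

local last_device_network_id
local function build(device_network_id)
    built:after(last_device_network_id, function()
        driver:try_create_device{device_network_id = device_network_id}
    end)
    last_device_network_id = device_network_id
end

-- Stand-in for the hub delivering lifecycle init for a created device.
local function simulate_init(device_network_id)
    built:add({device = {device_network_id = device_network_id}})
end

build("A"); build("B"); build("C")
print(#created)                    -- prints 1: only "A" requested so far
simulate_init("A")                 -- releases the request for "B"
simulate_init("B")                 -- releases the request for "C"
print(table.concat(created, ","))  -- prints A,B,C
```

Each creation request is held back until the previous device's lifecycle init has run, so at most one creation is outstanding at a time.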

TAustin · January 19, 2022, 4:08am

I have to say I bow down to your OO Lua skills. I thought I might be able to help by doing a quick scan of your driver on github but quickly realized your level of expertise and that it would take me some time to study your code to understand how you are even doing everything!

I’m glad you found a solution and I may have to look into doing something similar in my own problem driver. I intend to study your code in more depth to up my own game!

rossetyler · January 19, 2022, 6:09am

Thanks.

I do not know why I have to do this, only that, empirically, doing so seems to work reliably. There is really no harm done waiting for the creation (build-built) round trip to complete as, at best, the device is unusable until it does and, at worst, is not usable until a hub reboot.

I hope this can help you and/or others.
Maybe someone can explain why this works or suggest a better way.

rossetyler · January 19, 2022, 5:43pm

OO skills.

Although I have a lot of experience with OO, I am a Lua newbie.
I just took the concepts from https://www.lua.org/pil/contents.html#16 and implemented them in a classify module.
I use these classify methods similar to how I would write OO in other languages and I stop worrying about the implementation.
I have added comments to classify.lua to better explain what is going on under the hood.
For the most part, I try to keep the hood closed.

blueyetisoftware · February 25, 2022, 7:51pm

I am seeing this as well. I have to say, as a Lua noob, I am lost in the code above. In my case, discovery finds a bunch of devices on a connected bridge and issues a bunch of try_create_device calls. Those devices all receive added and doConfigure callbacks, but the init is not called.

Reinstalling the driver works fine, as all of the existing devices have their init called.

Are there any more insights on this?

blueyetisoftware · February 25, 2022, 7:57pm

One more note…

I added a workaround of waiting in between device creation and it does init them for me. So my discovery loop essentially finds a single device, then waits 10 seconds, then finds the next device.

nayelyz · March 1, 2022, 8:16pm

Hi, continuing the discussion of this post on this thread:

The team mentioned the following:

The init lifecycle should be executed after an added event if the device was previously unknown.
Also, an init event should be triggered for each known device on startup.

If this is not happening, there might be a function you’re using that could be yielding the device’s thread (for example, calling receive on a socket or channel) which is preempting the init event from being able to call the callback.

rossetyler · March 1, 2022, 9:53pm

Yes, I expected that init should always follow, but it does not always.

As for a function yielding the device’s thread: I don’t think so. I presume it is in the “device’s thread” where all such lifecycle events would be handled. The problem is, my code is not even given the opportunity to handle its first one. That is, init is the first lifecycle event that I am prepared to handle and it isn’t even called.

I have worked around the problem by only issuing a try_create_device in the discovery thread if any previous attempt has completed its lifecycle init; otherwise, the next try_create_device attempt will be pending and will be performed just before the lifecycle init of the previous device returns. This results in a try_create_device → (init, try_create_device) → (init, try_create_device) … daisy chain.

While my work-around appears to work, I do not know why and I should not have to do it.
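
For readers who find the adapter registry hard to follow, the daisy chain described above can also be pictured as a plain FIFO queue. This is a minimal sketch of the same idea using a different mechanism, not the code from the driver; request_create, on_init, the stub driver, and the sample ids are hypothetical names for illustration.

```lua
local pending = {}  -- metadata waiting to be created, oldest first
local in_flight     -- device_network_id still awaiting its lifecycle init

-- Issue the next creation request unless one is still outstanding.
local function create_next(driver)
    if in_flight or #pending == 0 then return end
    local metadata = table.remove(pending, 1)
    in_flight = metadata.device_network_id
    driver:try_create_device(metadata)
end

-- Call from discovery for each device that needs to be created.
local function request_create(driver, metadata)
    pending[#pending + 1] = metadata
    create_next(driver)
end

-- Call at the end of the lifecycle init handler.
local function on_init(driver, device)
    -- Ignore init events for devices that were not the outstanding request
    -- (for example, known devices being re-initialized at startup).
    if device.device_network_id == in_flight then
        in_flight = nil
        create_next(driver)
    end
end

-- Hypothetical stub driver that records requests, so the sketch runs
-- outside a hub.
local requested = {}
local stub_driver = {
    try_create_device = function(self, metadata)
        requested[#requested + 1] = metadata.device_network_id
    end,
}

request_create(stub_driver, {device_network_id = "hub:1"})
request_create(stub_driver, {device_network_id = "hub:2"})
print(#requested)  -- prints 1: "hub:2" waits for "hub:1" to be init'ed
on_init(stub_driver, {device_network_id = "hub:1"})
print(#requested)  -- prints 2
```

One caveat of this sketch: if a creation request is never honored, its init never arrives and the queue stalls, so a real driver would likely need a timeout or a retry on the next discovery scan.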

rossetyler · March 2, 2022, 11:34pm

This non-blocking “bug” driver exhibits the issue.
