Try_create_device: though device may be "added", not always "init"ed

I am creating a Legrand RFLC edge driver for my SmartThings hub that will mDNS-discover neighboring Legrand LC7001 hubs on its LAN. For each LC7001 discovered, my LC7001 sub_driver handles its lifecycle init. Subsequent device discoveries will find each RF Switch and Dimmer that the LC7001 controls and request a device be created (try_create_device) for each of them. These are handled by my Switch and Dimmer sub_drivers.

Not all of these device creation requests are honored. That is, not all result in a device lifecycle “added” event. Those that do show up in the SmartThings app. Unfortunately, even among those that are “added”, not all are "init"ed. My lifecycle init handler is where I call device:emit_event to report the current device state. Since that never happens for these devices, they show “Checking…” in the app and, since my driver has set nothing up for them, cannot be controlled.

The current documentation suggests that “added” is called only when first added but that “init” is called afterwards and always during device initialization. So, I am expecting to always handle a lifecycle init event but it does not always come. Why?

Also, in the SmartThings IDE, these don’t show up as “local” devices (Execution location is “cloud”). Why? Making these execute locally on the SmartThings hub is the whole point.

I can’t answer any of the other questions, but the IDE will be going away when the Groovy cloud goes away, and it has not been updated to work with edge drivers. Edge drivers just show up there as “placeholders”, and the way they run locally is not the same as in the previous architecture. So not being marked local in the IDE doesn’t mean anything as far as edge drivers go.


Are you sure each device you’re trying to create is getting a unique device ID?

Init is also called whenever the driver restarts, including when the hub is rebooted or the driver is updated, so you have other options for querying the device for an initial state. It’s also helpful to include the refresh capability since you can trigger it with a swipe down even when the other device controls are locked.

Yes. I identify each device with a unique device_network_id. For the LC7001 hubs, I use their MAC address. The lights supported by a hub are identified by a unique ordinal number. For these, I concatenate the hub MAC address with the ordinal number.
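As a minimal sketch of the naming scheme just described: the helper names, the MAC normalization, and the ":" separator below are illustrative assumptions, not taken from the actual driver.

```lua
-- Hypothetical helpers; the real driver may format its ids differently.
local function hub_network_id(mac)
    -- Strip separators and upper-case so the same hub always yields the same id.
    return (mac:gsub("[:%-]", ""):upper())
end

local function light_network_id(mac, ordinal)
    -- Hub MAC plus the light's ordinal number; the ":" separator is an assumption.
    return string.format("%s:%d", hub_network_id(mac), ordinal)
end

print(hub_network_id("0a:1b:2c:3d:4e:5f"))
print(light_network_id("0a:1b:2c:3d:4e:5f", 7))
```

Normalizing the MAC first keeps the id stable even if discovery reports the address in a different case or with different separators.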

The device “added” has an id (UUID) that, I assume, is unique (it was just “added”, right?).

My “init” lifecycle handler seems to be the appropriate place to associate my value with the SmartThings device. Unfortunately/inappropriately, it is not being called reliably.

This is still a problem.

Rebooting the hub does seem to “init” all of the previously “added” devices, including those that were not "init"ed immediately after being “added” (which is the problem). A reboot should not be required.

Take a hard look at your channel handlers or timer routines to make sure they aren’t blocking for any length of time. Sounds like the driver thread may be starved.

Don’t loop on socket receives; process one message and exit and let the cosock select call your channel handler again.
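A rough illustration of the "one message per call" idea, using a stand-in socket so it runs outside the hub. Everything here (fake_socket, on_readable, the repeat loop playing the scheduler) is hypothetical and only mimics the shape of a channel handler; it does not use cosock itself.

```lua
-- Stand-in for a non-blocking socket: receive() returns one queued message or nil.
local function fake_socket(messages)
    local next_index = 1
    return {
        receive = function(self)
            local message = messages[next_index]
            if message then next_index = next_index + 1 end
            return message
        end,
    }
end

local handled = {}

-- Handler: consume exactly one message, then return so other work can run.
local function on_readable(sock)
    local message = sock:receive()
    if message then handled[#handled + 1] = message end
    return message ~= nil
end

-- Stand-in for the scheduler: it calls the handler again while data remains,
-- and other tasks get a turn between calls.
local sock = fake_socket({ "a", "b", "c" })
local other_turns = 0
repeat
    local got_message = on_readable(sock)
    other_turns = other_turns + 1
until not got_message
```

The point of the shape is that the handler never owns the loop; whoever schedules it decides when it runs again.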

Had exact same problem as this with a UPnP driver that was discovering and creating multiple devices in quick succession. Monitoring busy multicast addresses can do this.

I am not blocking anywhere in my code (who knows what happens in try_create_device). All of my potentially blocking calls (cosock receives) are guarded with cosock.socket.select to ensure that they are ready first.

There are two problems here:

  1. Not all try_create_device attempts succeed (actually create a device)
  2. For those that do succeed (are lifecycle “added”), not all are lifecycle "init"ed.

I was, certainly, blasting out many try_create_device attempts at once (over 30) and experienced this bad behavior.

Now, before attempting another try_create_device, I ensure that the last one succeeded (was lifecycle "init"ed). The round-trip time slows down creation attempts substantially but, so far, I am able to discover all of my devices and create them successfully in one scan.

I feel your pain. I’ve had both these exact issues at one point or another during rapid discovery of multiple devices. If you’ve got your stuff on github I’d be happy to try and find anything obvious. And yes, I think at one point I tried putting a 1-second sleep between each device creation to try and alleviate the problem.

Thanks.

I tried cosock.socket.sleep in all sorts of places with all sorts of values.
I really hate that kind of kludgy solution.
What I have now works better than anything else that I have tried.

Solution is simpler now.
There does not seem to be a need to run a separate build thread during discovery.
try_create_device is either run

  • immediately upon discovery if last one has already completed being built
  • immediately after last one does complete being built (at the end of its lifecycle init)
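One way to picture the two cases above is a plain FIFO queue with a single request in flight. This is only a sketch under assumptions: the stub driver, pending, in_flight, request_create and on_init are hypothetical names, and it is not the approach used in the actual driver.

```lua
-- Stub standing in for the Edge driver; it only records creation requests.
local driver = { requested = {} }
function driver:try_create_device(metadata)
    self.requested[#self.requested + 1] = metadata.device_network_id
end

local pending = {}       -- creation requests waiting for their turn
local in_flight = nil    -- device_network_id still awaiting its lifecycle init

local function start_next()
    if in_flight or #pending == 0 then return end
    local metadata = table.remove(pending, 1)
    in_flight = metadata.device_network_id
    driver:try_create_device(metadata)
end

-- Called on discovery: runs immediately if nothing is in flight.
local function request_create(metadata)
    pending[#pending + 1] = metadata
    start_next()
end

-- Called at the end of the lifecycle init handler for a device.
local function on_init(device_network_id)
    if device_network_id == in_flight then
        in_flight = nil
        start_next()
    end
end

request_create({ device_network_id = "hub:1" })
request_create({ device_network_id = "hub:2" })
request_create({ device_network_id = "hub:3" })
on_init("hub:1")  -- releases the request for hub:2
```

A queue like this stalls if a request never leads to an init, so a real driver would probably also want some form of timeout or retry.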

How are you determining the status of the prior discovered device init completion?

For example, my DIMMER sub_driver, on a lifecycle init, creates a Dimmer Adapter for the device and adds it to those already built.

lifecycle_handlers = {init = function(driver, device) built:add(Dimmer(driver, device)) end}

built:add remembers this adapter and emits an indication that this event has occurred

add = function(self, adapter)
    local device_network_id = adapter.device.device_network_id
    self.adapter[device_network_id] = adapter
    self:_emit(device_network_id, adapter)
end,

built:_emit calls any handler registered for this event once (automatically un-registers it)

_emit = function(self, device_network_id, adapter)
    local handler = self._handler[device_network_id]
    if handler then
        handler(adapter)
        self._handler[device_network_id] = nil
    end
end,

A handler for this event might have been registered once in a built:after call …

_once = function(self, device_network_id, handler)
    self._handler[device_network_id] = handler
end,

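after = function(self, device_network_id, handler)
    if not device_network_id then
        -- nothing to wait for (first build): run right away
        handler()
    else
        local adapter = self.adapter[device_network_id]
        if adapter then
            -- already built: run right away
            handler(adapter)
            return adapter
        end
        -- not built yet: run once, when built:add emits for this id
        self:_once(device_network_id, handler)
    end
end,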

… which ensures that the handler is either called immediately or once after this event happens.
built:after is called when the need for a build is discovered.

local last_device_network_id
local function build(device_network_id, model, label, parent_device_id)
    built:after(last_device_network_id, function()
        driver:try_create_device{
            type = "LAN",
            device_network_id = device_network_id,
            label = label,
            profile = model,
            manufacturer = "legrand",
            model = model,
            parent_device_id = parent_device_id,
        }
    end)
    last_device_network_id = device_network_id
end

Thus try_create_device is only called after the last device adapter was built.
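To see the ordering in isolation, the fragments above can be assembled into a small harness that runs outside the hub. The table layout of built, the stub driver, and simulate_init (standing in for the lifecycle init handler) are assumptions made for this sketch; only the method bodies follow the snippets posted above.

```lua
local built = {
    adapter = {},   -- adapters by device_network_id (assumed layout)
    _handler = {},  -- one-shot handlers by device_network_id (assumed layout)
    add = function(self, adapter)
        local id = adapter.device.device_network_id
        self.adapter[id] = adapter
        self:_emit(id, adapter)
    end,
    _emit = function(self, id, adapter)
        local handler = self._handler[id]
        if handler then
            handler(adapter)
            self._handler[id] = nil
        end
    end,
    _once = function(self, id, handler)
        self._handler[id] = handler
    end,
    after = function(self, id, handler)
        if not id then
            handler()
        else
            local adapter = self.adapter[id]
            if adapter then
                handler(adapter)
                return adapter
            end
            self:_once(id, handler)
        end
    end,
}

-- Stub driver: records the order of creation requests instead of creating devices.
local driver = { requested = {} }
function driver:try_create_device(metadata)
    self.requested[#self.requested + 1] = metadata.device_network_id
end

local last_device_network_id
local function build(device_network_id)
    built:after(last_device_network_id, function()
        driver:try_create_device{ device_network_id = device_network_id }
    end)
    last_device_network_id = device_network_id
end

-- Stand-in for a lifecycle init: registers an adapter for the device.
local function simulate_init(device_network_id)
    built:add({ device = { device_network_id = device_network_id } })
end

build("hub:1")          -- nothing to wait for: requested immediately
build("hub:2")          -- waits for hub:1 to be built
build("hub:3")          -- waits for hub:2 to be built
simulate_init("hub:1")  -- releases the request for hub:2
```

Each request is chained to the device discovered just before it, so at most one creation is outstanding at any time.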

I have to say I bow down to your OO Lua skills. I thought I might be able to help by doing a quick scan of your driver on github but quickly realized your level of expertise and that it would take me some time to study your code to understand how you are even doing everything!

I’m glad you found a solution and I may have to look into doing something similar in my own problem driver. I intend to study your code in more depth to up my own game!

Thanks.

I do not know why I have to do this, only that, empirically, doing so seems to work reliably. There is really no harm in waiting for the creation (build-built) round trip to complete since, at best, the device is unusable until it does and, at worst, it is unusable until a hub reboot.

I hope this can help you and/or others.
Maybe someone can explain why this works or suggest a better way.

OO skills.

Although I have a lot of experience with OO, I am a Lua newbie.
I just took the concepts from Programming in Lua (first edition) and implemented them in a classify module.
I use these classify methods similar to how I would write OO in other languages and I stop worrying about the implementation.
I have added comments to classify.lua to better explain what is going on under the hood.
For the most part, I try to keep the hood closed.
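For readers who have not seen the metatable approach from Programming in Lua, a class helper can be sketched roughly as below. This is a hypothetical illustration of the general technique, not the classify module itself; the class function and the Adapter/Dimmer examples are made up for this sketch.

```lua
-- Hypothetical helper: builds a class table, optionally inheriting from a parent.
local function class(parent)
    local c = {}
    c.__index = c  -- instances look up missing fields in the class
    if parent then
        setmetatable(c, { __index = parent })  -- the class falls back to its parent
    end
    c.new = function(...)
        local instance = setmetatable({}, c)
        if instance.init then instance:init(...) end
        return instance
    end
    return c
end

local Adapter = class()
function Adapter:init(name) self.name = name end
function Adapter:describe() return "adapter " .. self.name end

local Dimmer = class(Adapter)
function Dimmer:describe() return "dimmer " .. self.name end  -- overrides Adapter

local d = Dimmer.new("kitchen")
print(d:describe())  -- prints "dimmer kitchen"
```

Dimmer inherits init from Adapter through the metatable chain while supplying its own describe.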
