Custom Capabilities in Edge Drivers

I have an Edge driver working all fine and dandy on one hub. It uses a custom capability.

local binarySensor = capabilities[ 'schoolnature13873.binarySensor' ]

It did put up a fight though, but not as much as the second hub I tried to install the driver on today. I had to concede that one. The problem is that the capability simply can’t be read and an error is thrown which creates a restart loop.

2023-07-27T14:48:27.366485238+00:00 WARN Anidea LV Devices  Unexpected filesystem lookup for capability schoolnature13873.binarySensor
2023-07-27T14:48:27.418427321+00:00 FATAL Anidea LV Devices  callback error
stack traceback:
        [C]: in field 'st_json_decode'
        [string "json"]:17: in field 'decode'
        [string "st/capabilities/init.lua"]:56: in field 'build_cap_from_json_string'
        [string "st/capabilities/init.lua"]:142: in function <[string "st/capabilities/init.lua"]:134>
        [C]: in function 'copcall'
        [string "st/capabilities/init.lua"]:94: in metamethod '__index'
        [string "init.lua"]:179: in main chunk
caused by: EOF while parsing a value at line 1 column 0

For the first hub I managed to find a combination of waiting, reinstalling and rebooting the hub that made it work. Today I failed miserably, and as this was an update to a key driver I had to concede defeat.

Is there some trick to making this work as it should?

I am going to follow up my own posting.

The WARN message above comes because the capability can’t be retrieved from the API and the fallback is to read it from a file in the .../st/capabilities/generated/ directory on the hub filesystem. The FATAL comes because the library isn’t expecting that to fail. Clutching at straws, I thought that perhaps something primes the generated folder with capabilities, and the only thing I could think of was the local profiles. My driver is for local VIRTUAL devices, so it only seems to need a profile to prevent errors in packaging. Anyway, I added the capability to the placeholder profile I had created and all is now working.

Unfortunately I forgot to check a fix was still needed so I don’t actually know if the above is basically cobblers and I just got lucky.

Hi, @orangebucket
When you ran the command to create a virtual device using your custom driver, did the device profile you chose include the custom capability you mentioned?

The team mentioned that to use a custom driver for a local virtual device, the driver must support all the capabilities included in the selected profile, so I think it’s also required for the profile to have the capabilities that get called.

The issue I had occurred when the Edge driver was installed on a hub (more correctly, it was an update to the driver, which was already being used by about forty devices). It was unable to start up (thus breaking the devices) because of the fatal error in loading the custom capability, which is used in the command handlers. New virtual devices aren’t even being created at this point in the proceedings. For all I know this could be a problem regardless of the device type.

The fatal error occurred because the custom capability could not be loaded from the API and the fallback code expected to be able to read it from the local filesystem instead and couldn’t, presumably because it wasn’t there.

What I am really trying to establish is how to avoid the fatal error in future. Why couldn’t the capability be downloaded?

When the capability can’t be downloaded the only way to avoid a fatal error seems to be to successfully read and process a ‘generated’ copy from the hub filesystem. When and why does that get generated (if it does), and how does it get there if the capability can’t be downloaded?

The other alternative is that if a custom capability can’t be downloaded from the API there is no fallback and a fatal error is to be expected (a more graceful failure is hopefully on the to-do list). If that is the case then what can I do to change the situation?

As things stand I have a driver that I daren’t develop further with other custom capabilities as I could knock out my existing devices for an unknown period of time.

I have avoided talk of profiles here. For virtual devices the profile is completely separate from the Edge driver and there would be no reason to even have a profile defined in the profiles/ folder if the driver could be packaged without one. Unless the profile has a secondary function.

Hi @orangebucket, the team has recently been working on virtual devices, which could affect this. Can you reproduce the problem? I ran some tests but couldn’t reproduce the issue.

I’ll have to do some testing to see if I can repeat the problem.

Normally when working with Edge drivers you are specifying all the profiles that the driver uses in the profiles/ directory and they get selected by the internal name in the fingerprints for ZIGBEE etc, or when you create a device in LAN drivers or request to change the profile. When you upload driver packages these implicit profiles get published as proper profiles that you can see in the /deviceprofiles endpoint in the API and in the device objects.

With custom virtual devices you have to create the device profiles directly and specify them by ID. You are never using the internal profiles in Edge drivers at all.

My suspicion is that the custom capabilities need to be referenced in an internal profile in order to generate a copy of the custom capability on the hub filesystem and prevent fatal errors if the custom capability can’t be read from the cloud. That’ll be my test approach anyway.

I have done some testing using a hub running 49.7 and I have been able to repeat my previous findings with one difference. That difference is that instead of the driver restarting every few seconds in an infinite loop (or at least one that goes on for a very long time), the driver restarts a few times and then backs off for a much longer period before trying again.

The loop is caused by the driver attempting and failing to load a custom capability from the API, which results in a warning, and then falling back to loading it from the hub filesystem but not allowing for the possibility that it isn’t there. That is the fatal error.

If I create a profile in the Edge driver that references the custom capability the error doesn’t arise.

Why wouldn’t I be doing that anyway? Simply because the device profiles for VIRTUAL devices are created independently of the Edge drivers, so my natural responses are:

  1. Not to use the implicit profiles at all. That can’t be done because you have to have one to upload a driver package (or at least you do if you use the CLI). So I move on to:
  2. To create a minimal profile that I never need to touch again. That turns out to be where I went wrong. I had to have the custom capability in there.
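
For anyone wanting to try the same workaround, a placeholder profile in the driver's profiles/ folder might look roughly like the sketch below. The capability ID is the one from my first post. The profile name, the single main component and version 1 are assumptions for illustration, not a copy of my actual file.

```yaml
# profiles/placeholder.yml (hypothetical file and profile name)
name: placeholder
components:
  - id: main
    capabilities:
      # Listing the custom capability here is what appeared to stop the fatal error.
      # Version 1 is assumed; use whatever version your capability was published with.
      - id: schoolnature13873.binarySensor
        version: 1
```

I can't say for certain why this helps. It only fits the observation that the error goes away once an internal profile references the capability.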

Just to add to the fun, once you’ve had the custom capability working once it doesn’t break again.

Please, can you help us with the following information so we can investigate?

  1. DriverId
  2. Driver logs
  3. Hub logs with the time/date when the problem happened

With that information, I can create a report :saluting_face: