HealthCheck in Edge Driver

I’m writing an Edge driver for Tuya Zigbee Devices.

Everything works well with the new Edge driver; the only problem I have is with Health Check.

The device goes OFFLINE because battery-powered Tuya Zigbee devices neither report health signals nor respond to ping requests.
With the Groovy DTH, I usually do this:

sendEvent(name: "DeviceWatch-Enroll", value: JsonOutput.toJson([protocol: "zigbee", scheme:"untracked"]), displayed: false)

So I would like to do something like this:

device:emit_event(capabilities.healthCheck.DeviceWatch-Enroll({protocol="zigbee", scheme="UNTRACKED"}))

but “DeviceWatch-Enroll” has a dash (-) in its name, so it gives a syntax error if I type it as above.
I tried deviceWatchEnroll and DeviceWatchEnroll, with no luck.

Is there any solution for this?
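On the Lua side of the question: a dotted field name cannot contain a dash, because Lua reads `a.DeviceWatch-Enroll(...)` as a subtraction. Any string key can be reached with bracket indexing instead. The sketch below uses a stand-in table (hypothetical, not the real `capabilities.healthCheck` object), and it is not confirmed here that the Edge capability exposes such an attribute at all.

```lua
-- Stand-in table for illustration only; NOT the real capabilities.healthCheck.
local health_check = {
  ["DeviceWatch-Enroll"] = function(value)
    return { name = "DeviceWatch-Enroll", value = value }
  end,
}

-- health_check.DeviceWatch-Enroll({...}) would be read by Lua as
-- (health_check.DeviceWatch) - Enroll({...}), which fails.
-- Bracket indexing accepts any string key, dashes included:
local event = health_check["DeviceWatch-Enroll"]({ protocol = "zigbee", scheme = "UNTRACKED" })

print(event.name, event.value.scheme)
```

This only addresses the naming problem; whether enrolling as untracked is possible from a driver is a separate question.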

@iquix

Is there any solution for this?

Using the health check capability is no longer needed in Edge drivers like it was for DTHs. Device watch used to be a service we ran in the cloud, but for some time it has been local on the hub. For Zigbee and Z-Wave, the hub decides a device’s health (online/offline status) based on radio communication and the configuration of the device. Power source is one of the factors considered in tracking those devices’ health. For LAN devices, you can opt into device health using the online() and offline() device APIs, and simply control this from within the driver.
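As a rough illustration of the LAN case described above, a driver could call `device:online()` or `device:offline()` after each reachability check. The sketch below runs in plain Lua by using a stub device; the stub, the `update_health` helper, and the `reachable` flag are all hypothetical names, and in a real driver the device object comes from the framework.

```lua
-- Stub standing in for an Edge LAN device so the sketch runs outside the hub.
local function new_stub_device()
  local dev = { status = "unknown" }
  function dev:online() self.status = "online" end
  function dev:offline() self.status = "offline" end
  return dev
end

-- Hypothetical helper: `reachable` would be the result of the driver's own
-- poll or keep-alive check against the LAN device.
local function update_health(device, reachable)
  if reachable then
    device:online()
  else
    device:offline()
  end
end

local device = new_stub_device()
update_health(device, true)
local status_after_success = device.status
update_health(device, false)
local status_after_failure = device.status
print(status_after_success, status_after_failure)
```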

Then is there no solution for battery-powered Tuya Zigbee devices going OFFLINE?

Does the device stay marked offline after the device has communicated with the hub? This would be unexpected. However, I have noticed that right after joining, devices are sometimes offline until there is successful communication.

I added a wireless Zigbee curtain yesterday with the Edge driver that I made, and everything worked well yesterday.
I then left the curtain open for more than 24 hours (which means no packets were exchanged between the curtain and the Zigbee hub for more than 24 hours).

And I found out that the curtain is OFFLINE today.

I understand, just to be clear, the curtain is controllable now but is still offline? If it is not controllable currently, I think the device health has been set properly, but if it is controllable I will need to look into it more.

Oh, I just found out that another shade is also offline.


I installed 3 Tuya window shade devices with my Edge driver:
one is battery powered, and two are plugged in.
Everything was working fine with all of them.

I just found out that 2 of them (one battery powered, the other plugged in) are offline.

The shade that is still online (and working fine) is the one that I opened/closed today. The shades that went offline had been left open for about 1 day.
(It turns out that this has nothing to do with the power source. Sorry for confusing you with the power source.)


An offline device is not controllable with the app, nor with the CLI using smartthings devices:command.


I’m the original author of the Tuya window shade Groovy DTH, and the devices never went offline with the DTH for years.
Smartthings/tuya-window-shade.groovy at master · iquix/Smartthings (github.com)

There must be some problem with the health check system in the Edge Driver.

There very well could be issues with parts of the health check system as some components are being newly tested with the beta release. Based on the DTH, I think these devices would actually be added to device watch as being tracked when they are backed by a driver. So that would explain why they might sometimes get set offline with a driver, but not a DTH. We do have some ability to configure how/which devices are tracked. Many other zigbee devices (including battery powered devices) on our platform are tracked because the device is configured to report an attribute every so often. Do you know if these tuya devices send periodic attribute reports to the hub?

Tuya devices use their own manufacturer-specific cluster (0xEF00) for all messages.

  • The battery-powered curtain device never sends a message unless I send a command to the curtain or generate a packet by manually pulling the curtain. Almost no battery-powered Tuya Zigbee device sends heartbeat messages.
    (However, it never went offline with the DTH before.)

  • The plugged-in device sends heartbeat messages with the ZCL Basic cluster every 2 minutes; however, I didn’t assign a handler for the Basic cluster. Can this be the problem?

Tuya Window Shade  <ZigbeeDevice: 0ecde1f9-44e7-43fe-b2b5-b94668c16f8b [0x5D5A] (블라인드)> received Zigbee message: < ZigbeeMessageRx || type: 0x00, < AddressHeader || src_addr: 0x5D5A, src_endpoint: 0x01, dest_addr: 0x0000, dest_endpoint: 0xFF, profile: 0x0104, cluster: Basic >, lqi: 0xFE, rssi: -50, body_length: 0x0007, < ZCLMessageBody || < ZCLHeader || frame_ctrl: 0x08, seqno: 0x00, ZCLCommandId: 0x0A >, < ReportAttribute || < AttributeRecord || AttributeId: 0x0001, DataType: Uint8, ApplicationVersion: 0x49 > > > >
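
-- Excerpt from the driver template; the handler functions and capability
-- aliases (WindowShade, etc.) are defined elsewhere in the driver.
local CLUSTER_TUYA = 0xEF00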
local tuya_window_shade_driver = {
	NAME = "tuya-window-shade",
	supported_capabilities = {
		capabilities.windowShade,
		capabilities.windowShadePreset,
		capabilities.switchLevel
	},
	zigbee_handlers = {
		global = {},
		cluster = {
			[CLUSTER_TUYA] = {
				[0x01] = tuya_command_handler,
				[0x02] = tuya_command_handler
			}
		},
		attr = {}
	},
	capability_handlers = {
		[WindowShade.ID] = {
			[WindowShade.commands.open.NAME] = window_shade_open_handler,
			[WindowShade.commands.close.NAME] = window_shade_close_handler,
			[WindowShade.commands.pause.NAME] = window_shade_pause_handler
		},
		[WindowShadePreset.ID] = {
			[WindowShadePreset.commands.presetPosition.NAME] = window_shade_preset_position_handler
		},
		[SwitchLevel.ID] = {
			[SwitchLevel.commands.setLevel.NAME] = window_shade_switch_level_set_level_handler
		}
	},
	lifecycle_handlers = {
		doConfigure = do_configure,
		added = device_added,
		infoChanged = device_info_changed
	}
}
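For reference, a handler for the Basic-cluster report shown in the log could be registered under `attr`, keyed first by cluster ID and then by attribute ID. This is only a sketch: the handler name, its body, and the `last_heartbeat` table are hypothetical; the argument order `(driver, device, value, zb_rx)` is an assumption about how the framework calls attribute handlers; and the last lines merely simulate a dispatch so the fragment runs in plain Lua. The IDs are the Basic cluster (0x0000) and the attribute 0x0001 seen in the log line.

```lua
local CLUSTER_BASIC = 0x0000            -- ZCL Basic cluster
local ATTR_APPLICATION_VERSION = 0x0001 -- attribute reported in the log above

-- Hypothetical bookkeeping: remember when each device last reported.
local last_heartbeat = {}

-- Hypothetical handler; the argument order is an assumption.
local function basic_heartbeat_handler(driver, device, value, zb_rx)
  last_heartbeat[device.id] = os.time()
end

-- Same shape as the zigbee_handlers table in the driver template above.
local zigbee_handlers = {
  global = {},
  cluster = {},
  attr = {
    [CLUSTER_BASIC] = {
      [ATTR_APPLICATION_VERSION] = basic_heartbeat_handler
    }
  }
}

-- Simulated dispatch with a fake device, for illustration only.
local fake_device = { id = "shade-1" }
zigbee_handlers.attr[CLUSTER_BASIC][ATTR_APPLICATION_VERSION](nil, fake_device, 0x49, nil)
print(last_heartbeat["shade-1"] ~= nil)
```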

The battery-powered curtain device never sends a message unless I send a command or manually pull the curtain. Almost no battery-powered Tuya Zigbee device sends heartbeat messages.

We suspected that there would have to be some tweaking of the device health system, and so I will discuss internally what we can do to ensure that battery powered devices like these are always untracked and thus kept marked online.

The plugged-in device sends heartbeat messages with the ZCL Basic cluster every 2 minutes

This should cause the device to stay online. Can you PM me your email or hubID so that I can look into this further? The device should be marked online as soon as we receive communication from the device.

It also surprises me that neither the battery shade nor the plug-in shade is controllable when the device is incorrectly marked offline (if the device really is online and we have just marked it offline incorrectly, we should still be able to communicate with it).

I didn’t assign a handler for the Basic cluster. Can this be the problem?

I don’t think this is the problem.

@iquix I look forward to an edge driver for the plug-in Tuya curtain motor. Do you think it will work with this model?
image

I have never seen packets of this model before.

All Tuya motors have the same open/close/pause commands, so those commands will be OK.

But for the “move to position” command, some devices regard 100% as open, while others regard 0% as open.
They also differ quite a lot in how they report their working state and position state.
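The position difference can be pictured with a small sketch. It assumes a per-device flag (here called `zero_is_open`, a hypothetical name) that says which convention the motor uses, and converts the raw value so that 100 always means fully open.

```lua
-- Convert a raw Tuya position into "percent open" (100 = fully open).
-- `zero_is_open` is a hypothetical per-device setting: true for motors
-- that treat 0% as open, false for motors that treat 100% as open.
local function to_open_percent(raw_percent, zero_is_open)
  if zero_is_open then
    return 100 - raw_percent
  end
  return raw_percent
end

print(to_open_percent(30, false))
print(to_open_percent(30, true))
```

The same function can be applied in the other direction when sending a “move to position” command, since the inversion is its own inverse.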

I can’t make device drivers/DTH of this model without looking at the packets.

Plus… it seems that it’s a SLEEPY_END_DEVICE.
This could be a problem for health checking in Edge drivers, as we’re discussing in this thread.


One more question is…

Does ‘insecure rejoin’ work in the edge driver devices?

In case that those curtain motors got really disconnected… and failed to rejoin.

(I always allowed insecure rejoin in the settings.)

Insecure rejoin works for devices that use a driver.

May I belatedly pick up on this? It is so rare to get any definitive information on what on earth we are supposed to do with device health, and we’ve been trying for years now. It has significantly changed at least three times in that period and it seems like we are using a fusion of different systems at times. We sometimes get a snippet of information but it is always accompanied by some extra uncertainty.

For example, in Check Your Device’s Health | SmartThings Developers we are confidently told this:

Opt-in for Health Check

Devices must opt-in to have their health status checked on the Platform. To enroll, the healthCheck capability must be set when the device is installed. Hubs do not need to be explicitly enrolled; when a hub is onboarded it is automatically enrolled.

I have always thought that was rather unnecessary as the healthCheck capability became pretty useless for users some time ago when healthStatus and the DeviceWatch-DeviceStatus stopped reflecting the connectivity status (except when they didn’t) and a host of webCoRE users no longer knew when their devices went offline. So since then it has seemed a bit unnecessary for us to even be aware that it exists.

My understanding is that the healthCheck capability is actually mandatory in the C2C Schema thingy and I believe the healthStatus attribute is used to indicate online and offline status. At least it was last time I read about it, which is a while ago now.

It has never been totally clear what Groovy DTHs actually required, and looking at how it is done in stock DTHs is like looking at a ‘collection of things that are different’ (© Bungle from Rainbow) and becomes a bit of a history lesson. There were two core elements that came to the fore: you could invoke a mystical incantation (in an earlier post) to disable tracking for things like buttons, virtual devices and cloud and LAN devices, instead using healthStatus and/or DeviceWatch-DeviceStatus if you did want to explicitly indicate online and offline status; and you could set the checkInterval to tell the system how long a period of inactivity should be tolerated based on your knowledge/guess of how the device behaved (though what actually constitutes ‘activity’ has never been revealed).

So if Edge drivers don’t need to invoke the healthCheck capability then great, but if you don’t need any equivalent for checkInterval or to be told not to track the devices, then I think it probably needs emphasising as it is a big change, even if we didn’t know how it worked to start with. Expect us to wonder how you are doing it too.
