Events not processing or extensively delayed (multiple rules engines)

cozdabuch · September 25, 2017, 6:58am

I’ve heard from numerous people (Facebook, WC forum, personal messages, and here on the ST community) using Stringify, webCoRE, and Smart Lighting who have had problems in the past 24-36 hours: automations not running at all, running only partially, or running after extended delays, and in some cases hubs going fully offline.

You haven’t pushed new firmware over the weekend, right? Kinda sounds like there’s an ST cloud issue going on…
@slagle @vlad @Jim

aruffell · September 25, 2017, 2:18pm

@cozdabuch - In my case it appears that everything that depends on the cloud is suffering much longer delays, whether it involves webCoRE or not. I was chalking it up to networking issues, as other devices on my network (2 Raspberry Pis) were super slow too (anything I do over SSH takes ages), but if you are having similar issues then it looks like I am seeing the same thing (and the issues with the RPis are unrelated… I do run beta sw on them!).

Automated_House · September 25, 2017, 3:40pm

I’ve noticed cloud-based rules processing slowly lately, too.

JDRoberts · September 25, 2017, 4:49pm

This person reported a delay of several hours:

Slow or non-responsive things General Discussion

For a while, my SmartThings setup has been working great - it’s not a big system, perhaps a little over a dozen things under control, with some automation. Most devices are the GE zwave switches and dimmers. Recently, it’s all started to fall apart. Devices are slow to respond - tap them in the app and it may take seconds or minutes for the devices to respond. A few days back, I stopped waiting and tapped the wall switch and went to bed. The next day, I checked the notifications tab, and the switch off event was recorded three hours later. I’ll go to work on the system, and the switches will respond OK. At the end of the day, I’ll tap the “good night” automation and the lights will remain on, sometimes responding within 1-2 minutes, some not at all. I don’t know what tools I have ava…

JDRoberts · September 25, 2017, 4:54pm

If you have this issue, definitely report it to support.

https://support.smartthings.com/hc/en-us

Glen_King · September 28, 2017, 1:39pm

I was wondering about things over the past couple days.
I notice ST went down sometime yesterday, and then came back up.

I’ll have to do a little inspection when I get home from work later.

Automated_House · September 28, 2017, 2:38pm

Support suggested I reboot the hub, so I did. Too early to tell if it will help.

vlad · September 29, 2017, 12:07am

There haven’t been any firmware updates lately. As far as known issues go, the only one I am aware of is a problem with the Arlo integration, where there has been a significant chance of their servers timing out. That drastically increases routine run time and may cause the sandbox to be killed for exceeding the max allowed execution time, which can result in partially completed routine/app executions. We have updated the Arlo integration to use async processing to mitigate the effect this has on other devices, and this is currently going through verification. Apart from that, we are not aware of any performance issues like the ones you describe. Yesterday’s incident was a very limited issue - functionally the only thing that was affected was saving and publishing custom smart apps. I’d suggest going through support with as much information as possible.

I’m not support, but if someone has a specific instance of recent “slowness” with the platform, feel free to shoot me a DM and I can take a look at logs (no promises on being able to help everyone who DMs, but I will try to get to as many as I can). It would be really helpful if the following could be provided:

1. Username
2. Description of issue & expected results
3. Name of app/devices involved
4. Times, specifically when the app was supposed to fire and when it actually did, or how long it normally takes to execute and how long it took in the slow instance
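
For the timing details in the last item, a small script can turn "when it should have fired" and "when it actually fired" into concrete delay figures before you send them along. This is only a rough sketch: the event names, timestamps, and the `delay_seconds` and `summarize` helpers are all made up for illustration and are not part of any SmartThings tool.

```python
from datetime import datetime

# Timestamp layout assumed for this sketch; adjust to however you note times.
FMT = "%Y-%m-%d %H:%M:%S"


def delay_seconds(expected: str, actual: str) -> float:
    """Seconds between when something should have fired and when it did."""
    return (datetime.strptime(actual, FMT) - datetime.strptime(expected, FMT)).total_seconds()


def summarize(events):
    """Return (name, delay in seconds) for each (name, expected, actual) entry."""
    return [(name, delay_seconds(expected, actual)) for name, expected, actual in events]


# Hypothetical sample entries, not real log data.
events = [
    ("Good Night routine", "2017-09-28 22:00:00", "2017-09-28 22:01:45"),
    ("Hall switch off", "2017-09-28 22:05:00", "2017-09-29 01:05:00"),
]

report = summarize(events)
for name, delay in report:
    print(f"{name}: {delay:.0f} s late")
```

Including the raw expected/actual times alongside the computed delay makes it easier to match a report against platform logs.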

bjthomas09 · September 29, 2017, 6:54am

Same here, I’ve been having problems the last couple of days. I have a webCoRE piston that turns LIFX and Hue bulbs back to white from red when SHM is disarmed. The pistons have been extremely delayed and the lights are not shutting off reliably. 2 to 3 color bulbs will not shut off, and it is random as to which will not shut off. Sometimes LIFX, sometimes Hue bulbs, and sometimes both.

cozdabuch · September 29, 2017, 7:31am

@bscuderi13 input?

vlad · September 29, 2017, 8:09am

I do see some timeouts in your webCoRE piston, but I do not know enough about webCoRE to help troubleshoot. It looks like, between locks being held by the piston and outbound requests to the LIFX API, the piston is being killed mid-execution because a method invocation is taking more than 20 seconds. (The hard part is figuring out exactly where the slowness is coming from: it could be the platform, database queries, outbound HTTP requests such as LIFX, webCoRE itself, etc…) Maybe if @ady624 is up for it we can try to figure out what is going on (and see if this is more of a widespread issue). If you are up for it, and feel comfortable sharing information on this piston with @ady624 and me, could you start a group message with the two of us?
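
One generic way to narrow down where time goes inside a single invocation is to time each step separately and compare the total against the limit. The sketch below is plain Python, not SmartThings or webCoRE code: `StepTimer`, the step labels, and the fake clock values are all hypothetical, and only the 20-second figure comes from the post above.

```python
import time
from contextlib import contextmanager

LIMIT_SECONDS = 20.0  # the per-invocation limit mentioned in the post


class StepTimer:
    """Records how long each labelled step of one invocation takes."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.steps = []  # list of (label, seconds)

    @contextmanager
    def step(self, label):
        start = self.clock()
        try:
            yield
        finally:
            self.steps.append((label, self.clock() - start))

    def total(self):
        return sum(seconds for _, seconds in self.steps)

    def slowest(self):
        return max(self.steps, key=lambda s: s[1])

    def over_limit(self, limit=LIMIT_SECONDS):
        return self.total() > limit


# Demo with a fake clock so the numbers are deterministic (hypothetical values).
ticks = iter([0.0, 3.0, 3.0, 21.5])
timer = StepTimer(clock=lambda: next(ticks))

with timer.step("acquire lock"):
    pass  # stand-in for waiting on a lock
with timer.step("outbound HTTP request"):
    pass  # stand-in for a slow external API call

print(timer.slowest())     # ('outbound HTTP request', 18.5)
print(timer.over_limit())  # True
```

With a real clock, the same pattern shows whether the lock wait, the outbound request, or something else accounts for most of the time.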

ady624 · September 29, 2017, 10:10am

I am game

vlad · September 29, 2017, 5:58pm

FYI, we noticed a few database nodes that are exhibiting higher-than-normal latency at certain times throughout the day. This could potentially affect a routine execution, especially if the child devices are accessing device data. We are going to run some maintenance on these nodes to see if we can get them back to a healthier state. This specific latency issue is occurring only in NA02. The latency spikes are periodic, so we will see if the spikes continue after maintenance and provide an update. I haven’t received any DMs so far, so I’m just looking at the overall health of various clusters atm.
