Scheduler and Polling quits after some minutes, hours, or days

Hmmm… my Hue bridge shows no activity for several days, and my Hue bulbs never get their status updated if they are controlled by anything outside ST. This breaks the Big Switch I’ve been using.

@sidjohn1 suggests this might be another example of Sudden Scheduler Death Syndrome, as the Hue Connect smartapp runs a scheduled poll of the Hue Bridge which has apparently stopped at my house.

SSDS affects us all. Tragic.

Both of my schedules died last night around that same time. Must have been a rough patch over in the ST servers. Otherwise, it’s been pretty solid having my monitor job restarting them. In case anyone’s curious, here’s my version of the RainMachine that runs a monitor schedule for the polling: GitHub - brbeaird/SmartThings_RainMachine at schedulerFix

I’ll probably try to add some of copyninja’s latest MyQ smartapp modifications, which have it subscribing to things like sunrise/sunset and multisensors as events that can restart the refresh polling scheduler.
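For anyone wondering what that kind of watchdog looks like, here is a minimal sketch in plain Python rather than SmartApp Groovy. It only models the idea: any unrelated event (sunrise, sunset, a sensor report) checks when the last poll happened and restarts the polling schedule if it looks stale. The class name, the `restart` callback, and the "two missed polls" threshold are all assumptions for illustration, not anything taken from the RainMachine or MyQ code.

```python
import time


class PollingWatchdog:
    """Restart a polling schedule when an unrelated event finds it stale.

    Hypothetical model: `restart` stands in for whatever call re-creates
    the polling schedule on the real platform.
    """

    def __init__(self, interval_s, restart, clock=time.time, missed_polls_allowed=2):
        self.restart = restart
        self.clock = clock
        # Treat the schedule as dead once this many intervals pass with no poll.
        self.stale_after_s = interval_s * missed_polls_allowed
        self.last_poll = clock()

    def record_poll(self):
        """Call at the end of every successful poll."""
        self.last_poll = self.clock()

    def on_event(self, name):
        """Call from any event handler; returns True if polling was restarted."""
        if self.clock() - self.last_poll > self.stale_after_s:
            self.restart()
            self.last_poll = self.clock()
            return True
        return False
```

The more event sources feed `on_event`, the shorter the worst-case gap before a dead schedule gets noticed, which is presumably why subscribing to frequently reporting sensors helps.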

Is this an industry wide affliction?

Calendar service providers never seem to fail to issue a scheduled event alert on time, every time.

No. My security system, my medical alert system, IFTTT, Amazon Echo, several Philips Hue control apps, iBeacon+, my thermostat, my DVR, various calendar and reminder apps, and my pool monitoring system all have regular scheduled events, and while an individual event may occasionally, although very rarely, be missed, it has never killed the schedule altogether.

Maybe I should build and sell an external scheduling service…

Just be sure it scales :smile:

Chronos?

Pretty much. Worst of all, it’s a “fire and forget” style. Once stopped, it’s unlikely to continue on to the next scheduled time unless reinitiated. ST needs to change it to “better late than never.”

Is it just me, or has anyone else observed issues with scheduled apps and Hello Home actions lately? It seems to have been happening quite a bit over the last couple of weeks.

Scheduler, yep. Discussing it here:

You beat me to pasting the same link!

Perhaps this new Topic is redundant.

But while I’m here, my guess as to the Top SmartThings ongoing issues:

  1. Scheduler (including runIn and dashboard solution SmartApps, etc.); jobs aren’t just delayed, they go to the abyss; jobs can’t reschedule themselves if they don’t run.
  2. Presence (mobile more unpredictable than keyfob, but fob still insufficiently reliable).
  3. Periodic long event latency (e.g. delay between motion sensor trigger and switch on).
  4. Incomplete Hello Home Action mode changes.
  5. Mobile App slowness, crashes, and usability.
  6. Slow updates and other failures from Philips Hue.
  7. Periodic IDE problems.

Add Sonos issues too, Terry. Except for #2, I’m facing all of those. The worst are Hue and #5.

If only ST actually cared about this… My Sunset apps failed last night. They’ve been solid for some time. All you can figure is that ST is Meh. Fixing it is not a priority, it would seem.

If I had to bet, I’d guess that they’re banking on “Hub v2” offloading enough of the work from the cloud that it gets back to some sort of stable state rather than pulling people off of that to work on improving the scheduler.

In general using chained runIns is very brittle. If you can create a schedule with the cron syntax, it uses a slightly different mechanism under the covers which is more robust since a single miss won’t break the chain.
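To make the difference concrete, here is a toy Python model of the two designs as described in this post. It is a sketch under stated assumptions, not the platform's code: the function names are made up, time is counted in abstract ticks, and `dropped` is simply the set of run times the scheduler fails to deliver.

```python
def chained_runs(total, interval, dropped):
    """Each run schedules the next one, like a chain of one-shot timers.

    If a run is dropped, its handler never executes, so nothing schedules
    the following run and the chain ends there.
    """
    fired = []
    next_due = 0
    while next_due < total:
        if next_due in dropped:
            break  # handler never ran, so the next run was never scheduled
        fired.append(next_due)
        next_due += interval
    return fired


def recurring_runs(total, interval, dropped):
    """Run times come from the recurrence rule, not from the previous run."""
    return [t for t in range(0, total, interval) if t not in dropped]
```

With `total=30`, `interval=5`, and a single miss at tick 10, the chained version fires at 0 and 5 and then stops for good, while the recurring version loses only tick 10 and resumes at 15. This only models the intended design; it says nothing about whether recurring schedules can also die in practice.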

First, they need to sell enough V2 hubs to make any difference. I bet lots of people are holding off their ST hub purchase until V2, but it may take a while for existing users to transition. Many existing customers would be hesitant to get on the V2 wagon with no obvious migration strategy.

This is not what we’ve observed. Apparently something occasionally kills runs that are scheduled via cron, and it kills all future runs as well.

Out of these… #1 and #4 are the most prevalent for me. I have seen issues on the other things on and off, but not something that is as prominent as #1 and #4. I wonder if ST is going through some back end changes to get ready for Hub V2. If that’s the case, then it would help if all of us are kept informed.

Interesting topic. I’ve seen missing scheduled events too.

It raises the question: if the cloud is going to miss (or has missed) a scheduled event, what should the platform do?

Let’s take the types of scheduling that ST can do:

- runIn
- runOnce
- schedule with cron-like expressions

and these built-in ones:
runEvery5Minutes(handlerMethod)
runEvery10Minutes(handlerMethod)
runEvery15Minutes(handlerMethod)
runEvery30Minutes(handlerMethod)
runEvery1Hour(handlerMethod)
runEvery3Hours(handlerMethod)

None of these methods accumulate, meaning a missed event will not queue up and fire later when noticed.

Also, there is about a 20-second window in which a scheduled event will try to run; after that, it is purposely dropped.

The issue here is that in all instances, there is no “failed” notification.

We know it didn’t fire, but we don’t know what to do if that occurs.

Essentially all scheduling just dies and doesn’t restart.

App initialization and update functions don’t fire in this condition either, so there is no way to jumpstart the schedule… It really boils down to the user having to go into the app and force it.

The challenge is to figure out how to recover from a missed schedule.

The best way is to always use cron-like scheduling, as mentioned above, because it seems to recover better than any of the other methods described.

All other schedule commands, including runIn, log a DateTime. The system just checks whether the current time is within a 20-second window after that DateTime, runs the function if so, and then removes the job regardless of whether the event handler fired successfully.
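A rough Python model of that one-shot behavior, as I understand it, might look like the sketch below. The 20-second figure is the approximate window mentioned above; the function name and the job representation are hypothetical, and this is not how the platform is actually implemented.

```python
from datetime import datetime, timedelta

# Approximate window described above, not a documented constant.
WINDOW = timedelta(seconds=20)


def check_one_shot_jobs(pending, now, window=WINDOW):
    """Process one-shot jobs given as (due_time, name) pairs.

    A due job inside the window fires; a due job past the window is
    silently dropped. Either way it leaves the pending list, which is
    why a single missed check can end a chain of one-shot timers.
    """
    remaining, fired, dropped = [], [], []
    for due, name in pending:
        if now < due:
            remaining.append((due, name))   # not due yet, keep waiting
        elif now - due <= window:
            fired.append(name)              # on time, run the handler
        else:
            dropped.append(name)            # too late, gone without notice
    return remaining, fired, dropped
```

Note that the real system reports nothing for the `dropped` case; the list is returned here only so the model can show what was lost.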

Hopefully ST is working on a much more robust scheduling function to exist outside of a SmartApp that can help track down the source of these issues.

Moreover, someone needs to develop a smarter scheduler that can queue up events within reason, or use LIFO or FIFO queuing models to determine state and what needs to fire when the next process cycle occurs to parse scheduled events.
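As a sketch of what "queue up events within reason" could mean, here is a small FIFO catch-up scheduler in Python. Everything in it is an assumption for illustration: the class name, the `max_late_s` cutoff standing in for "within reason", and the choice to report expired jobs back to the caller instead of dropping them silently.

```python
from collections import deque


class CatchUpScheduler:
    """FIFO queue of jobs that still fire when processed late, up to a limit."""

    def __init__(self, max_late_s):
        self.max_late_s = max_late_s
        self.queue = deque()  # (due_s, name), assumed to be added in due order

    def add(self, due_s, name):
        self.queue.append((due_s, name))

    def process(self, now_s):
        """Run on each scheduler cycle; returns (fired, expired) job names."""
        fired, expired = [], []
        while self.queue and self.queue[0][0] <= now_s:
            due_s, name = self.queue.popleft()
            if now_s - due_s <= self.max_late_s:
                fired.append(name)    # better late than never
            else:
                expired.append(name)  # too stale to run, but reported
        return fired, expired
```

Returning `expired` gives an app the "failed" notification that is missing today, so it can decide whether to reschedule or alert the user.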

Ultimately, the sandboxing of SmartApps having to all run their own schedules is at fault. Instead, each hub / location instance should have its own scheduling ability and then create schedules within this location “smartapp scheduler” to fire off SmartApp functions.

Anyway, it’s a fundamental problem. It is plaguing ST, making it an unreliable system, and I think the reality is that it has to be fixed beyond just Hub v2; it can be addressed at the cloud level and as a programming issue.

Hope that helps…

If all future runs are being killed please open a ticket with the instance so we can dig into the issue.