Scheduler and Polling quits after some minutes, hours, or days

Tried this already. Aaron S. gave me some suggestions but ended up letting me know that the engineers are aware of the issue and working on it.

2 Likes

Yeah, this is the problem. A lot of SmartApps use the schedule or runIn functions to run one time and, in that function, schedule the next instance.

This is why most are failing dead and not resuming.
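For anyone following along, here is a rough sketch of the two styles being contrasted, written as a SmartApp fragment. The handler names (`chainedPoll`, `recurringPoll`, `pollDevices`) and the 5-minute interval are made up for illustration; a real app would normally pick one style, not both.

```groovy
// SmartApp fragment (runs only inside the SmartThings platform).
// Handler names and intervals are hypothetical.

def initialize() {
    // Style A: one-shot chain. Each run is responsible for scheduling the next.
    runIn(300, chainedPoll)

    // Style B: recurring schedule registered once with the platform.
    runEvery5Minutes(recurringPoll)
}

def chainedPoll() {
    pollDevices()
    // If this execution is dropped, times out, or throws before reaching
    // this line, nothing schedules the next run and the chain stops.
    runIn(300, chainedPoll)
}

def recurringPoll() {
    // No self-rescheduling here; the platform is expected to call this again.
    pollDevices()
}

def pollDevices() {
    log.debug "polling devices"
}
```

With Style A a single missed execution ends the chain for good, which matches the "failing dead and not resuming" behavior described above.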

ST has a run-every-X and CRON-based scheduling that should be running all the time. These should survive a platform reset, but there seem to be isolated cases where they just stop until re-initialized.

I am almost tempted to set up OAUTH endpoints for all my schedule-based SmartApps, ping them once a day, and call a function to unschedule and re-schedule my actions, which would “reset” CRON-based scheduling.
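A minimal sketch of what such an endpoint could look like, assuming OAuth is enabled for the SmartApp. The `/reset` path, the `resetSchedule` and `scheduledAction` names, and the 15-minute cron expression are all placeholders, not anything from an existing app.

```groovy
// SmartApp fragment (runs only inside the SmartThings platform).
// Path, handler names, and cron expression are hypothetical.

mappings {
    path("/reset") {
        action: [GET: "resetSchedule"]
    }
}

def resetSchedule() {
    // Drop whatever schedules this app currently holds, dead or alive...
    unschedule()
    // ...and register the recurring job again (every 15 minutes here).
    schedule("0 0/15 * * * ?", scheduledAction)
    // Returned map is sent back to the caller as the response body.
    return [status: "rescheduled"]
}

def scheduledAction() {
    log.debug "scheduled action ran"
}
```

An external cron job or monitoring script would then issue an authenticated GET against the installed app's `/reset` endpoint once a day.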

1 Like

Well, if all you’re wanting is a once per day check, I’d go with copyninja’s method of letting sunrise/sunset kick off that action via subscribe.

subscribe(location, "sunrise", runRefresh)
subscribe(location, "sunset", runRefresh)
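One way the `runRefresh` handler might be fleshed out, as a sketch only: the scheduled job records a timestamp in `state`, and the event handler re-registers the schedule if that timestamp has gone stale. The `scheduledAction` name, the 15-minute cron expression, and the 30-minute threshold are assumptions for illustration.

```groovy
// SmartApp fragment (runs only inside the SmartThings platform).
// scheduledAction, the cron expression, and the threshold are hypothetical.

def initialize() {
    subscribe(location, "sunrise", runRefresh)
    subscribe(location, "sunset", runRefresh)
    schedule("0 0/15 * * * ?", scheduledAction)
}

def scheduledAction() {
    // Record a heartbeat so the watchdog can tell when the schedule died.
    state.lastRun = now()
    log.debug "scheduled action ran"
}

def runRefresh(evt) {
    def maxSilenceMs = 30 * 60 * 1000   // two missed 15-minute runs
    if (!state.lastRun || (now() - state.lastRun) > maxSilenceMs) {
        log.warn "schedule looks dead after ${evt.name}; re-registering"
        unschedule()
        schedule("0 0/15 * * * ?", scheduledAction)
    }
}
```

This only checks twice a day, so a schedule that dies just after sunrise stays dead until sunset.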

Otherwise, I’m with you in having an external OAUTH endpoint schedule trigger a check to reset the CRON. I got that working manually as a proof of concept. It’s kinda messy, though, so I’ve been hesitant to automate putting it in.

1 Like

Just for the record, runEveryXMinutes fails for me.

1 Like

If the “system wide, but localized” sunrise/sunset events are occurring reliably, then why isn’t all scheduling reliable?! :confused:

Honestly, I haven’t tried this, so I guess I can’t technically confirm that it is reliable. However, most of my stock ST actions based on sunrise/sunset do work more often than not, so I assumed it was being handled slightly differently. You could also subscribe to mode changes, though.

subscribe(location, "mode", runRefresh)

Sure … but … most folks use a schedule (such as sunset) to change modes, so we’re in an endless loop here.

Fair enough. I leave for work at least once a day, though, which triggers my mode to change without a schedule involved.

1 Like

I’ve done this multiple times. The outcomes have been:

  • Ignored for months on end
  • “It’s being looked at by someone else”
  • It was “an isolated issue”
  • There were “some platform issues with logging” that day.

I’m sure you can understand why I chuckled when I read your post.

I can send you the log from my monitoring system of when it’s had to restart each of my scheduled apps. Between the 5 different SmartApps I’m monitoring/restarting automatically, there have been 888 restarts since 5/29. Heck, I can have it e-mail support each time one of them crashes if it’d help.

2 Likes

I started with this, but they’re crashing several times per day on average for me. I needed something a bit more proactive since a lot of what I depend on the schedules for is logging.

With due respect, and as a total ST supporter from the bottom of my heart: how many tickets can members open again and again and again? Please do not take this as sarcasm. We do care. You cannot simply focus on v2 unless you get the current platform stable.

4 Likes

@brbeaird - sometimes you don’t even need to @mention me… the mere typing of ‘aaron’ will work! (Actually, I have been following this thread because this is really important to us.)

@btk I had a fire drill with one of our problem tracking tags last week because it wasn’t clear if all of the scheduled event failure tickets were getting recorded. We are good now. I also saw you had an open ticket that is in the queue for engineer review. I sent a request for someone to review it and will try to nudge it again tomorrow. Can you shoot a follow-up note to support@smartthings.com with an updated example of a failed SmartApp, including the following:

  1. Name of SmartApp
  2. Approximate time/date of failure
  3. Expected automation to occur

The more reports of failures that we have, the better Support can help @beckje01 and his team of engineers (who are almost as good looking as Support) isolate the root cause of the issues.

2 Likes

Ron - Hey… I get it. We keep asking for you to submit tickets, and at the surface level, it seems nothing is done. I understand the frustrations. It really helps with the priorities if those tickets go in. :frowning: We know what to focus on first if there are issues to choose from.

Not everyone is working on v2. Some are. I can attest that people are working in overdrive to fix, build, and repair the v1 experience.

2 Likes

Ha! Awesome. Thanks for checking in!

*I’ve probably asked for this a few times … well, I know I have. :wink: *

If there is some way that you can publish an “Open Issues List”, then we can refer to the Issue Number in our Support Request (if we suspect it is related to the particular “open issue”).

If we don’t know or cannot match an existing “open issue”, then Tech Support could refer us to the “open issue” list that we can continue to track offline.

Would you consider this?

I’ll presume that any “Likes” on this post represent agreement and support of the idea.

10 Likes

Terry. It’s on my list of things to implement. We know this. We’re trying to integrate an upvote system without requiring members to use another service just for that.

I agree: an open issues list is HUGE. It’s one of my priorities to get it done.

2 Likes

Super appreciate that you concur with the high value of this and have made it a priority!


Considering that it is a “high priority item”…

I’d suggest that the slight inconvenience of “yet another service / tool” is worth overlooking.
Frankly, even a shared read-only Google Spreadsheet would be sufficient to see open issues with some sort of reference number. Of course, there are plenty of bug-tracking platforms out there, but … well … don’t let the lack of a hammer stop you from using a brick to bang in this particular nail, if you understand that crazy metaphor. :smirk:

I reported that the Big Switch had failed, got told my hub has probably just lost connection with the Hue Bridge and to reinstall.

Which will fix the problem temporarily, but doesn’t solve the schedule dying, which I suspect is what happened.

2 weeks earlier I reported that my good morning Hello Home Action had failed to fire. I was told a server had “hiccuped” and to delete and re-add it. Which, again, solves the problem temporarily but doesn’t address the schedule failure.

3 weeks before that I reported that Sleepytime failed. I was told to uninstall and re-add it.

See the pattern? Because the delete/reinstall corrects that particular failure for that particular device, my guess is that it probably never gets added to the scheduler failed stats, even though none of this was custom code.

But the QOS issue remains the same.

2 Likes

Ahh. Thanks for your wonderful suggestion!

With more manual work, just remember, that’s less time actually developing and implementing. My number one priority and primary focus is to get the platform stable. My second one is developer experience: people worked hard making SmartApp solutions that were sitting in a queue. That’s gone.
Oh, the joys of growing pains, hiring and recruiting. On the upside, I’m seeing about 5-10 hires every week or two.

One more thing - we’re all one big family and company, though community and support are two separate teams. Although I hear the needs, I’ll also need assistance from the support team. Tyler H and Ryan totally hear me on this. We know we need this. We’re all on board. :slight_smile:

2 Likes

You’ll get to me eventually… but I’ll be snapped up by then :broken_heart: