@April, can you kindly push the engineering team to look at what is happening with ST, or with the apps ST refreshes, like Weather Underground? I did open a ticket for this. The Weather app does not refresh unless I, like a lot of people on ST, force a refresh via the Refresh button in the ST app “Things” tab. This is a pain. I even asked @Tyler for any help he can provide. And before you say it, I really do understand the burden of being in the IT support field. But ST is falling short.
P.S. Are you going to be starting a pre-order for Hub 2.0 soon?
I once had an engineering professor who told what is probably an apocryphal story of a mechanical engineering professor back in the 1950s who kept an aquarium with goldfish in the lab. To pass the course, you had to build one device that added value to the aquarium (light, heat, pump, feeder, whatever) and leave it running for one week unattended. There was only one requirement to pass: “Don’t kill the fish.”
Like I said, probably not a true story, but it made the point. We used to run DKTF tests on new systems/devices in beta…could it run hands off for a week with no catastrophic failures.
I suggest giving a plastic toy fish to everyone who decides how the engineering budget gets spent, and then to all the engineers, as a reminder that “reliable” doesn’t mean the same thing as “will work OK again if you reinstall.”
The first priority of any home automation system should be the simplest:
Don’t Kill the Fish.
@JDRoberts. Love the story. I heard something similar. I am still here, so that means that my house didn’t kill me. That said, it’s the minor but still useful parts of the ST platform failing that drive me up the wall.
Thanks for this. I’ve now got a scheduler monitoring app named “Don’t Kill The Fish” running to give support some feedback. This way I can keep restarting my important apps but still have something that can be an acceptable SSDS casualty. =)
It displays your image if everything is ok and switches to this if it’s not:
Had another random fail overnight. Visible in the “Weather Station” device - did not update after a while:
Weather station does this:
runPeriodically(3600, poll)
And you can see it just stop last night (I kicked it this afternoon when the wife called and said lights were turning on for no reason - because the lights thought it was dark out!)
I’ve got an open ticket with ST support regarding an issue I’m having with my weather station app not updating without manually refreshing it. What’s interesting is the support person didn’t think this scheduler issue is related. He also suggested I reinstall the weather station app to get it working again. Although I wouldn’t be surprised if reinstalling got it working, I find that solution to be sub-optimal…sigh.
That said, I’ve been talking to Aaron and was able to give him some statistics on when and how often my SmartApps are getting struck by SSDS as well as a couple of cases of apps currently in a dead state so they can do a post-mortem.
Here’s hoping that information is useful and helps them track this down.
Failed for a few days (listed as “inactive” by the hub, Big Switch failed). Refreshed Hue Connect. Ran for about 24 hours, then failed again.
But the Hues work great with IFTTT, a couple of third party apps, Beecon+, and the native Hue app. It’s only ST that can’t maintain the connection.
tgauchat
(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy)
And it’s apparently contagious.
After months of consistent though not perfect performance, my Hues controlled via SmartThings now fail more often than not.
And they work fine via Hue App and Echo.
Look… I’m perfectly willing to hunker down for a month or 2 or 3 and figure this out. But not for free. And not for Hot Pockets either.
I’m definitely not implying that ST’s engineers can’t fix things… They certainly could do it faster than myself… But we have agreed that ST has higher priorities. That’s fine. So… outsource it?
Sure, politeness is a virtue, but truth be told, I don’t believe they can. I created Pollster almost a year ago primarily to poll my custom WiFi Thermostat device handler because native device polling never worked reliably. Pollster had been working like a charm until recently, when it too began failing at least once every week, requiring a restart. I’m looking at my thermostat device now and see that it has not been updated since Tuesday, 7:15 PM - more than two days ago.
This ‘thing’ is now totally useless. Not only are these clowns unable to fix the old bugs, they just keep piling new ones on top. And to top it off, they’re now telling me that Pollster does not meet their app submission requirements because “polling devices is not recommended”. What a joke! What is recommended then? Sitting with your thumb stuck in you know where? Oh, and phleeesee, keep sending those bug reports to support because, you know, we’re so busy hiring, doing hackathons and what not, we cannot prioritize our bugs without your help.
This has nothing to do with April. Platform stability should be priority number one for any cloud-based service. Period. They’re well aware of the problems, there’s no doubt about that. So all this talk about “we need more customer complaints to prioritize this issue” is just nonsense. A while ago the main excuse was “we’re understaffed, but please bear with us because we’re hiring”. Six months later, it’s “we’re so busy hiring, we don’t have time to get things done.” It’s ridiculous, if you ask me. I wonder what it’s going to be six months from now when they run out of excuses.
You’re right. Platform stability is the number one priority. Hiring and being understaffed are the same problem, and you’re right, it’s something we are facing. With growth, though, communication also gets lost. Thus Bool is now evaluating as true/false instead of a string, and I wasn’t aware of the enum change until later. In fact, I learned it at the Dev call.
These are things we now have a process to communicate because of this. You’re right to call out these excuses. They are the state of the business, and these are the challenges we’re facing.
With transparency, we share that being understaffed and hiring is an issue. Without transparency, when we put our heads down to fix the issues, we are called out for not being on the forums and for not communicating.
We’re glad you’re here. We’re glad you’re calling us out to do better. We’re working on it. There’s no excuse for your experience not being ideal, nor do we strive for that experience.
tgauchat
(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy)
Based on this thread, so far, does this seem like the best approach for maximum scheduling reliability?
Just trying to distill the discussion into a “Conclusion”.
Here you go! Just insert a call to updateState() at the top of whatever scheduled method you want to monitor. Sticking a call to logURLs() in your updated() method will spit out some easy-to-copy URLs in your logs.
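For anyone who wants the general idea without the SmartApp itself, here is a rough, platform-neutral sketch in Python (not SmartThings Groovy) of the heartbeat pattern described above: the scheduled method records a timestamp as its first step, and a separate check flags anything that has gone quiet. Every name here (`HEARTBEATS`, `update_state`, `stale_jobs`, the `grace` factor) is made up for illustration and is not the actual app's code or any ST API.

```python
import time

# Hypothetical registry, not a SmartThings API:
# monitored name -> (last run in epoch seconds, expected interval in seconds)
HEARTBEATS = {}


def update_state(name, interval_s, now=None):
    """Record that the scheduled method `name` has just started running."""
    stamp = time.time() if now is None else now
    HEARTBEATS[name] = (stamp, interval_s)


def stale_jobs(now=None, grace=2.0):
    """Return the names whose last run is older than grace * interval."""
    now = time.time() if now is None else now
    return sorted(
        name
        for name, (last, interval) in HEARTBEATS.items()
        if now - last > grace * interval
    )


def poll(now=None):
    # First line of the scheduled method: leave a heartbeat.
    update_state("weather_poll", 3600, now)
    # The real polling work would follow here.
```

The `grace` factor of 2 is an arbitrary assumption: one missed hourly run is tolerated, and a second one marks the job as dead.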
Sigh. All THOSE are going to put more load on the system, which (apparently) is already struggling under load.
It would be so much better if it just worked - most importantly if it never just silently dropped scheduled callbacks - better late than never would be a great start.
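To make “better late than never” concrete, here is a small Python sketch of a periodic job that runs late instead of being silently dropped. It is only an illustration of the idea under my own assumptions, not how the ST scheduler works; `CatchUpJob` and `tick` are hypothetical names.

```python
class CatchUpJob:
    """A periodic job that runs late rather than never (hypothetical sketch)."""

    def __init__(self, interval_s, callback, start=0.0):
        self.interval_s = interval_s
        self.callback = callback
        self.next_due = start + interval_s

    def tick(self, now):
        """Call from any trigger (timer, device event, app open).

        Returns True if the callback ran on this tick.
        """
        if now < self.next_due:
            return False
        self.callback(now)
        # Reschedule relative to now, so a long outage produces one late
        # run instead of a burst of back-to-back catch-up runs.
        self.next_due = now + self.interval_s
        return True
```

The point of the design is that any later trigger can notice an overdue job and run it once, which is roughly what a manual refresh of the weather tile ends up doing by hand.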
(My lighting on/off based on motion + lux works about 60% of the time. When it fails, a refresh of the weather tile (which should have refreshed based on both sunset AND periodic internal schedules) magically fixes it, because the weather tile is providing my lux and the scheduler has died. It’s really like every 2nd or 3rd day now. Not great.)