SmartThings Community

UPDATE: Recent SmartThings User Experience & Platform Performance

SmartThings Community,

This past weekend, performance of the SmartThings platform began to degrade due, primarily, to high load on the database and messaging infrastructure. As a result, some of our customers have experienced degraded access to and control of Smart Home Monitor, Routines, and other SmartApps within the SmartThings mobile app.

We are actively investigating the root cause of this critical issue in an effort to resolve current issues and prevent future incidents from occurring. This effort includes dedicated teams focused on near-term stabilization of the platform as well as teams focused on expanding the platform architecture. Despite this effort, we recognize the degradation impacts many of you individually, and we are sorry for the inconvenience this has caused.

Moving forward, we will continue to expand our database cluster and prioritize rewriting and optimizing our data access library to allow us to scale at a higher level.

Please continue to work with our support team at support@smartthings.com to address additional questions or concerns, or visit our status page for the latest updates. For information in real-time, follow @smartthingsdev on Twitter.


tl;dr

Why did this happen?
Due to stress on our database clusters, we are experiencing latency issues that are impacting Smart Home Monitor, Routines, and other SmartApps.

What is SmartThings doing to resolve this issue?
In the short term, we are expanding our database clusters and making sure we closely monitor the health of each cluster to mitigate data access issues. In the long term, we will be rewriting and optimizing our data access libraries to be able to scale at a higher level.

What can I do in the meantime?
For a temporary workaround, if you are having Smart Home Monitor problems, you can move your Smart Home Monitor “Security” rules to “Custom” rules where you can still receive notifications but avoid potentially inaccurate alerts due to disarming / arming failures.

  1. Open Smart Home Monitor.
  2. Touch the “Gear”.
  3. Touch “Security”.
  4. Remove your rules in the Security section and hit done.
  5. Touch “Custom”.
  6. Touch “New monitoring rule”.
  7. Set up your new monitoring rule.
  8. Continue to add rules for all your monitoring needs.

Where can I find more information?
Please continue to monitor the status page for updates and follow the @smartthings and @smartthingsdev Twitter accounts.

23 Likes

Thanks for sharing the information and partial explanation, Tim…

But the root cause is not explained…

What has happened “since this weekend” that suddenly caused “stress on your database clusters”? Why the sudden and apparently utterly unpredicted spike in load?? :confused: :confounded:

15 Likes

Not only that, but what in the messaging/DB architecture caused application data/state in SmartApps (like Rule Machine) to be corrupted/unavailable?

I would expect high loads to degrade performance, not cause symptoms that mimic loss of data or functionality.

11 Likes

When do you expect the expansions to be complete, and stability to be restored? Why were the database clusters stressed? Why was load suddenly higher this past weekend? Has load been higher since the incidents in late February? Thanks.

1 Like

Agree with @tgauchat. In what scenario would routine health monitoring not have triggered some sort of alert to operations folks that some resource(s) were being pegged where they could be proactively scaled before things start to break?

7 Likes

Honestly, why can’t you build out scalable infrastructure? It wouldn’t be that hard to do with the VM technologies out there or even AWS. You could have the infrastructure automatically scale out as needed with some orchestration, even add more SQL boxes into the mix, if DB clusters are the underlying issue.

4 Likes

It would be nice if you guys focused on local processing and not just selective local processing. Then we wouldn’t rely so heavily on your servers.

28 Likes

This issue has caused me some trouble for a few weeks now. It is not just “this weekend”, at least not for me. I know it is something they are working on and I will be glad to see it fixed. Just want to say a big THANK YOU to SmartThings for staying on top of issues like this, working to resolve them and most of all keeping us informed as to what is happening. I think we all understand that new tech sometimes has bugs that need to be worked out; as long as we are in the loop, it sure does make it easier to live with!

15 Likes

Absolutely this. I’m coming from a Vera Edge (okay, it’s been a few months now…) but I never had this level of instability/unreliability with that platform. Not saying I want to go back, but I do miss being able to rely on my rules to execute when they’re supposed to.

6 Likes

Why is my hub rebooting every few minutes? @slagle?

1 Like

I am actually thinking about getting the Vera Plus. Thoughts on Vera? Why wouldn’t you go back?

1 Like

Yeah, you’d think it would be easy enough with AWS, but they can’t even set up a simple load balancer between US and European servers and instead have bizarrely created multiple graph URLs for users, which breaks all the apps for EU users (and then months later they still haven’t bothered to update these apps for EU users to point to the new URL, so loads of stuff is broken).

It’s a farce really; this is basic server management. I’m one person with no experience in it, but I had to set up a website for scalability, and I learnt in a week or two what was possible in terms of load balancing multiple geo-located servers behind a single URL and, as you mention, scalable VM servers that come and go as needed. It’s not rocket science, let alone for a company backed by Samsung!

12 Likes

Wow, you guys are harsh. Thanks for the update @slagle.

7 Likes

Since you left this open for comment, what was the nature of this unexpected “high load”? Unanticipated user load, a malicious act (e.g., DDoS), or … ??

1 Like

DDoS from my Aeon Multi 6… I swear that thing puts out a ton of stuff in my logs :wink:

9 Likes

Same here, 100%. I miss the reliability, but I don’t necessarily want to go back.

I have a Veralite and a Plus. The Plus doesn’t have many Zigbee devices “officially” working, and it’s more difficult to add them since it’s new functionality for their platform. Their mobile app is trash compared to ST, but they do have a web UI. There are pluses and minuses to both. That said, you can read more about other platforms in this thread:

2 Likes

I’m not sure, but this would be the first time I heard this was happening as a result of these problems. I’d reach out to support and see if they can look into it for you.

I totally agree. While it’s not the best answer for sure, my first post mentions that we are still investigating the root cause.


I have invited our Director of Engineering to the thread though to give a more detailed response.

11 Likes

Harsh? I’m not criticising an 8-year-old’s painting here - this is a commercial, relatively high-priced product backed by Samsung but at times seemingly run by amateurs (or those out of their depth in many situations).

SmartThings is a great idea currently executed very, very poorly and unreliably. Especially if you happen to live in the EU and have been waiting for simple updates to apps like BeaconThings to come through for FOUR MONTHS now. I couldn’t and wouldn’t recommend SmartThings to anyone at the moment. It’s still in alpha, never mind beta, stage, which is ridiculous considering it’s been bought by Samsung and is 3 years into mature hardware and a version 2 hub!

18 Likes

Issues with Routines are also well documented on the community. The status page finally mentioned that Routines were impacted a couple of hours ago, when the community has had documentation of this for nearly as long as the SHM issues.

What about those of us who are experiencing other issues such as Routines? Is there anything we can do in the meantime?

I agree some in the community seem harsh, but I hope the staff has a better explanation for their leadership/executives than they are giving to the community.

1 Like