What Cloud does SmartThings use?

kcm117 · June 24, 2016, 2:45am

Does SmartThings use Google Cloud Platform? Amazon Web Services? Microsoft Azure? Multiple Clouds? Or are they using Samsung data centers?

With all of the constant outages, I feel as though a more reliable and highly available infrastructure is needed.

David_Montoya · June 24, 2016, 2:48am

If you had the IP address that the hubs talk back to, you could geolocate it pretty easily to get your answer, since that would give you the location of the physical data center.
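
A rough sketch of a related check, assuming you already know an IP address or hostname the hub connects to: instead of geolocating, look at the reverse DNS name, which often carries the provider's domain. The suffix table below is an illustrative assumption, not an official or complete list, and `guess_provider` and `provider_for_ip` are hypothetical helper names.

```python
import socket

# Illustrative, incomplete suffix table (an assumption, not an official list).
PROVIDER_SUFFIXES = {
    "amazonaws.com": "AWS",
    "googleusercontent.com": "Google Cloud",
    "cloudapp.azure.com": "Azure",
}


def guess_provider(hostname):
    """Classify a hostname by its DNS suffix; returns "unknown" if no match."""
    host = hostname.rstrip(".").lower()
    for suffix, provider in PROVIDER_SUFFIXES.items():
        # Match on a label boundary so "notamazonaws.com" does not count.
        if host == suffix or host.endswith("." + suffix):
            return provider
    return "unknown"


def provider_for_ip(ip):
    """Reverse-resolve an IP (needs network access), then classify the name."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # No PTR record, or the lookup failed.
        return "unknown"
    return guess_provider(hostname)
```

Many addresses have no reverse DNS entry at all, so "unknown" says nothing either way; a published provider IP range list or a geolocation lookup would be the next thing to try.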

ady624 · June 24, 2016, 2:55am

AWS afaik.

JDRoberts · June 24, 2016, 3:23am

They use AWS, but most of the outages have been due to database issues, not transactional cloud access.

A Message from Alex on Platform Improvements and Our Plan Forward Announcements

First, I want to let you know that everyone at SmartThings is fully aware of the issues that have been affecting platform reliability. In the past few weeks, we’ve redoubled our efforts to make some fundamental improvements that will soon be felt by all of you in your everyday experience with SmartThings. Know that these improvements are only the beginning, that we are in this for the long term, and that we are absolutely committed to building the best, most open platform in the world. It’s just scratching the surface, but I’d like to take you through the technical details of some of the changes we’ve implemented recently: Smart Home Monitor We’ve made a few changes to Smart Home Monitor that should help to resolve issues with data consistency, load time, and Arming/Disarming. The fir…

Cloud Talk Meta

Post Mortem From 6/15/2016 Outage Link Notes From Cloud Engineer @vlad At 3:30 our monitoring tools alerted us that our API cluster went down. This is the part of the platform that serves graph.api.smartthings.com & mobile devices. This is when consumers would have begun to notice the crash. At this time our device cluster also began to struggle and monitoring tools alerted a spike in database connections. Engineering identified our caching layer as the source of the increased load on our databases. Operations on the caching layer began to fail, which pushed an overwhelming amount of traffic to our databases. This resulted in many operations timing out and decreased throughput. Upon further investigation, engineering identified a pattern: After a Cache server threw a certain excep…

leedjones · June 24, 2016, 5:09am

If they’re trying to make a fully resilient platform, wouldn’t they be better off using both AWS and Azure, for instance, so that if one goes down the other can take over?
I know this has been done by some enterprises and has been proven to work.

JDRoberts · June 24, 2016, 5:12am

They may be doing some of both, but again, it’s not the cloud going down that’s been the problem. It’s the database design.

Google “Cassandra hotspots.”
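
For anyone who would rather see it than search for it, here is a toy illustration of a hotspot: when most writes share one partition key, they all land on the same node. This is a simplified sketch, not SmartThings' schema and not Cassandra's real token ring (which uses Murmur3 by default rather than an MD5 modulo); the device name, node count, and time-bucket size are made-up assumptions.

```python
import hashlib
from collections import Counter


def node_for(partition_key, num_nodes):
    """Map a partition key to a node by hashing (simplified stand-in for a token ring)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def load_per_node(partition_keys, num_nodes):
    """Count how many writes each node receives for the given keys."""
    counts = Counter(node_for(key, num_nodes) for key in partition_keys)
    return [counts.get(node, 0) for node in range(num_nodes)]


# Hypothetical workload: 1,000 events from one very busy device.
events = [("device-42", minute) for minute in range(1000)]

# Keyed by device only: every write hashes to the same node (a hotspot).
hot = load_per_node([device for device, _ in events], num_nodes=4)

# Keyed by device plus a 10-minute bucket: writes spread over many partitions.
spread = load_per_node(
    [f"{device}:{minute // 10}" for device, minute in events], num_nodes=4
)

print("device-only key:", hot)
print("device+bucket key:", spread)
```

The first list has all 1,000 writes on a single node; the second distributes them, at the cost of reading several partitions to reassemble one device's history.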

fightingmajor · June 24, 2016, 3:05pm

I’m very disappointed in all of you for leading this person on, using your tech jargon to make things sound fancy so that it’s actually believable. @kcm117 please do not fall for this trick they are trying to pull on you. Everyone knows that the cloud SmartThings uses is Cirrocumulus. This is the high-level cloud where all data is originally stored. If Cirrocumulus starts to bog down with information, it is quickly spread out over Cirrus and Cirrostratus. When information needs to be sent back to your hub, it travels through one of Altostratus, Altocumulus or Nimbostratus. Altocumulus is the most common mid cloud; more than one layer of Altocumulus often appears at different levels at the same time. Many times Altocumulus will appear with other cloud types. After the data is processed through one of these clouds, it then travels through one of the clouds closer to your hub, which are called Cumulus, Stratus, Cumulonimbus or Stratocumulus. If you are ever having issues with your data, it’s probably because it went through Cumulonimbus. That cloud is the one that causes the most disruption amongst people. It can get very scary when stuck in that cloud. Everything starts spinning, giving you the feeling your data is caught up in a tornado.

brianjlambert · June 24, 2016, 3:16pm

With all the rain we’ve been having, definitely Cumulonimbus.

desertblade · June 24, 2016, 3:41pm

Using multiple cloud providers is difficult: you have to sync everything across them, and then you have to deal with routing traffic.

I know of companies running different systems on Azure and AWS. I don’t think many are running the same system on both, at least not at scale.

vlad · June 24, 2016, 6:31pm

Cassandra was a big driver of issues earlier this year, though this has been largely mitigated. Hopefully the community has noticed that the platform has been more reliable since March, which is the result of an organization-wide focus on solving those problems. You are correct that AWS stability has rarely been a cause of downtime (no recent incidents come to mind).

In recent memory, the high-level causes have been:

Caching failures, which cause stress on our relational database. (This was the really bad downtime issue that you quoted me on.)
Network connectivity failures with the services that connect to Hubs (Hubs going offline). A lot of effort is being put into finding root causes. (It’s probably not fair to mark this as a single issue, but I’m not sure how many details I’m allowed to share here.)
Deployments. There have been some pretty drastic architectural changes to facilitate performance improvements and bug fixes. Occasionally some unforeseen bugs creep into Production, but these have largely been very localized (think IDE-only issues or issues affecting a small subset of users).
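
The first cause in the list above is easy to see in a toy model. This is a minimal sketch with an in-memory stand-in for the cache and a query counter for the database, not the actual platform code; `FlakyCache`, `CountingDatabase`, `read`, and `queries_for` are hypothetical names, and the workload numbers are made up.

```python
class FlakyCache:
    """In-memory cache that raises on every operation when marked unhealthy."""

    def __init__(self, healthy=True):
        self.data = {}
        self.healthy = healthy

    def get(self, key):
        if not self.healthy:
            raise ConnectionError("cache unavailable")
        return self.data.get(key)

    def set(self, key, value):
        if not self.healthy:
            raise ConnectionError("cache unavailable")
        self.data[key] = value


class CountingDatabase:
    """Stand-in database that only counts how many queries it receives."""

    def __init__(self):
        self.queries = 0

    def load(self, key):
        self.queries += 1
        return f"value-for-{key}"


def read(key, cache, db):
    """Cache-aside read: try the cache first, fall back to the database."""
    try:
        cached = cache.get(key)
    except ConnectionError:
        # Cache is down: every read turns into a database query.
        return db.load(key)
    if cached is not None:
        return cached
    value = db.load(key)
    cache.set(key, value)
    return value


def queries_for(keys, healthy):
    """Run the reads and report how many reached the database."""
    cache, db = FlakyCache(healthy=healthy), CountingDatabase()
    for key in keys:
        read(key, cache, db)
    return db.queries


# Hypothetical workload: 1,000 reads spread over 10 popular keys.
reads = [f"device-{i % 10}" for i in range(1000)]
print(queries_for(reads, healthy=True))   # 10 (one miss per distinct key)
print(queries_for(reads, healthy=False))  # 1000 (every read hits the database)
```

With the cache healthy, the database sees one query per distinct key; with it failing, the database absorbs the full read load, a hundredfold increase in this example.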

There are of course outstanding issues that impact users on a day-to-day basis. Engineering communicates with support frequently to get reports on which issues their team is fielding the most, which helps drive our prioritization. This is probably why Tim and Jody are constantly reminding everyone to contact support when they run into problems.

slagle · June 24, 2016, 7:30pm

fightingmajor:

I’m very disappointed in all of you for leading this person on, using your tech jargon to make things sound fancy so that it’s actually believable. @kcm117 please do not fall for this trick they are trying to pull on you. Everyone knows that the cloud SmartThings uses is Cirrocumulus. This is the high-level cloud where all data is originally stored. If Cirrocumulus starts to bog down with information, it is quickly spread out over Cirrus and Cirrostratus. When information needs to be sent back to your hub, it travels through one of Altostratus, Altocumulus or Nimbostratus. Altocumulus is the most common mid cloud; more than one layer of Altocumulus often appears at different levels at the same time. Many times Altocumulus will appear with other cloud types. After the data is processed through one of these clouds, it then travels through one of the clouds closer to your hub, which are called Cumulus, Stratus, Cumulonimbus or Stratocumulus. If you are ever having issues with your data, it’s probably because it went through Cumulonimbus. That cloud is the one that causes the most disruption amongst people. It can get very scary when stuck in that cloud. Everything starts spinning, giving you the feeling your data is caught up in a tornado.

Ha! This is my favorite post this month. How did you know all our internal naming conventions?!?!

David_Montoya · June 25, 2016, 1:29am

This. This. So much this.
