ST servers..... Are they down?

Somebody posted on here claiming it was from him

Lol. Thanks! That sounds like fun. :yum:

2 Likes

One of the lucky ones here as well…

I’m up now, but very flaky with huge lag.

Yeah, still seems wacky but at least I could get my smart bulbs turned off. They “edited” and removed my post about ST going down because I evidently wasn’t nice enough. Didn’t know they were policing these forums like that now. Also, what’s up with local processing being completely down too? So much for that pointless ecosystem.

Sorry guys, I had accidentally set my Blinks to poll every 5ms instead of 5m. One letter… :wink:

PS, 24 posts in 38 min, impressed by user response AND ST response. Everyone paying attention this evening I see.

3 Likes

Not sure if it was ever down for me; however, the IDE and mobile app seem to be quicker. Probably because nobody can access it right now! :blush:

1 Like

http://status.smartthings.com

1 Like

Ok back online. That was scary, I almost forgot how to turn lights on manually.

3 Likes

Mine is back up, but now I have a smart motion sensor that will not stop seeing motion. Uninstall, install, uninstall, install.

Anyone else having issues after outage?

I had the exact same issue with my ST motion sensor a few months ago. Had to take it offline. It was making a switch I had configured it to turn on simply refuse to turn off because the motion was continuous even though nobody was in the room.

My 4 Fibaro motion sensors have never had this issue. I think it’s a hardware issue with the ST sensor.

Bookmarking this post.

2/2/2017 - NEVER FORGET.

2 Likes

Just wanted to chime in that this was caused by a pretty significant hardware failure, and the speed at which our ops team was able to catch the failures and replace everything was very impressive.

10 Likes

Thanks @vlad. Much love to operations teams… Since that’s my career, I feel their pain.

1 Like

Just out of curiosity, since you use AWS, was it a hardware failure on their end? If so, I assume you guys are just using some type of VM so it should have HAed over to another host? Or did AWS themselves have a larger outage that affected multiple customers?

There were failures for System Status checks, which are on the AWS side, and the VMs had to be replaced. I’m on the software side, so I don’t go too far in depth for retrospectives (talking to vendors, etc. when something like that happens), so I’m not 100% sure what the underlying cause of the failure was or if anyone else was affected. But when the system checks fail, that means it’s on the vendor’s end, and according to documentation it’s usually network/power/hardware issues (though this is a fairly rare occurrence). Yes, you are correct in regards to HA. Unfortunately there wasn’t enough redundancy in place to weather the outage, and when certain systems begin to struggle it can have a domino effect on others as well. There is definitely room for improvement, and hopefully some work streams to increase redundancy in the systems that failed come out of the retrospective.

Thanks for the info. Making an observation with little knowledge of the ST setup, it almost sounds like AWS had some type of underlying storage problem which caused the need to rebuild servers. Usually they would just fail over to another host with little/no impact. Kudos to the team for acting so quickly - I know firsthand how daunting rebuilding systems during an outage can be. Hopefully they are all enjoying a celebratory adult beverage right now.

1 Like

Today I have Alexa saying she can’t reach ST, but the commands seem to work.

Also, I have had to repeat the command two or three times to make, for example, the bedroom light turn on…

ho hum…

Many people are having problems with the Echo/SmartThings integration right now:

Thanks…

1 Like