Perhaps it was related to a Routine. Please send an email to support, and to me if you’d like, and we will try to figure out exactly what happened for you. It is odd enough that there must be some good explanation.
SmartThings uses a data store that replicates data across nodes for redundancy and performance. We temporarily reduced the consistency level of these queries to enable us to stay up while encountering unexpected load. We’ve since returned these settings to values that ensure consistency.
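As a rough illustration of the trade-off described above, here is a toy sketch of quorum reads and writes. It assumes a generic quorum-replicated store with three replicas; the post does not name the actual data store or its settings, so every name below (`Replica`, `write`, `read`, `overlaps`) is hypothetical and not SmartThings code.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical replica holding one value and its version number.
    version: int = 0
    value: str = "old"


def write(replicas, value, version, acks):
    """Apply the write to the first `acks` replicas only; the rest lag behind."""
    for replica in replicas[:acks]:
        replica.value, replica.version = value, version


def read(replicas, contacted):
    """Read from the last `contacted` replicas and return the newest value seen."""
    newest = max(replicas[-contacted:], key=lambda r: r.version)
    return newest.value


def overlaps(n, w, r):
    """True when every read set must share at least one replica with every write set."""
    return r + w > n


# Three replicas; the write is acknowledged by two of them.
replicas = [Replica() for _ in range(3)]
write(replicas, "new", version=1, acks=2)

# Reduced consistency: contacting one replica can hit the one that lags.
stale_read = read(replicas, contacted=1)

# Quorum read: contacting two replicas must include one that saw the write.
quorum_read = read(replicas, contacted=2)
```

In this sketch a read that contacts a single replica is cheaper and keeps working under load, but it can return the stale value, while a read that contacts two of three replicas always overlaps the write. That is the general shape of lowering and later restoring a consistency level.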
Status Page Resolutions
@alex - I appreciate the comments. As an IT Director, I do question how we got here. Although I see you addressed some of it in an earlier reply, I still have a concern. DB load should never grow to the point that you are so far behind it. With regular monitoring, which I’m sure you do, expansion of the clusters should be routine - perhaps monthly? I do realize it is not as simple as “throwing more iron at it,” but up to a point it does buy time when needed.
I guess my overall concern is “how do we know it won’t happen again?” You will get more opportunities, as I would expect and hope, to expand and grow the business…with all the competing priorities that those opportunities entail. I just can’t help but feel that an “independent” infrastructure group is needed…one that can grow as needed to maintain stability regardless of what is happening in the other parts of the company.
I also realize that I pay no monthly fee for this service - it was a one-time (well twice since I bought but never implemented v2 yet due to the lack of the promised conversion tool, but that is a different topic for a later day) payment. I do wonder if a small, monthly or yearly fee would also help ground the company…and I’m not talking SHM. When you have customers that are paying you on a regular basis, it tends to change your perspective on priorities since it is a different relationship than we currently have.
As much as we all want new features, I can guarantee that if you polled every customer, the majority would say stability, consistency, and speed are the number one priority.
Over here in the UK we still have fundamental issues with SmartApps / OAuth and GitHub integration. If your US infrastructure could be duplicated and load balanced so transparently, wasn’t it possible to implement the UK this way and avoid the ongoing problems we are still having here 6 months on?
Man, the email below was a terribly tone-deaf message for SmartThings to send.
Let me say that I wasn’t upset about the instability of SmartThings for the past few months. It has affected me multiple times, but I’ve been so impressed with how you’ve communicated using status.io that I’ve felt informed and OK about all of the problems. That kind of ended with today’s email.
I’m left scratching my head and wondering why you would send such an uninformative and seemingly false apology. You didn’t need to apologize. Yes, you’ve had stability issues, but I, and most reasonable people, believe that you’re working like your hair is on fire to address these issues. You’ve kept customers apprised of EVERYTHING in an exemplary way for an extended period of time. Effective crisis communications are crazy hard to do without dropping the ball. The messages that you have time to “craft” like the one today should be comparatively easy.
Because I’ve let SmartThings crash in my house (literally recently) for longer than any family members, I’m taking this time to write back to you on the topic of sucky communications.
Here’s an outline of why today’s email was Comcastic.
“First, let us say that we’re sorry if you’ve experienced any issues with your SmartThings service in the last few weeks.” Hopefully your engineers are better at arithmetic than your communications staff. “Few weeks”? Try months. There have been regular, as in multiple times per month, incidents related to performance and stability since November, 2015. You’ve already started this message poorly by misrepresenting reality. Whether the particular issues you’re referring to in this message occurred in the past few weeks or not is irrelevant to the customer who has experienced unstable service since V2 was released.
Your 2nd paragraph is vague and basically derivative of most of the prior incident resolution messages. Based on the ongoing instability that I’ve seen after reading your status messages that indicate you’ve “made significant changes” to the physical or application layers, I am unconvinced that customers will actually see increased stability.
“We’d like to express our sincerest thanks for your understanding as we work to make a better, more reliable platform.” When you apologize it can be perceived as insulting and dismissive to presume that you have the understanding of the person to whom you’re apologizing.
“If you have any further questions or concerns, please don’t hesitate to contact our Support team…” I didn’t ask you a question. You emailed me. In fact, the footer of your email states, “You cannot unsubscribe from these messages.” However, you make it pretty clear that you don’t really want to be contacted given that the reply-to address of your email is "firstname.lastname@example.org."
From the customer perspective, at least this customer, I think that SmartThings is great. You’re growing, expanding the devices you support and generally moving in the right direction - all very rapidly. You’ve taken customer satisfaction and communication very seriously and performed well in both categories until this amateur-hour email today.
Don’t screw around with communications that customers are unable to unsubscribe from. You will do nothing but anger people when they realize you’ve wasted their time.
On Thu, Apr 14, 2016 at 5:17 PM, SmartThings <email@example.com> wrote:
First, let us say that we're sorry if you've experienced any issues with your SmartThings service in the last few weeks. We're aware that some customers have been affected, and we have been actively working on ways to improve the stability of SmartThings, and more specifically, Smart Home Monitor.
One thing we've been able to do immediately is to move everyone over to our new backend scheduling database (the system that makes all of your Routines and scheduled actions possible), and this has already made a huge improvement. We've also enhanced our database in a way that will allow SmartThings to run far more smoothly going forward.
We'd like to express our sincerest thanks for your understanding as we work to make a better, more reliable platform. We will continue to update you regarding ongoing efforts to improve the SmartThings platform. If you have any further questions or concerns, please don't hesitate to contact our Support team at firstname.lastname@example.org.
Thanks,
The SmartThings Team
This is a system message sent to all SmartThings accounts. You cannot unsubscribe from these messages.
Missing Announcement Emails
Alex, will you or won't you provide a migration tool as promised?
@alex Thank you Alex, for the explanation, update and apology. It shouldn’t be unusual that a company admits fault, but it is. I appreciate you owning up to the problems and setting the record straight for Rule Machine. I have only been with ST a little over a month and I have enjoyed it a great deal. I have had few issues and actually started using SHM today. I am excited for the future and encouraged to get deeper into ST as a result of your statements.
Thank you again,
I wholeheartedly disagree with your interpretation. I appreciated the tone and intent of the message. As long as the communication channel stays open I am willing to stick around and deal with mishaps. Just as long as I am aware of what is going on and what is being done to fix and prevent them from occurring again. Time will tell. If we have another unexplained meltdown I will just close shop and go back to getting off my ass if I want something done.
I think this really comes down to all the changes we are making to our decision making processes and company priorities. Alex and the leadership team have done a really good job at setting and communicating the correct priorities for the next year and the top two are explicitly focused on achieving platform stability and reliability.
And I wholeheartedly agree with you.
SmartThings - Damned if you do, damned if you don’t.
This would be true if we (SmartThings) were the only group extending the capabilities of the platform, but as Alex alluded to, our open stance comes at a cost (and also produces great gains for SmartThings and its customers). We sometimes see huge spikes based upon 3rd party code and apps. We are working to better monitor and sandbox these when unexpected results occur, in addition to getting better reporting and troubleshooting tools for 3rd party devs so they can create even better solutions.
@slagle has the “state” issue fix been rolled out yet or still to come? I had to restart from scratch after getting a replacement hub just a few days ago and I don’t want to put Rule Machine back until that fix is in. Thanks
We did push a change today targeted directly at this issue.
Awesome thank you… Just caught up on all unread and noticed. Much appreciated.
Last question… Is a hub migration tool something that will come to fruition? If yes - when is it currently slated to be available?
Backup tool??? With implied restore.
Maybe I’m not understanding this correctly. But how does a tech company, using a proven technology (the cloud), not have stability and reliability as the absolute #1 priority from day one?
How can that ever not be the focus?
How do you ever even perform a simple task, let alone a platform update, without calling stability and reliability into question?
Please tell us there is a whole new division in ST, and that it’s called “Quality Assurance.”
The simple fact that we can have an open dialog with the CEO of this company at a time like this speaks volumes of how much they care about their product/service. It will get better. It will get much better and probably sooner than we all think. This is new technology with functionality that’s a small fraction of what it used to cost. There’s going to be growing pains.
Thank you Alex and Tim. Keep at it
Honestly, no one can guarantee that, just as no one can guarantee you won’t lose electricity. Let’s hope that things will be as @alex promised. WE have nothing to lose but a few more months of frustrating experience; HE can lose his company if he cannot live up to his own promises.
Ben, please tell me you are not trying to point the finger at 3rd party devs. Your post almost makes it sound like you are trying to pass along blame when people are using the platform for the exact purpose it was intended for.
I think we already went down this route this week and it didn’t turn out so well.