Incident Report - SmartApp state - July 25, 2016

@slagle It’s all semantics, but the details are important!

This statement says SmartApp functions returned to normal, period.

This is not accurate.

Nowhere does the status or incident report state explicitly, or even imply, that individual user systems were left in a degraded state and that users would need to look for such issues (how do they do that exactly?) and resolve them on their own to restore full functionality.

Look, I am not trying to be a jerk - but this is blatantly obvious and if we’re owning it, we need to say it.

Perhaps there are different expectations for an incident report (post mortem). This is not supposed to be a “customer” alert. It is meant for the more technical readers, to explain what lies behind what they already know: we lost state.

Here’s the progression of the “story”.

We say we lost the states, which we did.

We’re sorry

We created a bug

The bug incorrectly compared values [quote=“slagle, post:1, topic:54460”]
When the integrity test compared the two queries, there was a possibility of the check failing.
[/quote]

When that happened, the state everywhere became corrupt [quote=“slagle, post:1, topic:54460”]
In instances where it failed, the states were merged incorrectly and corrupted the SmartApp state.
[/quote]

Bad states were written everywhere
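
To make the progression above concrete, here is a toy sketch of how a failed comparison followed by a bad merge can corrupt state. It is purely hypothetical: the incident report does not say how the integrity test or the merge were implemented, so every name here (`integrity_check`, `naive_merge`, `load_state`, the `mode` and `lastRun` keys) is an assumption made for illustration, not SmartThings code.

```python
# Hypothetical illustration only; none of these names come from SmartThings.

def integrity_check(primary: dict, replica: dict) -> bool:
    """Return True when both query results describe the same state."""
    return primary == replica


def naive_merge(primary: dict, replica: dict) -> dict:
    """A deliberately flawed merge: replica values always win,
    even when they are stale or empty."""
    merged = dict(primary)
    merged.update(replica)
    return merged


def load_state(primary: dict, replica: dict) -> dict:
    """Use the primary result when the check passes, otherwise merge."""
    if integrity_check(primary, replica):
        return primary
    return naive_merge(primary, replica)


# Two reads of the same (assumed) SmartApp state that disagree.
primary = {"mode": "Away", "lastRun": 1469400000}
replica = {"mode": "Home", "lastRun": None}

print(load_state(primary, replica))
```

In this toy case the check fails, the merge lets the stale read overwrite the good one, and the corrupted result is what would get written back.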


Again, this wasn’t meant to explain what happened, but how it happened. Two totally different goals.

How can one effectively explain how something happened to an audience that doesn’t know what happened?

They are not only absolutely related, they are inextricably linked.

Then there is this from a previous post:

Those statements don’t mesh.

Either way, neither the status nor the incident report clearly identifies what remained broken and what users needed to fix on their own. No matter what else is said about it, I don’t think that’s acceptable. Especially if the stated goal, as you said, was to take full responsibility for the incident.

So basically, we cannot recover states from a previous night’s backup?

@slagle Thanks for all the effort that goes into this. There is no report that will make every person happy, but this was a good effort, direct with its admission of lost state.

We can all play armchair architects, but after years of working on a large system with national usage, hiccups included, I see you guys going in the right direction. Discrete, partial data recovery from backups sounds neat and futuristic when you don’t know the architecture, but unfortunately the real world of a realtime system whose state continues to change makes that a bigger challenge than most appreciate. As @KevinH said, customer data loss is always considered unacceptable, so it’s rare to see this much shared. I applaud the report efforts and candor.

Incidents happen, and I see progress year over year. Not as magically fast as folks want, and we all see the road bumps here as power users. However, I can’t say I am displeased with a service that is provided for a $99 fee and supports a complex ecosystem from a multitude of vendors while simultaneously allowing any developer to have a go at blasting the infrastructure. I bought lots of items to be at 160+ devices, and not a ton ST branded, so I am getting large value for what I’ve directly paid SmartThings. While hiccups annoy me, I find myself mostly happy (once I got off my butt to V2 and most everything locally processed possible).

Keep up the good fight, Tim.

@slagle

I’m still wondering why a state restore rebuild is not being jumped on as a high priority right now.

The ground work has been done… What’s stopping you?

Imagine, you could boast this… “as we grow as a platform and a worldwide system there may be bumps in the road, but we have a system in place that can help to prevent lost data”

And what about the other several dozen (or more) “high-priority” items? :confused:

I’m not joking, just curious about how you think SmartThings prioritizes efforts…

Agree that everything needs to be prioritized and there is always competition in that regard.

I think what would be helpful is seeing those priorities executed and marked off. ST has no obligation and will likely not be as transparent as one may like, however, something in that regard is called for because of where confidence stands.

When an ecosystem is healthy, moving forward and there is confidence - say Apple’s iPhone - there isn’t a need to share. But when you’ve lost the confidence of your clientele, one way to rebuild is to become more transparent. I think that’s part of what Alex’s updates are supposed to be, for example…

Y’all just don’t give up on this request, eh? Sharing priorities means committing to completion within reasonable time frames.

Speaking for Y’all, no. But that doesn’t matter; what does is that the consumer market will never give up.

Consumer-facing roadmaps are an important part of any product ecosystem. They are critical in enterprise technology solutions, but they are also important in competitive consumer technology.

People and organizations need to make plans, and inclusion of your product in those plans and/or retention of your product may very well depend upon your ability to develop and communicate a roadmap, as well as the consumer’s confidence that you are capable of delivering the same. For example, perhaps a competitive solution has an edge with a feature the market desires, and without knowledge that your organization plans to develop and deliver this desired feature, a consumer may plan to replace your solution. However, lifecycle forklifts are not easy, and that consumer may choose not to plan the obsolescence of your solution if they know that feature or functionality is on your roadmap. Et cetera.

Mega Corps and small businesses alike are asked to deliver roadmaps and often their ultimate success is dependent upon being able to understand these dynamics and respond to them. So while I recognize it is a burden, it is not an unnecessary burden.

Roadmaps are nuanced. Short Term, Medium Term, Long Term for example. Short Term may be a set of features you are working on currently with known delivery time frames. These time frames may or may not be made known, depending on the audience and the drivers to do so. Medium Term may be next steps. Time frames are really rough here and may never be communicated externally, may also be severely restricted internally, and items are subject to change. Long Term items are pie in the sky; no time frames are even documented and the items are all but certain to change drastically.

Speaking for Mega Corp, we deliver roadmaps every day. We also issue written commitments. They can feel like traps. They can be difficult. Consumers can be demanding. This is how it is done. This is work. This is how a profitable company must operate to retain the confidence of the market. Try telling Wall Street you don’t have a roadmap.

Speaking for a Small Business with a non-technology product, one still has to commit to their consumers and the market some product delivery next steps. If not, consumers may go elsewhere, or this may signal to potential competitors that there is an opening and they may be tempted to deliver where you either cannot or it is perceived you may not.

TL;DR: Life is hard. Business is harder.

Customers WILL go elsewhere if the company keeps losing their data. Period.

Yup… I appreciate where you’re coming from. SmartTiles is planning on having its own open feedback forum (like UserEcho, or similar), and we’ll be inclined to indicate which bugs and ideas we agree to address and which we will deny or defer indefinitely. Customers, however, can be pretty unforgiving if anything that looks like a “promise” is not kept, and they also can be prematurely or excessively, personally hurt and disappointed if their idea is turned down even for good business or architecture reasons.

SmartTiles might abandon the open feedback concept if it is too hard to manage. And I know that SmartThings is still trying to figure out the appropriate balance – especially a balance that they can commit to maintaining consistently.

Sure … but it’s only happened twice in 6 months. Yes – that’s twice too often; but, compared to the frequency and impact of other issues, I can’t say I’d rate it as my #1 concern. I empathize with their need to figure out how to prioritize this and they can’t please everyone. No business does.

This sums it up quite well. We do care and we are striving to be better, but we’re still learning. We’re not perfect and don’t claim to be, but we’re striving towards that balance that works for us. We’re getting there. Just today, we’ve done two things we’ve never done in the name of transparency. I cannot express enough how much we appreciate every single one of you. You’re on this journey with us and we love you for that.

Hats off to you guys for putting up with our tantrums, like no other tech company I know. And for that we stick around. Especially you tslagle13. You may be @slagle now, but I can tell that your heart and soul stayed as @tslagle13, the guy who wants to help his fellow developers as much as he can.

Tim,
It’s frustrating to us. I believe you understand that completely. We are the customers… The nerdy, needy, ocd, demanding customers.

If it was as bad as we make it out to be, we would have left a long time ago.

I’ve been with you guys for a year… I love it!

You know I am reading all of this, right?

You’re co-dependent. Seek help.

:joy:

I disagree… Alexa says I’m just fine.

Ha, ha. I was actually speaking of myself, but if I had used “I”, then you would have called me a narcissist, so I used “we” instead.

If not here, on this forum, where else would one go to get his daily slice of drama?! Assuming The Young and The Restless does not cut it?! :wink: just kidding, I don’t know any soap opera with that name.
