Cloud-Centric Architecture. What's the Problem?

geko · April 14, 2016, 5:05pm

Continuing the discussion from Uproar? Really?:

I have no doubt that Ben and everyone else at ST genuinely strive to achieve platform stability and make SmartThings a successful product. However, having been an active user for two years, I’ve heard that many times. There have been three platform “meltdowns” in two years, and every time the prescribed treatment is the same: add more server resources, database clusters, etc. Sure enough, six months later the same problems reappear and the cycle repeats itself.

Which, of course, raises the question of whether the current cloud-centric architecture is ever going to work. This question has been raised more than once and the answer has always been “Yes, we’re sure it’s going to work, just wait and see.” So, we’ve waited, sometimes patiently, sometimes not. But I think that two years is long enough to make a conclusive determination about the viability of any particular technology.

It seems clear to me that the current implementation of the cloud platform has failed. Now, for the record, I’m not “anti-cloud”. I’ve been working in the IoT field for many years and know first-hand both the advantages and the pitfalls of the cloud. But there are different ways to use cloud technology. The social network cloud model is not the same as the device cloud model, and therefore cloud technologies that work well for Facebook do not necessarily work equally well for SmartThings.

In the social network, users constantly interact with each other, while in the device cloud each user is isolated from everyone else. I don’t want my events to trigger actions in other users’ accounts and vice versa. Ever. Therefore, there’s no good reason to store all events from all users in one (or several) massive database(s), which will inevitably run into scalability issues. Each user should have their own database, scheduler, event engine, etc., running in a virtual box, completely isolated from other users. This would ensure that whatever problem occurs within one user’s account stays isolated and does not affect other users. In the worst case, the virtual box can be restarted and no one else will ever notice. It would also allow for better security and better performance.
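To make the isolation idea concrete, here is a rough sketch, not anything SmartThings actually runs. The `UserSandbox` class and its methods are hypothetical names, and an in-memory SQLite database stands in for the per-user “virtual box”. A real box would also hold the scheduler and event engine and would persist its data across restarts.

```python
import sqlite3


class UserSandbox:
    """Hypothetical per-user 'virtual box' holding only that user's events."""

    def __init__(self, user_id):
        self.user_id = user_id
        self._open()

    def _open(self):
        # Each connection to ":memory:" is a separate, private database.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE events (device TEXT, value TEXT)")

    def record(self, device, value):
        self.db.execute("INSERT INTO events VALUES (?, ?)", (device, value))

    def event_count(self):
        return self.db.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    def restart(self):
        # Throw the broken box away and start a fresh one.
        # (This toy version loses its data; a real one would reload it.)
        self.db.close()
        self._open()


alice = UserSandbox("alice")
bob = UserSandbox("bob")
alice.record("front-door", "open")
bob.record("hall-light", "on")

# Simulate a fault inside Alice's box only.
alice.db.execute("DROP TABLE events")
try:
    alice.record("front-door", "closed")
    alice_failed = False
except sqlite3.OperationalError:
    alice_failed = True

alice.restart()  # Bob's box is never touched by any of this.
```

The point of the sketch is only that the fault and the restart are confined to one user's store; it says nothing about what running hundreds of thousands of such boxes would cost.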

One of the chief start-up mantras is “Fail Fast”. It’s OK to fail if you quickly realize that you’re on the wrong path and can pivot quickly. Unfortunately, it appears to me that SmartThings technical leadership has been stuck for way too long on the current cloud-centric architecture, which keeps failing.

Just my opinion, FWIW.

tgauchat · April 14, 2016, 6:51pm

Great way to revisit this topic. (That’s not sarcasm…).

I’ve got a “DevOps” background (and actually make personal use of virtual machines for various projects and purposes…), so I definitely understand the concept of what you’re recommending…

But I have no experience in situations where a company has to run and maintain ~300,000 such virtual boxes (VM’s = Virtual Machines), all kept online (i.e., in memory, not cached off to disk) concurrently 24x7.

I’m asking … well, what do you think the disadvantages are to this approach?

My few guesses, and you or folks can suggest others, including @hagins (CTO) and other engineers that @slagle and @jody.albritton may be able to bring into the conversation…

The resource cost in the cloud is likely tremendous. This is definitely just a guess, because I don’t know how cloud pricing works for VMs, or Docker containers, or whatever mechanism would be used. It just seems to me that having a lot of mostly idle VMs out there holding on to RAM would be a few orders of magnitude more costly than the shared memory / database concept currently in use. Even if most of the VMs were not idle, completely isolated VMs lose the benefit of shared executable images of published SmartApps and Device Type Handlers. We all run over the same small set of copies now, instead of each needing its own copy.
The maintenance cost (i.e., distribution of platform updates, database optimization processes) is also likely tremendous. In the current SmartThings cloud architecture, server “clusters” are used, but they all share identical images of the platform software … maybe some dedicated to different duties (different parts of the database vs. messaging?). While automated, of course, updating ~300,000 VMs is still, even in this day and age, much harder and less efficient than updating a homogeneous cluster… I think?
SmartThings is already moving towards “a separate box for each user” … well, mostly. That box being the Hub V2 (and beyond) … i.e., local processing on each user’s very own distinct “box”. Not a virtual box, but a real box. Sure… we know that’s a ways off, unfortunately, and a lot of processing will always be in the cloud; but perhaps if everything that “can be local is made local” … then we get all the benefit of the individual VMs you recommend, with all the complexity of the second bullet, but none of the cost of my first bullet. Not to mention the added benefit of lower network latency.

geko · April 14, 2016, 7:20pm

Home automation is a fairly lightweight application. It’s easy to estimate the load. Even if you have a hundred devices, you’re not likely to generate more than a few dozen events per second. Response time requirements are also quite moderate: tens to hundreds of milliseconds. A single-processor VM instance with 512 MB of RAM can easily handle this workload. The retail price of such a VM is $5 per month, which is not tremendous by any measure.
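For scale, here is the back-of-the-envelope arithmetic, using the $5 retail price above and the ~300,000 hub figure mentioned earlier in the thread. Both numbers are rough, and the function name is only for illustration; bulk pricing would presumably be lower than retail.

```python
VM_PRICE_PER_MONTH = 5.00  # retail price of a small VM, from the post above
HUB_COUNT = 300_000        # approximate figure quoted earlier in the thread


def fleet_cost_per_month(hubs, price_per_vm):
    """Monthly cost of one dedicated VM per hub, at a flat per-VM price."""
    return hubs * price_per_vm


monthly = fleet_cost_per_month(HUB_COUNT, VM_PRICE_PER_MONTH)
print(f"${monthly:,.0f} per month")  # $1,500,000 per month
```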

The only resource-intensive application is video streaming, which would require a more powerful VM that could be offered for an additional service charge, as is already the case with SmartThings anyway.

SmartThings is already moving towards “a separate box for each user” … well, mostly. That box being the Hub V2 (and beyond) … i.e., local processing on each user’s very own distinct “box”. Not a virtual box, but a real box.

That strategy obviously didn’t work and they had to backpedal and scale back the expectations.

tgauchat · April 14, 2016, 7:33pm

Relative to $0.00 income per hub, even $2-$3 per month per hub would be too high an expense, I think.

SmartThings is already revenue negative (until they have a lot more scaling), given pretty low gross margins, then add in the cost of Customer Support (currently not scaling) and all the other overhead.

So, I’d conclude that cost of hosting is a major consideration, unless SmartThings adds directly offsetting ongoing user fees (i.e., a $5/month fee … like PEQ’s $10/month fee; and the monthly fees that most competitors have).

geko · April 14, 2016, 7:41pm

Obviously, I don’t know what they pay for hosting currently, but it’s probably in the range of hundreds of thousands of dollars per month, so the cost wouldn’t be that much different. And if they cannot recover their expenses one way or another, then their business model is wrong and they’re going to fail sooner or later anyway.

Dianoga · April 14, 2016, 8:05pm

The cost would be dramatically different. I’m not sure what exactly our hosting costs are currently, but it is definitely less than $2/month/user.

geko · April 14, 2016, 8:11pm

Thank you for the insight. However, I do believe that cost is not the major obstacle here. Many ST users have said that they’d gladly pay a reasonable monthly fee if the service were reliable. Considering that an average user has 12 devices, as we now know, it wouldn’t be unreasonable to offer a free service for up to 10 devices (to get you started) and charge $5 per month for every 100 devices after that. Just an idea.
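Spelled out, the tiered fee could look like the sketch below. It assumes one particular reading of the proposal: devices beyond the free allowance are billed in whole blocks of 100, with any partial block rounded up. The function and constant names are made up for illustration, and a prorated per-device scheme would give different numbers.

```python
FREE_DEVICES = 10      # free entry-level allowance
BLOCK_SIZE = 100       # devices covered by each paid block
FEE_PER_BLOCK = 5      # dollars per month per block


def monthly_fee(device_count):
    """Monthly fee in dollars, billing started blocks beyond the free tier."""
    extra = max(0, device_count - FREE_DEVICES)
    blocks = (extra + BLOCK_SIZE - 1) // BLOCK_SIZE  # round up to whole blocks
    return blocks * FEE_PER_BLOCK


for n in (10, 12, 110, 111):
    print(n, "devices ->", monthly_fee(n), "dollars/month")
```

Under this reading, a user with the average 12 devices already lands in the first paid block, so the free threshold would have to sit a bit above the average for typical users to stay free.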

tgauchat · April 14, 2016, 8:37pm

Perhaps engineering and/or operations might support that idea; but now go and try to convince the marketing and product management departments. Though even convincing the former to give up on the notion of minimizing the cost of scalability (i.e., by sharing resources as much as possible but not beyond the breaking point…) would be a challenge. This is not something that can come from the bottom up.

TL;DR: – A platform built using a dedicated cloud VM per Hub would cost much more than even a hypothetically optimal scaled shared architecture. SmartThings is opposed to such a significant difference in hosting costs, since even minimal ongoing revenue streams have yet to materialize (especially if there are influential architects who don’t believe that such expensive hosting options are “necessary”).

blebson · April 14, 2016, 8:39pm

I would say that this type of subscription based service for standard features (i.e. just using your devices normally) is a HUGE turn-off to a lot of general consumers when you have other options on the market without that barrier to entry. It would push the hub into a separate tier of HA solutions which don’t nearly have the popularity of the one-time-cost hubs and don’t have the reliability or power of the professional hubs.

geko · April 14, 2016, 9:26pm

Obviously, everyone has different needs and requirements. I don’t like paying monthly fees any more than the next guy, but I have no problem paying $8 for Netflix because it has value to me and is reasonably priced. And apparently neither do over 40 million Netflix subscribers in the US alone.

Speaking only for myself, I would be open to a subscription fee for a personal cloud service if it 1) were 99.9% reliable and 2) ensured the privacy and security of my data. I wouldn’t pay the $40 that Xfinity is asking because it exceeds the perceived value of the service to me. But paying something like $5 a month per 100 devices sounds perfectly reasonable to me.

Based on my estimate, this fee would be more than sufficient to cover SmartThings operating expenses. Iris uses similar tiered pricing which seems to work for them, but the problem with Iris is that just like SmartThings, it’s a “shared cloud” and therefore suffers from the same scalability and reliability issues.

blebson · April 14, 2016, 9:36pm

There is still a large group of consumers who see a subscription service as a non-starter. Iris also seemingly isn’t doing nearly as well as SmartThings even though they have entire sections of physical stores devoted to them all over the US. The only online community I found in a quick search hasn’t had a new post in it for over 3 weeks…

geko · April 14, 2016, 9:47pm

That’s why it’s essential to offer a free entry-level service. Not feature-limited, like Iris does, but rather restricted by the number of devices.

The only online community I found in a quick search hasn’t had a new post in it for over 3 weeks…

Iris (along with Wink and Staples) does not need Community for two reasons:

It’s not a development platform.
It has a full-scale live telephone customer support: 1-855-469-IRIS (4747)

From the consumer (as opposed to developer) point of view, it’s a better proposition than seeking advice from a community.

blebson · April 14, 2016, 9:54pm

I can understand shying away from a core-feature-limited subscription, but in this case wouldn’t using your devices be a core feature? I would put things like Video Cloud storage, monitoring services, etc as being non-core subscriptions which ST is already implementing.

geko · April 14, 2016, 10:02pm

No, I don’t think so. An average SmartThings customer has only 12 devices, meaning that most people just buy a starter kit and maybe add just a few more devices. For these folks the service would essentially remain free and unrestricted in any way (unless they opt for video streaming). The rest of us, who have a lot of devices and would require more than a basic single-core VM to run them, would have to pay a monthly fee for a “premium” VM.

blebson · April 14, 2016, 10:04pm

If they had started the company with that methodology I could see it working, but at this point they have explicitly stated that they wouldn’t charge for previously released features, on top of the fact that there would likely be a large-scale revolt. I really don’t think SmartThings or Samsung want that type of exposure.

Sniper · April 14, 2016, 10:25pm

Personally, I hope ST builds something similar to Rule Machine or, if it comes back, maybe allows it an offline mode; I’m not even sure if that’s possible.

It feels like the cloud should be optional, or at least everything should fully work without it, so local processing should come first. I think current ST apps already work offline? I’m not sure the current hardware could handle vast amounts of offline apps etc., but maybe a later version can.

I think SmartThings could serve as a simple proxy for anyone wanting online access, they can still log events/collect stats e.g average device ownership, which devices/apps are popular, data mining etc.

Right now not sure I’d be willing to pay a monthly fee but if everything could work offline first sure I’d consider a small fee (probably based on device count) so I could communicate over the internet, just don’t like the idea of rules/functionality breaking because either ST has gone down or you no longer have internet access for some reason.

What you also have to consider is that every powered device adds to the electricity cost. Right now for me it only adds £1 per month having all the bulbs/devices on standby; hopefully this will improve with future devices that have lower standby costs! I could easily see that increasing to £5 a month or even more for some users.

geko · April 14, 2016, 11:05pm

This is exactly what I’m talking about. None of these “band-aids” would be necessary if the cloud portion of SmartThings ran on a personal VM rather than a shared cloud.

A Message from Alex on Platform Improvements and Our Plan Forward Announcements

First, I want to let you know that everyone at SmartThings is fully aware of the issues that have been affecting platform reliability. In the past few weeks, we’ve redoubled our efforts to make some fundamental improvements that will soon be felt by all of you in your everyday experience with SmartThings. Know that these improvements are only the beginning, that we are in this for the long term, and that we are absolutely committed to building the best, most open platform in the world. It’s just scratching the surface, but I’d like to take you through the technical details of some of the changes we’ve implemented recently: Smart Home Monitor We’ve made a few changes to Smart Home Monitor that should help to resolve issues with data consistency, load time, and Arming/Disarming. The fir…

JDRoberts · April 18, 2016, 6:12am

Except the market says otherwise.

There are over two million accounts on iControl services paying upwards of $30/month (ADT Pulse, Xfinity Home, Time Warner Intellihome, Rogers in Canada, and a few more).

And of course Lowe’s Iris is $10/month if you want their premium service.

You add those together and it’s way more people than SmartThings plus Wink plus Vera.

They are paying for convenience and reliability. They’re giving up flexibility and versatility and quite a few features.

There’s still room in the market for both business models. (iControl bought Piper just so they would have something to offer without a subscription.)

But as of now there are more people willing to pay a subscription, not fewer. They may be two completely separate groups, though; it’s hard to tell how much overlap there is.

blebson · April 18, 2016, 6:35am

I suppose I didn’t really lump those under “Home Automation” because they’re more add-on services for either Cable/Internet or Conventional Home Security. But you are correct, that business model is becoming very popular and is probably how most people are exposed to “Home Automation”.

JH1 · April 18, 2016, 1:24pm

I am not inclined towards product-as-a-service models.

Further, given the track record here, if there is going to be a re-architecture effort, there at least needs to be a “private cloud” option whereby you can take that off-hub processing in-house.

Some say I am a dreamer.
