Smartapps stopped working last night?


(Jeff L) #1

I have several smartapps that I use for home security. Last night, sadly at 2:30am, I got up and went downstairs. I have a smartapp that disables my alarm if I trigger a certain motion sensor upstairs as the first “event”. The sensor triggered, and the event registered. But here’s what happened:

  1. My Mode Change smartapp was unable to change my system mode to disarm the alarm (or it ignored the motion, but I think the former). So my siren went off. Ugh.

  2. I manually shut down the siren from the android app - worked fine. Manually changed the system mode to “disarmed” - that worked as well.

  3. 15 minutes later - my security smartapp announces proudly that the alarm is “armed”. . .??? And the siren goes off again. So that Smartapp isn’t aware of the current system mode either at this point. I pulled the batteries from the siren at that point, my family is in an uproar. UGH.

  4. The alarm kept triggering (no siren) for the next hour while I poked around.

My Conclusion - My Smartapps were totally ignorant of my system mode last night between 2:00 and around 3:30. Even when I changed the mode manually my Alarm Smartapp didn’t notice. And my mode changer didn’t change the mode based on the motion when I first went downstairs. The dashboard worked. . .my Smartapps were clearly aware of events, e.g. triggering the alarm based on unexpected motion based on the mode it thought the system was in (even though it wasn’t).

I reloaded all of the apps, and they seem to work this morning - but my confidence is shaken. I travel a lot and if my wife had this happen while I was on the road it would have been a disaster. Has anyone else seen anything like this?

-Jeff


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #2

Hi Jeff,

Experiences like yours have us all worried: Even though the system wide “super big” outages have been rare, I do get enough noticeable slowdowns and misfires from time to time and read awful horror stories and that has kept me from installing a siren, specifically (among other “risky” items). I would certainly rip it off the wall and throw it in the freezer!

But here’s the thing: How can this be diagnosed so we can possibly understand the root causes as a Community? I think that understanding the causes is essential to trusting the solutions and fixes that SmartThings has deployed or plans to deploy.

In this case, the SmartThings status page shows no system incidents over last day / night: http://status.smartthings.com/

So either the status page has insufficient granularity to capture the problems that affected your environment (e.g., a problem in some section of the ST database or cloud servers or event handling), or the problem was actually isolated to just your environment – your entire set of SmartApps (since they all fire events that could conflict with each other as they modify mode or device states), your hub, your internet connection, your Z-Wave network, …

  1. Does “Tech Support” have further diagnostics for you (support@smartthings.com)?

  2. Have you checked the Event Logs (in the IDE web) for your hub and the affected devices? https://graph.api.smartthings.com/hub

  3. Have you tried keeping a browser window logged in and open to the Live Logging page? https://graph.api.smartthings.com/ide/logs

I think that currently the outputs of #2 and #3 are the most detailed diagnostics we can get from the user perspective … and I think that both streams are important as they don’t fully overlap. #2 seems to report all the normal events, and #3 will include all the “log.*” debugging, trace, and information messages that are enabled in the individual Device Handlers and SmartApps, and are filterable by the Device instance and SmartApp instance.


In other words, while it shouldn’t be a consumer level responsibility to run our own diagnostics, and I presume that ST Tech Support can also get at a lot of this information remotely when you open a support request, it could be very helpful for us power-users and developers to band together and do some live tracing, even at the most inopportune times like 2:30am…

That gives us the greatest range of diagnostic and solution opportunities:

  • Seeing if the problem might be due to some edge-condition bugs in the various in-house and community SmartDevice Types and SmartApps.

  • Narrowing down what features or methods have frequent inconsistent behavior: latency, misfires, sequencing, … ? Maybe the behavior is consistent with the specifications, but the documentation or example uses haven’t been accurate?

  • Learning the best ways (and requesting better ways…) to view the interaction of our concurrently installed and executing SmartApps and Device Handlers.

  • Identifying time periods where several people here experienced the same issues at the same time (i.e., “peak period” congestion, or system-wide but short-lived “hiccups”).

  • Determining what sort of problems are most likely due to individuals’ internet connections (which will improve radically upon installation of Hub V2), vs. those that are due to specific parts of ST’s platform (cloud capacity, database design, furtive bugs, …). In the latter case, the Community’s shared understanding of the issues gives us a unified voice in asking for the details of SmartThings’s resolution plans.

  • etc.


We seem to be living in a “Perpetual Beta World” (ummm… is gmail out of Beta yet?); partially because of the use of Continuous Integration Improvement. That comes with benefits and risks – and possibly responsibility … end-users have somehow been made responsible for knowing that every time they see a problem, they perhaps should be updating their apps, and every time they update or customize something (e.g., 3rd party SmartApps), something could break.

Still – I think despite the abundance of problem and complaint postings here, there is a tremendous amount of confidence and optimism in SmartThings too. My own confidence is proportionate to my understanding of how things are working (or not working), down to as much detail as possible, and I thank folks like you for sharing the details of your experiences.

Wishing you a good night’s sleep in a secure home… And welcome your thoughts…
… CP / Terry.


(Joe) #3

I had problems last night also. I have a motion sensor in my daughter’s room to turn on the hue bulb outside her room when motion is detected. Last night I was working about 2AM central time and heard her start crying because she woke up and the hallway was completely dark. I went in there and tried to get my mass to set off the motion sensor, but still nothing. I rebooted my hub and everything began working. The problem is that these things happen in the middle of the night, so who wants to take time to troubleshoot this? This problem has been becoming more and more frequent, so my family is becoming frustrated with the system. I hope they can get this under control soon.


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #4

I totally agree… but if the SmartThings status page http://status.smartthings.com, doesn’t show an outage, that indicates to me that SmartThings is unaware of an issue (or the issue falls outside the granularity of that data, or, ack, they are hiding outages and the page is disingenuous :speak_no_evil:).

So if “we” don’t report these things to support@smartthings.com, at least the next morning (?), then there’s no way to tell if they are isolated incidents which could be fixed with minor tweaks, or if SmartThings is encountering some new condition that they are able to add to their monitoring / diagnostics system.

These could just be things that are “difficult” to remotely monitor without a user initiated request.

@Tyler (or designate): Can you tell us anything about what SmartThings is able to monitor and detect for the types of incidents last night that were experienced by Jeff and Joe, above?


(Jeff L) #5

The only thing I can think of is that I did muck around with some apps yesterday, but not the ones that were causing problems. I was testing / publishing apps to help another user with lock control. I’ve had instances where things were mixed up after a lot of mucking around with apps. I should have tried rebooting it.

I did look at the logs and all events were registering. Mode changes were not, but I’m not clear if those show up normally.

My siren may be off for a while while I regain confidence.

-Jeff


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #6

May I ask what alarm SmartApp(s) you’re running?

There’s a recent new thread that mentions mode changes don’t sequence in as expected by some apps (i.e., timing issues), but I thought I read that some workarounds looked promising.

Inconsistent behavior is the worst, but I’m just thinking there is a multi-tasking / timing issue that maybe the alarm app is too optimistic about… And has just been lucky?


(Joe) #7

I opened a ticket after I got service restored. So they should have gotten something sometime early this morning. The most frustrating part is that I sold my wife on smartThings, and when it continues to have problems, she doesn’t want me spending more money on it. When I report these problems the majority of the time, I’m told to reboot my hub.


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #8

Do you think rebooting helps; or is it just their default answer for a first try?


(Jeff L) #9

I’m using Statusbits SmartAlarm. That’s the most robust alarm system I’ve seen after much poking around. The code is here, although I’m running 2.2.1 vs. the latest 2.2.5 - https://github.com/statusbits/smartalarm. At first I suspected a timing issue as well, and I have seen that in the past. Timing is important for how I disarm my alarm. Someone walks down the stairs from upstairs hitting the motion sensor at the top of the stairs, and my mode changer smartapp MUST trigger before the alarm smartapp. If an alarm is active, the mode changer will ignore any triggers (preventing someone from shutting off the alarm by simply walking upstairs). It’s been quite stable, but my wife told me there was an alarm event last week where my son set off the alarm. She thought that because he’s small he may not have triggered the motion sensor, but I doubt that very much. I was extremely careful with the placement. Anyways, bottom line, timing is critical.
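To make that ordering dependency concrete, here is a minimal sketch in plain Python. It is not SmartThings Groovy and not SmartAlarm's actual code; the `Home` class, both handler functions, and the sensor names are hypothetical stand-ins, and I'm assuming both apps receive the same motion events one after the other.

```python
from dataclasses import dataclass, field

ARMED = "Armed Stay"
DISARMED = "Alarm Disarmed"


@dataclass
class Home:
    """Hypothetical stand-in for the shared state both apps read and write."""
    mode: str = ARMED
    alarm_active: bool = False
    log: list = field(default_factory=list)


def mode_changer(home, sensor):
    """Disarm on upstairs motion, but ignore triggers once an alarm is active."""
    if sensor == "upstairs" and not home.alarm_active:
        home.mode = DISARMED
        home.log.append("mode changer: disarmed")


def alarm_app(home, sensor):
    """Raise the alarm on any motion while the mode is still armed."""
    if home.mode == ARMED:
        home.alarm_active = True
        home.log.append(f"alarm app: siren on ({sensor} motion)")


def walk_downstairs(handler_order):
    """Deliver upstairs motion, then downstairs motion, to the handlers in order."""
    home = Home()
    for sensor in ("upstairs", "downstairs"):
        for handler in handler_order:
            handler(home, sensor)
    return home


good = walk_downstairs([mode_changer, alarm_app])
bad = walk_downstairs([alarm_app, mode_changer])
print(good.mode, good.alarm_active)  # Alarm Disarmed False
print(bad.mode, bad.alarm_active)    # Armed Stay True
```

If the alarm handler happens to run first, the siren trips and the mode changer then refuses to disarm, which is the kind of failure a timing slip could produce. It would not explain a manual mode change going unnoticed, though.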

For yesterday though - it was more than timing. I was manually switching the mode from “Armed Stay” to “Alarm Disarmed”. My dashboard showed it was working, but none of my apps seemed aware, including SmartAlarm.

I have to say, I do understand the challenge for Smartthings here. They have a complex infrastructure with folks like me (us) writing custom code to do things. . . .so many possible errors/issues that might have nothing to do with their infrastructure at all. But in this case, I really do think something went wrong on their end. Or my hub just got confused about things. . .but I consider the hub part of their infrastructure. I can’t write code or customize that.

-Jeff


(ActionTiles.com co-founder Terry @ActionTiles; GitHub: @cosmicpuppy) #10

So do you think that you are encountering the same bug that is referred to by @geko in the latter few dozen posts here in this Topic? …


BTW:

As far as I know, the current Smart Hub doesn’t “run” very much at all … it is pretty much just responsible for being the bridge between your local ZigBee, Z-Wave, and LAN (WiFi) devices and the SmartCloud. So if there is an out-of-sync device status issue, scheduling issue, etc., that logic problem is in the SmartCloud. (Though this doesn’t help explain why hub reboots are found to remedy some problems…). :confused:

…CP / Terry.


(Geko) #11

FYI, I urge you to upgrade to 2.2.5. SmartThings made significant changes in their infrastructure last October/November which broke some apps, including Smart Alarm. Version 2.2.5 specifically addressed those issues.


(Jeff L) #12

Thanks @geko, I’ll upgrade per your suggestion. Ironically it’s been working perfectly up until the other night. . . .upgrading now!

-Jeff


(Michael Langkilde) #13

I, too, have had issues with ST. ST supposedly fixed the issues I was having. Still, with a few devices, it is not refreshing the way it used to when I physically turn the switch on/off. When I hit the refresh button in ST for that device, it then changes state and the smartapps work. So I think it is more a refresh issue than a smartapp issue. I am even using Pollster, and that does not seem to be working properly to help. It was working, but for the past couple of days it has not.


(Joe) #14

“Do you think rebooting helps; or is it just their default answer for a first try?”

I don’t know. I have noticed that the place I have my router seems to be warmer recently. I’ve read my router runs pretty warm, so I opened the doors to the closet I have it in. We’ve started to experience more connectivity issues than just smartThings, so I am hoping it’s just an airflow problem.


(Ron S) #15

Yes… Please make sure to open a ticket.


(Joe) #16

Hopefully my issues will be resolved today. This past weekend we had an intermittent service outage. After multiple calls to Time Warner Cable and loud voices (they frustrate me so much), they said they see something wrong with my “down stream”. So they come out today at 3:00 PM ish to hopefully resolve my problem. I hope this was why smartThings was acting up.


(Radiogavin) #17

I had hungry cats this morning.

For the record: the basement outlet triggers so that my cats get fed at midnight, 8AM, and 4PM. I see that the outlet turned on at midnight (EST), but never shut off at 12:15AM. I see another signal was sent to shut the feeder off at 8:15. This means the feeder never shut off after the 12AM feeding, so it couldn’t start properly at 8AM.

Bottom line: something went down last night. This is a system-wide problem, clearly.
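For anyone who wants to spot this kind of gap without reading the event list by eye, here is a rough Python sketch that flags an "on" with no matching "off" in time. The event list is hand-typed from the description above, not an export from the IDE; the calendar date is a placeholder, and the two-minute grace period and the function name are my own assumptions.

```python
from datetime import datetime, timedelta

RUN_TIME = timedelta(minutes=15)  # feeder should run for 15 minutes
GRACE = timedelta(minutes=2)      # assumed tolerance for a late "off"


def missed_shutoffs(events):
    """Return the 'on' timestamps with no 'off' within RUN_TIME + GRACE.

    `events` is a list of (timestamp, state) tuples sorted by time.
    """
    missed = []
    for i, (ts, state) in enumerate(events):
        if state != "on":
            continue
        deadline = ts + RUN_TIME + GRACE
        # First "off" recorded after this "on", if there is one.
        next_off = next((t for t, s in events[i + 1:] if s == "off"), None)
        if next_off is None or next_off > deadline:
            missed.append(ts)
    return missed


# Placeholder date; only the times of day come from the post.
events = [
    (datetime(2000, 1, 1, 0, 0), "on"),    # midnight feeding started
    (datetime(2000, 1, 1, 8, 15), "off"),  # first "off" that actually arrived
]

for ts in missed_shutoffs(events):
    print("no timely off after", ts.strftime("%H:%M"))
```

With those two events it reports the midnight "on" as never shut off in time.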


(Allan Cardinal) #18

My hub and app haven’t worked properly for over a month now, and no tech support. I keep on hearing “we pushed an update” but it gets worse. Should have gone with a working model. I wouldn’t rely on it.


(Ron S) #19

Did you reboot your hub? At times it is the best thing to do…


(Allan Cardinal) #20

Rebooted 4 times, still no go. I say bad hub but ST says they don’t think so. Then I must have been misled about what it’s supposed to do.