CoRE SmartApp Changes - 1/5/17

My ST CoRE Pistons were running super slow yesterday, from around 6:30 to 9ish PM Eastern. They seem to be working fine today.

I started noticing issues after I installed 5 different instances of CoRE, after reading that this would speed the UI up. So far, I’ve been building and re-building Pistons left and right, and what used to work is now failing occasionally.

I should have left well enough alone…

We’re working on getting things back to low latency. Sorry guys.

Thanks. I was curious because other than slow, things are NOT failing for me.

EDIT: So it does seem to be hitting CoRE more than anything else.

Bravo guys, thanks for the heads up and proactive thinking - it’s very much appreciated!

Thank you!

Things are much better this morning. Still saw a couple timeouts over about a 10 minute window of watching logs. I would call that progress! Thanks!

Is there a post or status page we can follow to get updates on these issues? The ST platform status site seems to not have this on it.

A decision (not necessarily saying it was the right/wrong one) was made to not create an incident for this issue. While we don’t have hard criteria defined for what constitutes a status update on the status page, it generally comes down to:

  1. Severity of the problem (is a component down/are all queries failing/are some queries failing/are all requests slow/are some requests slow/etc…)
  2. Reach (how many users, devices, smartapps are affected?)
  3. Impact (How disruptive is the issue to the user?)

In this case the severity & reach of the issue were limited - a single table in a single database and a small subset of Users/SmartApps were/are affected - but for the few of you that were affected it was very disruptive. First of all… I would like to apologize - I should have responded to your and the others’ comments a couple of days ago instead of just leaving you hanging, wondering when your system would start to function again. Sorry - I will try not to let that happen in the future.

Moving forward - we have put in 2 changes to help mitigate the issue: a big one yesterday at 16:30 CDT and another minimal one at 16:00 CDT today. My expectation at this point is that your system is performing much better than it was earlier this week, but you will still probably see some timeouts during executions. To actually solve this problem and get to a state where we don’t see any timeouts, we need to rebuild the database that is affected (then do additional forensics work on it). To do this in a way that prevents data loss and downtime, we are working through a process that should be finished by Monday night or Tuesday morning. I’ll be sure to provide an update when this work is complete or if we run into any snags.

I posted this in the CoRE release candidate thread, but might it be caused by the update?

Blank dashboard in CoRE.

Latest build.
Tried experimental, and Classic dashboard.
Enabled logging.
Tried expert mode and without.
Deleted all pistons and created a test one, “basic”.
Deleted CoRE from ST, and created it again.
Did numerous OAuth resets.
Tried to log out and in again in the ST app.
The FAQ says I should be able to find some clue pointing to a “corrupt” piston, but the log gives no errors.
Tried on Android and Apple, tried opening the dashboard in a browser on a PC.

It is blank no matter what I do, and I can’t find the error.
I am all out of options.

Copy the dashboard URL and open it in Chrome, keep the developer console open (F12 on Windows) and check the Console tab for any errors. Also check the live logs for errors…

Thx, found errors in the console.
Don’t know what they are, though.
No errors in the live log.

Vlad,

Where are you at with the fixes? Routines, even simple ones, are failing or only partially executing. This has been going on for almost 2 weeks. I’m starting to take a lot of heat from the family again because it’s starting to feel like Lowe’s Iris all over again.

Tonight the family walked in and had the alarm go off and lights flash, despite getting a push message that the I’m Back routine had just run. People tend to really dislike being scared by the very system that’s there to protect them. I find myself once again in a defensive position with SmartThings and my family, and that’s no longer acceptable.

If this is just a toy or hobby system, then the marketing team needs to update the website.

EDIT: The java.util.concurrent.TimeoutException errors are back and worse than ever on my system.

I am having the same issues again. It had been fine for the last 2 days and now again no lights work!!! Fix the system or refund me for the thousands I have invested.

As much as I was happy about the proactive communications and steps that were taken…

When I got back yesterday from out of town I found several CoRE pistons that are not working as they were before whatever caused this. I had already rebuilt the pistons using the “Rebuild All Pistons” option after the upgrade/changes were made by ST.

I have 32 pistons, many with several functions - probably in excess of 100 automations.

So for the ones I have noticed to be broken, I am going in, checking, and modifying something. Sometimes I notice that although the overview screen for that piston lists the operand properly, when I go into the individual evaluation it does not have the operand defined. I will define it, modify something else, put it back, and save it.

I haven’t tested yet if this will work/fix the issues.

So here is where this leaves me…

  1. Defining 100+ Automations, and appropriate testing scenarios so I can then go physically test each one.
  2. Fixing each broken automation.

I don’t know how to do 2 yet… because I don’t know what’s broken or how to fix it.

Really freaking frustrating to have to go through and do this yet again, ST. I’ve done this several times and there is no acceptable excuse for me to have to do this again. I have other sh*t to do.

Once again, your automation is only effectively automating your customers.

The fix has just been finished - you shouldn’t see any more timeouts.

Things are running faster now.

Things improved here as well. We’ll see how the scheduled automations do.

Thanks, I don’t see anything in the logs now. Will this fix the malfunctioning routines too?

Looks like things are much better now.

Well, tonight I can’t control most of my lights. I have a virtual switch that won’t work either.

Refreshed everything in the IDE, rebooted the hub… So frustrated.

Submitted a ticket to support.

Anyone else?
Rick

I’ve had a REALLY strange night here.

I haven’t noticed any major issues.