IDE Live Logger Broken

@Ben @tslagle13 @April - FY attention

@tgauchat @garyd9 - FYR if you’re facing similar issue please chime in

so apart from the general IDE Live logging being slow there is a more serious issue, the message are arriving out of order.

I’ve been trying to debug a device and was always wondering why events are not happening the sequence the code had initiated. Now I see that the timestamps are out of order, this just happened so since it was a slow communication but faster communications you don’t even know it because of the resolution of the time stamps is 1 second.

Pretty serious issue as ordering communications is the most basic requirement/need and without a reliable debugger it’s pretty much useless to try and understand why things are happening out of order.

See the timestamps below.

8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:03 PM: trace Push mode, checking for an status update in 15 seconds
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:01 PM: debug SMS: null, Push: true
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:01 PM: debug Enabling cruise Horizontal Cruise from CruiseMap1
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:01 PM: trace Cruise names → [Horizontal Cruise, Vertical Cruise]
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:01 PM: trace Getting cruise names from camera [Living Room Camera, Foscam 9810 Test]
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:00 PM: debug Taking a picture with camera Foscam 9810 Test
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:00 PM: debug Turning on lights [Breakfast Light]
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:00 PM: debug Turning on alarms null, silent: false
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:00 PM: info Motion detected in camera Foscam 9810 Test
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:02 PM: warn Received motion event from a camera Foscam 9810 Test, no cameras are being actively monitored
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:02 PM: trace stopMonitor called, stopping status updates for camera Foscam 9810 Test
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:01 PM: debug Registering Camera Motion Callback URL → http://tinyurl.com/otjrqxk
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:01 PM: trace Long URL: http://graph.api.smartthings.com/api/token/cf7eebaa-6122-40d9-9f1e-7e7ecae84cc1/smartapps/installations/8f2163b8-f600-4e7e-b0d4-49add68ada3b/Motion/79c8ac5d-045a-4b29-a744-15821d01f71c
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:01 PM: trace Short URL: http://tinyurl.com/otjrqxk
8f2163b8-f600-4e7e-b0d4-49add68ada3b 1:21:00 PM: trace Calling monitorTask

2 Likes

Should I even report this to ST Support?

Of course you should.

1 Like

Is this constant or off and on?

Constant, across devices and smartapps. The problem is bigger than it may seem, these are just the ones I can catch because they are on the second(s) boundary. The resolution isn’t small enough to see exactly HOW many of them are out of order and I’m guessing most of them would be if the IDE started logging at the ms level.

Here’s a device example:

79c8ac5d-045a-4b29-a744-15821d01f71c 2:30:45 PM: info Polled: Motion Alarm Callback Notification Enabled with URL http://tinyurl.com/otjrqxk
79c8ac5d-045a-4b29-a744-15821d01f71c 2:30:44 PM: trace Status: Alarm, Armed: Off
79c8ac5d-045a-4b29-a744-15821d01f71c 2:30:44 PM: info Motion Alarm Off
79c8ac5d-045a-4b29-a744-15821d01f71c 2:30:44 PM: trace Parse called
79c8ac5d-045a-4b29-a744-15821d01f71c 2:30:45 PM: trace Sending command → http://192.168.1.104:80/XXXX

Hi all,

I’m an Engineer on the Developer Tools team. I can empathize on the frustration of this issue. It has bothered me in the past as well. Our IDE logging is built on top of a distributed messaging system which does not guarantee the order of the messages. So from a system standpoint this is working as expected. However, we might be able to improve the user experience by injecting the logging messages in a chronological order in the web browser. It still might not be a perfect solution, as some messages have the same timestamp.

In response to this thread I have created a feature request ticket to investigate the previously mentioned solution. As with any feature request, it will be prioritized against all the other cool tools we are working hard to bring to the community to improve our developer experience.

As always, we appreciate the bug reports - so thank you for bringing this to our attention.

3 Likes

Thanks — correcting this at the browser level would be fabulous.

1 Like

And if you can add the Millisecond timestamp also it would be great!

1 Like

Ha! I’ve been adding ms timestamp but it takes several ms to do it :wink:

Second this.

2 Likes

@jeffblaisdell let me know once this has been done

Also I’m seeing issues with the IDE today, not sure if something changed
Sometimes when trying to save Smartapp code it gives me an error saying null pointed exception, retrying again a few times and it goes away.
When trying to open a SmartDevice code it says oh not something went wrong etc… try again and it goes away

Thanks for the info RBoy. I’m seeing same issue under my account.
We had a deployment earlier today, and are currently investigating.

I’ll ping back here when I know more.

One of the boxes in a cluster appeared to be non responsive after the deploy. We restarted it, and the errors seem to have stopped.

Are you still seeing the issue @RBoy?

Nope it’s okay now just that the IDE is VERY slow to save.

@jeffblaisdell FYI the events are still showing up out of order. I thought this issue was fixed:

403daa9b-0761-4c8f-ae75-21bada2dbb9a 10:09:05 PM: trace Checking expired updates for code 4 on Basement Door Lock
403daa9b-0761-4c8f-ae75-21bada2dbb9a 10:09:05 PM: trace Checking expired updates for code 2 on Basement Door Lock
403daa9b-0761-4c8f-ae75-21bada2dbb9a 10:09:05 PM: trace Checking expired updates for code 6 on Basement Door Lock

It’s a simple loop checking, it this should be going 1…6 but I see 1,3,5,6,2,4

Can you add a 2 second delay to the loop. I am guessing the resolution is not granular enough to order items that execute within 1 second. To test this add at least a second delay to the loop. I would bet your events will then return in order. The events are being executed in order but ST needs to be more granular with the time stamps used for sorting.

The ordering issue is not something we’ve gotten around to updating. It is an improvement that we will add, but it is a matter of priority / timing at the moment.

The execution is happening in the proper order. However the events describing the execution, flow through a distributed messaging system which does not guarantee order. Essentially it is a performance tradeoff to be able to handle the heavy load of all the events flowing into the system from every connected device.

To @jody.albritton 's point, the timestamp precision would also need to increase in order to get more accurate sorting.

1 Like

@jeffblaisdell FYI Live Logging isn’t working, as in it’s complete dead no messages from anything whatsoever.
Any idea?

It’s back up now…

@jeffblaisdell

The live logging is broken again, it’s missing debug events. The same issue posted here:

the app runs I KNOW it’s running because things are happening but in between some debug messages are going missing. I see an event happen because it’s in a simple loop from 1 to 10 but suddenly in between event 4 happens but the debug message saying event 4 happen doesn’t print. 1 to 3 and 5 to 10 print, 4 happens but doesn’t print and it’s random! Sometimes all 10 will print sometimes a few will disappear in between.

Something is wrong here.

2 Likes

It’s pretty bad - I’m losing messages randomly - wondering if one of the servers in the cluster is down or out of sync