SmartApp code gets a TimeoutException *only* when run part of a scheduled job

I have a SmartApp …

… whose initialize method …

def initialize() {
if (clearState) {
// state.clear // this doesn’t work. why? this does:
(state.keySet() as String).each {state.remove it}
}
switches .each {name, title → logState settings.get(name)}
switchLevels.each {name, title → logState settings.get(name)}
[‘sunrise’, ‘sunset’].each {
subscribe location , it , respondToSun}
if (On && Off) [‘On’, ‘Off’].each {name →
schedule settings.get(name) , ‘respondToSchedule’ + name}
contactSensors.each {name, title →
subscribe settings.get(name) , ‘contact’ , respondToContact}
illuminanceMeasurements.each {name, title →
subscribe settings.get(name) , ‘illuminance’ , respondToIlluminance}
motionSensors.each {name, title →
subscribe settings.get(name) , ‘motion’ , respondToMotion}
respond()
}

… arranges to respondTo my location’s sunrise/sunset, my background lighting On/Off schedule, contact from contactSensors, illuminance from illuminanceMeasurements and motion from motionSensors. initialize then responds to the current state. Future events are all handled in the same manner …

def respondToSun(EventWrapper e) {
log.info indent + “:sunny: $e.name $e.location”
respond()
}

def respondToScheduleOn() {
log.info indent + “:alarm_clock: on schedule”
respond()
}

def respondToScheduleOff() {
log.info indent + “:alarm_clock: off schedule”
respond()
}

def respondToContact(EventWrapper e) {
log.info indent + “:zap: $e.value $e.name $e.device”
respond()
}

def respondToIlluminance(EventWrapper e) {
log.info indent + “☼ $e.value $e.name $e.device”
respond()
}

def respondToMotion(EventWrapper e) {
log.info indent + “:soccer: $e.value $e.name $e.device”
respond()
}

… which is to log an informational message and respond to the new current state.

The execution time of respond

// respond according to current state
def respond() {
Date now = new Date()
TimeZone timeZone = location.getTimeZone()

// consider the sun if we should not ignore it and
// now is within interval [sunrise, sunset)
def sunriseAndSunset = getSunriseAndSunset()
boolean sun = !ignoreSun && timeOfDayIsBetween(
    sunriseAndSunset.sunrise,
    new Date(sunriseAndSunset.sunset.getTime() - 1),
    now, timeZone)

// consider retaining a background lighting level (as opposed to off)
// if we are ignoring the on/off schedule, we don't have one or
// now is within the [On, Off) interval
boolean background = ignoreSchedule || !(On && Off) || timeOfDayIsBetween(
    nextOccurrence(On),
    new Date(nextOccurrence(Off).getTime() - 1),
    now, timeZone)

setLevel DWSL, findValue(
    {valueIf 0  , {sun}},
    {valueIf 0  , {findBrighter(64, DWLIM, DWRIM) && !ignoreIlluminance}},
    {valueIf 100, {findMotion       DWLMS, DWRMS}},
    {valueIf 100, {findOpen         GDCS}},
    {valueIf 50 , {findMotion       WWMS}},
    {valueIf 25 , {findMotion       FPMS}},
    {valueIf 25 , {findOpen         FDCS}},
    {valueIf 5  , {background}},
)
// etc ...

}

… will vary depending on the state remembered from previous setLevel calls, the current state of the sun, the current timeOfDay in my background lighting On/Off schedule, and the current state of my sensors.

The problem is that respond execution time also seems to depend on whether it was invoked from respondToScheduleOn, respondToScheduleOff or not. When responding to exactly the same state, a respondToScheduleOn or respondToScheduleOff execution will fail (early, after setting the level of only one light) with a TimeoutException. When this execution is followed immediately by saving (unchanged) settings from the SmartThings mobile app, initialize is called to respond to exactly the same state and this works!

Can you tell whether it is respond() itself taking longer? I was wondering if there could be some improbably expensive housekeeping going on to setup the next timed schedule before it gets around to actually running the handler and respond (). If there was the handler itself could perhaps schedule respond() to run a second or two later using runIn(), assuming runIn() is more efficient.

From the SmartThings mobile app, I just reconfigured my SmartApp to reschedule the background lighting Off time to 9:03 (a short time in the future). This is what I saw a little later from Live Logging.

9:03:19 PM: error java.util.concurrent.TimeoutException: Execution time exceeded 20 app execution seconds: 123535796422033 @line -1 (findValue)
9:03:18 PM: info :black_circle: ← 5 backYard:Light
9:02:54 PM: info :alarm_clock: off schedule

The first off schedule line comes almost a minute late and is reported at the beginning of my respondToScheduleOff handler. respond is called immediately after this. The first thing that respond does is to determine the value of the backYard:Light from the current state. It calls setValue with this device and value. This value is different than the one remembered previously so setValue modifies the device, remembers the new state and logs the backYard:Light line 24 seconds later! Then the TimeoutException.

Compare that to this:

9:17:47 PM: info :black_circle: ← 5 ARTIK sidePorch:DoorLight
9:17:47 PM: info :black_circle: ← 30 patio:Wall Light
9:17:47 PM: info :black_circle: ← 10 ARTIK patio:DoorLight
9:17:46 PM: info :black_circle: ← 10 ARTIK frontPorch:DoorLight
9:17:45 PM: info :black_circle: ← 5 ARTIK driveway:Light
9:17:44 PM: info :black_circle: ← 5 backYard:Light
9:17:43 PM: info :white_circle: 5 ARTIK sidePorch:DoorLight
9:17:43 PM: info :white_circle: 30 patio:Wall Light
9:17:43 PM: info :white_circle: 10 ARTIK patio:DoorLight
9:17:43 PM: info :black_circle: 100 ARTIK garage:CeilingLight
9:17:43 PM: info :white_circle: 10 ARTIK frontPorch:DoorLight
9:17:43 PM: info :white_circle: 5 ARTIK driveway:Light
9:17:43 PM: debug updated with settings: [clearState:false, ignoreIlluminance:false, ignoreSchedule:false, ignoreSun:false, On:2018-05-09T05:00:00.000-0700, Off:2018-11-03T21:00:00.000-0700, FDCS:frontPorch:Door Sensor, GDCS:garage:Door Sensor, PDCS:patio:Door Sensor, SPCS:sidePorch:Door Sensor, DWNIM:driveway:North Motion/Light Sensor, FPIM:frontPorch:Motion/Light Sensor, DWNMS:driveway:North Motion/Light Sensor, FPMS:frontPorch:Motion/Light Sensor, SPMS:sidePorch:Motion/Light Sensor, WWMS:frontYard:Walkway Motion Sensor, BYSL:backYard:Light, DWSL:ARTIK driveway:Light, EWSL:ARTIK entryway:DoorLight, FPSL:ARTIK frontPorch:DoorLight, GCSL:ARTIK garage:CeilingLight, PDSL:ARTIK patio:DoorLight, PWSL:patio:Wall Light, SPSL:ARTIK sidePorch:DoorLight, DWSMS:driveway:North Motion/Light Sensor, DWSIM:driveway:North Motion/Light Sensor]

Prior to this, I used the SmartThings mobile app to change my Off time to 10:00. This turned on all the lights - exactly as things were before the first log. Instead of waiting for 10:00 to schedule my respondToScheduleOff (like above), I used the mobile app to change the Off time to 9:00. The state change is the same (on to off) but the trigger is different (updated instead of respondToScheduleOff). This results in a call to my updated method, which calls initialize, which calls respond. The debug updated with settings line comes from updated. The next 6 lines come from initialize and happen immediately. The next is for the backYard:Light as above except here it comes only one second after the trigger (updated) instead of 24!. respond finishes setting the value of its last light only 4 seconds later. Success instead of failure!

The only difference between these two is that the first is kicked off with a schedule event and the second is kicked off from a mobile app reconfigure (updated). The second actually performs more work but takes much less time. This makes absolutely no sense to me.

Another contrast to the TimeoutException received as part of my respondToScheduleOff processing is my respondToSunrise processing. This runs the same code without any problem. It handles the same state change (on to off). Again, the only difference is the trigger.