Weekly Update from Alex - 05/19/16

You were right. I checked our internal thread and found that we had a partial device_conn outage from 9:13 to 9:43 am PT, during which 8 of our 12 device_conn nodes lost their queues, causing a partial impact for some users. It’s resolved now. I also drilled into why the status page wasn’t updated in real time: the team that updates that page wasn’t informed until after the partial outage was already resolved. We’ll learn from that and improve.

The status page has been updated now. I want that to be an absolutely trustworthy information source for everyone.
