Early SRE Ethnographic Research

As a UX researcher, one of my jobs is to observe users to find certain problematic patterns in their behavior. My goal in identifying these patterns is to tease out the cause of the problem. As I mentioned in my previous blog, I am learning about this new...

Happiness is a measured user journey

We mess with Jenkins configs.  We struggle with failing automated tests.  We finally get Kubernetes doing what we want it to do.  We wake up at 2am to restart a failing API gateway.  Why?  So the user gets the best experience possible.  How do we know if our efforts...

Wind-up Top Monitoring

Greetings from Cabot; my name is Nate and I’ve been in the application performance and availability space for 15+ years.  Though looking back, one could say I’ve been working in this space as far back as my first bona fide corporate job as a college intern for...

SREs, who are you?

My name is Kyoko. I am a user researcher for Cabot. The nature of my job is to learn, understand and sympathize with others - specifically, users. I often meet very interesting people as ‘users,’ from general consumers to people in very specific technical fields. In...

SRE Pain Points

I want to spend a little time reflecting on the Cabot journey over the past few months.  As I mentioned at the outset of this blog, many years ago, I was touched by how much Lakshay’s life was impacted by his job.  Of course, everyone’s job impacts their life, but...

Abnormal vs. Bad

Is your solution detecting actual business threats? Reflecting on the alert fatigue problem, I think much of it comes down to conflating abnormal metric values with bad user experiences.  Many monitoring products reinforce the confusion by making it easy...

Alert Fatigue

One of the issues that I’ve run across over the years is alert fatigue.  As the linked article points out, it’s not just a problem for SREs, but we’re definitely victims of it.  I can’t count the number of times the question, “Hey, what is that alert about?” is...

Welcome to Cabot

My name is Mark and I’ve been in the site reliability game for a while now - going on about fifteen years. A lot has changed since I was a fresh-faced consultant joining Wily Technology back in 2000: the rise of AWS, Docker, APM, Nagios, Slack, and on and on. The one...