SRE Pain Points

I want to spend a little time reflecting on the Cabot journey over the past few months.  As I mentioned at the outset of this blog, many years ago, I was touched by how much Lakshay’s life was impacted by his job.  Of course, everyone’s job impacts their life, but...

Abnormal vs. Bad

Is your solution detecting actual business threats? Reflecting on the alert fatigue problem, I think much of it comes down to conflating abnormal metric values with bad user experiences.  Many monitoring products reinforce the confusion by making it easy...
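
To make the distinction concrete, here is a minimal Python sketch. It is an illustration only: the function names, the z-score threshold, and the 1% error-rate budget are all assumptions, not something taken from Cabot or any particular monitoring product. The first check asks whether a metric looks statistically unusual; the second asks whether users are actually being hurt.

```python
from statistics import mean, pstdev


def is_abnormal(history, latest, z_threshold=3.0):
    """True when `latest` is a statistical outlier relative to `history`.

    Hypothetical helper: a plain z-score test with an assumed threshold.
    """
    spread = pstdev(history)
    if spread == 0:
        # A flat history makes any different value an outlier.
        return latest != mean(history)
    return abs(latest - mean(history)) / spread > z_threshold


def is_bad(failed_requests, total_requests, max_error_rate=0.01):
    """True when the share of failed user requests exceeds the tolerated rate.

    Hypothetical helper: the 1% default is an assumed budget, not a standard.
    """
    if total_requests == 0:
        return False
    return failed_requests / total_requests > max_error_rate


# A CPU spike during a batch job: unusual, yet users are unaffected.
cpu_history = [40, 42, 41, 39, 43]
print(is_abnormal(cpu_history, 95))  # the metric is an outlier
print(is_bad(2, 10_000))             # but requests are still succeeding
```

In this sketch only `is_bad` would justify paging someone; an abnormal-but-harmless value could go to a dashboard or a log instead.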

Alert Fatigue

One issue I’ve run across over the years is alert fatigue.  As the linked article points out, it’s not just a problem for SREs, but we’re definitely victims of it.  I can’t count the number of times the question, “Hey, what is that alert about?” is...