Event
Wed, Sep 24, 9:00 AM - 9:30 AM (UTC)

[DevOpsDays London] Tackling Alert Fatigue with SLOs, Automation, and Machine Learning

About this event

SLOs allow teams to prioritize the most impactful aspects of their services. Metrics that center the user experience give teams focus and goals. Highlighting the metrics that matter most gives teams space to disable alerts that contribute nothing to overall customer happiness.

Automation gives us more hands to deal with issues without taking time away from more interesting work. Machine Learning tools group alerts together and help add context when things are noisy and distracting.

Combined, these tools help teams tackle a common incident response problem: alert fatigue. Full Service Ownership teams looking to improve the quality of their services can experiment with their SLIs and SLOs to find the work that will benefit their systems the most while also preserving their own sanity. Setting clear expectations and planning responses to potential issues based on pre-agreed norms helps prevent the stress and anxiety that can arise from unexpected system failures.

Mandi Walls is a DevOps Advocate on the Community and Advocacy Team at PagerDuty. Before joining PagerDuty, Mandi spent a number of years at Chef Software, working with customers and community members in the US and Europe. Originally a large-scale systems administrator, Mandi has focused on IT automation; organizational culture and change; and community.

Event details
In-person event
Location
Sir Alexander Fleming building at Imperial College London