We currently have a pretty simple setup where a PagerDuty service for our application server is integrated with some Datadog monitors. However, while investigating one of our incidents, I noticed that PagerDuty processes the events coming from Datadog in a way that was unexpected to me and is causing us some problems.
The PD service is configured with the modern Incident Behavior setting:
Create alerts and incidents: Will create an alert and then add it to a new incident. These incidents can be merged.
The Datadog monitor is a standard disk-space monitor: it sends a warning when there is under 20% free space and an alert under 10%.
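For context, this is roughly what the monitor looks like via the Datadog Monitors API (a sketch, not our exact config; the name, host tag, and query are hypothetical placeholders):

```json
{
  "name": "Disk space low on app-server (example)",
  "type": "metric alert",
  "query": "avg(last_5m):avg:system.disk.free{host:app-server} / avg:system.disk.total{host:app-server} * 100 < 10",
  "options": {
    "thresholds": {
      "warning": 20,
      "critical": 10
    }
  }
}
```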
My expectation was that an incident would be created, with each state change listed as a separate alert. Instead there is just a single incident with a single alert, and all the updates coming from Datadog appear only in that alert's timeline.
This is compounded by the fact that Datadog sends events in a format that PagerDuty does not understand by default when setting the Severity. The warning/alert information is only carried in the `monitor_state` field.
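From what I can tell, PagerDuty's Events API v2 groups events that share the same `dedup_key` into one alert, and reads severity from `payload.severity` (one of `critical`, `error`, `warning`, `info`). My guess is that the Datadog integration reuses one dedup key per monitor and leaves the state only in custom details, something like this hypothetical payload:

```json
{
  "routing_key": "YOUR_INTEGRATION_KEY",
  "event_action": "trigger",
  "dedup_key": "datadog-monitor-12345",
  "payload": {
    "summary": "Disk space under 20% free on app-server",
    "source": "app-server",
    "severity": "warning",
    "custom_details": {
      "monitor_state": "Warn"
    }
  }
}
```

If that is accurate, it would explain why every warn/alert transition lands in the same alert's timeline rather than creating new alerts, but I would like confirmation.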
I have gone through both the Datadog and PagerDuty documentation, and there is very little information on how the grouping driven by Incident Behavior works.
- Can I get more information on the inner workings of the Incident Behavior service setting?
- Why are the events grouped together under a single alert? Doesn't this hurt the responder's visibility in the main incident?
- Is this caused by our PagerDuty configuration, or by how the Datadog integration uses the PD API?
- Is it just our setup, or would others also expect the Datadog integration to understand severity by default?