Building a custom webhook handler: tips and suggestions

webhooks
howto
official

(Demitri Morgan) #1

What We Are Doing Here

We want to create incident records in a third-party system and keep them updated, based on PagerDuty incidents and their state changes. To meet this need, we create a webhook extension on the service in question. In many cases there are ready-made, vendor-specific extensions for the target system.

However, some systems still have none. This is especially true of custom-built systems and of closed systems that do not expose a public API for receiving data via HTTP. In that case we need to create an API, PHP/CGI script, AWS Lambda transform or other type of web service that will:

  1. Receive and validate the HTTP POST requests sent by the webhook, and respond with status 200 OK
  2. Ingest/decode the JSON-encoded data (the body of the request)
  3. Extract and process the most useful and relevant data from the payload
  4. Perform updates and insertions into the third-party system
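
The four steps above can be sketched as a small Node.js service. This is only a minimal sketch under stated assumptions: updateThirdPartySystem is a hypothetical stub standing in for whatever insert/update call your system needs, and the type and data field names should be checked against a saved sample payload.

```javascript
const http = require("http");

// Hypothetical stub: replace with the real insert/update call for your system.
function updateThirdPartySystem(change) {
  console.log("would apply", change);
}

// Steps 2 and 3: decode the JSON body and keep only the fields we need.
// Returns null if the body is not valid JSON or has no messages array.
function extractChanges(rawBody) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch (err) {
    return null;
  }
  if (!payload || !Array.isArray(payload.messages)) {
    return null;
  }
  return payload.messages.map((message) => ({
    type: message.type, // assumed field name, e.g. "incident.trigger"
    data: message.data,
  }));
}

// Steps 1 and 4: accept POST only, respond, then apply each change.
function handleRequest(req, res) {
  if (req.method !== "POST") {
    res.writeHead(405, { Allow: "POST" });
    res.end();
    return;
  }
  const chunks = [];
  req.on("data", (chunk) => chunks.push(chunk));
  req.on("end", () => {
    const changes = extractChanges(Buffer.concat(chunks).toString("utf8"));
    if (changes === null) {
      res.writeHead(400);
      res.end();
      return;
    }
    res.writeHead(200);
    res.end();
    changes.forEach(updateThirdPartySystem);
  });
}

// Not started automatically; call startServer(8080) to begin listening.
function startServer(port) {
  return http.createServer(handleRequest).listen(port);
}
```

Keeping extractChanges free of any HTTP handling means it can be tested directly against saved sample payloads.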

Getting a Sample Payload

To build and test your webhook receiver, it is very useful to first obtain and save samples of the webhook payloads. This will allow you to easily build test cases for your script, API, etc. so that iterative debugging does not have to involve manual steps in PagerDuty.

You can get good sample payloads by adding a second webhook that points to a RequestBin endpoint, then triggering, acknowledging and resolving an incident, and saving each of the resulting webhook payloads.

It also helps to pretty-print the payload with a tool such as jq or the command python -m json.tool, which makes it easier to inspect its structure and decide how to access and process the data.
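
If you would rather stay in JavaScript, the same re-indenting can be done with JSON.stringify. In this small sketch, prettyPrint is a hypothetical helper name and the sample string is a heavily trimmed stand-in, not a real payload.

```javascript
// Re-indent a raw JSON string with two-space indentation.
function prettyPrint(rawJson) {
  return JSON.stringify(JSON.parse(rawJson), null, 2);
}

// Hypothetical, heavily trimmed sample; a real payload has many more fields.
const sample = '{"messages":[{"type":"incident.trigger"}]}';
console.log(prettyPrint(sample));
```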

Identifying Incidents in Both Systems

If the Incident Behavior setting on the service is Create Incidents Only, the deduplication key can serve two purposes: storing an internal, uniquely identifying key on each PagerDuty incident (i.e. when creating incidents in PagerDuty from the third-party system), and identifying the target record for a state change when a webhook is received. The reason this setting is required is detailed in Alerts: Enabling Alerts and the following section, “Alerts and bidirectional integrations”.

Setting the deduplication key

If no deduplication key is specified when triggering an incident, one is generated automatically and returned in the API response, whether the REST API or the Events API is in use. Hence, if the third-party system has no means of storing custom metadata on an incident record, but each record has a unique identifier (i.e. a primary key), the deduplication key should be set to a value derived from that identifier, so that given the key it is straightforward to find the record to update. The reverse lookup should also be possible, for identifying the incident in PagerDuty when pushing state changes back to PagerDuty from your system.
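
A two-way mapping between record identifiers and deduplication keys might look like the sketch below. The prefix "ticket-" and both function names are hypothetical; any naming convention works as long as it can be reversed.

```javascript
// Hypothetical prefix that marks keys created by this integration.
const KEY_PREFIX = "ticket-";

// Derive the deduplication key from the record's primary key.
function toDedupKey(recordId) {
  return KEY_PREFIX + String(recordId);
}

// Recover the primary key (as a string) from a deduplication key.
// Returns null for keys that do not follow our convention.
function toRecordId(dedupKey) {
  if (typeof dedupKey !== "string" || !dedupKey.startsWith(KEY_PREFIX)) {
    return null;
  }
  return dedupKey.slice(KEY_PREFIX.length);
}
```

Returning null for foreign keys lets the webhook receiver ignore incidents that were not created by your system.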

In the Create Incident REST API endpoint, the incident.incident_key property of the payload sets the value (per the Request Schema specification).

In the v1 Events API, the property incident_key of the JSON-encoded object is used.

In the v2 Events API, the property dedup_key is used.
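
To make the three placements concrete, here is a sketch of minimal request bodies for each API. Only the position of the key comes from the text above. The remaining field names reflect my understanding of each API and should be verified against its reference, and every value (integration keys, service ID, summary text) is a placeholder.

```javascript
// REST API, Create Incident: the key lives at incident.incident_key.
function restCreateIncidentBody(dedupKey, serviceId, title) {
  return {
    incident: {
      type: "incident",
      title: title,
      service: { id: serviceId, type: "service_reference" },
      incident_key: dedupKey,
    },
  };
}

// Events API v1: the key is the top-level incident_key property.
function eventsV1TriggerBody(dedupKey, serviceKey, description) {
  return {
    service_key: serviceKey,
    event_type: "trigger",
    incident_key: dedupKey,
    description: description,
  };
}

// Events API v2: the key is the top-level dedup_key property.
function eventsV2TriggerBody(dedupKey, routingKey, summary) {
  return {
    routing_key: routingKey,
    event_action: "trigger",
    dedup_key: dedupKey,
    payload: { summary: summary, source: "example-source", severity: "error" },
  };
}
```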

Obtaining the deduplication key from the webhook for identification

The deduplication key is a property of the incident object, which each message entry in the webhook carries in its data property. Say you want the incident key from the first message object (index zero) in the webhook payload. In JavaScript notation, the full path to the key is:

payload.messages[0].data.incident.incident_key

As another example, in PHP, if $payload is the value returned from json_decode with the $assoc parameter set to false (the default), the path is:

$payload->messages[0]->data->incident->incident_key
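
Because messages is an array, a receiver is safer iterating over every entry than reading index zero alone. The helper below (incidentKeys is a hypothetical name) is a defensive sketch: it looks for the incident object under data.incident and falls back to data itself, since the exact nesting is worth confirming against your saved sample payload.

```javascript
// Collect the incident key of every message in a decoded webhook payload.
// Messages without a usable key are skipped rather than raising an error.
function incidentKeys(payload) {
  const messages =
    payload && Array.isArray(payload.messages) ? payload.messages : [];
  return messages
    .map((message) => {
      const data = message && message.data;
      if (!data) {
        return undefined;
      }
      // Accept the key on a nested data.incident object or directly on data.
      const incident = data.incident || data;
      return incident.incident_key;
    })
    .filter((key) => typeof key === "string" && key.length > 0);
}
```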

Looking Up Incidents in PagerDuty by Deduplication Key

If it is necessary to update the incident via REST API, or perform actions on related objects such as the service or escalation policy, the incident can be queried in the REST API using List Incidents, an index endpoint, with the parameters:

  • incident_key: the deduplication key, which ideally should identify only one incident, but doesn’t necessarily need to.
  • statuses[] (optional): a multivalued parameter that restricts the results to open incidents (triggered or acknowledged). This is useful if incident keys are ever re-used, because the query then returns at most one result, e.g. statuses%5B%5D=triggered&statuses%5B%5D=acknowledged

The incidents array in the (JSON-encoded) response body should then contain at most one incident. The total property of the response (the total number of matching records, per REST API Guide: Pagination) should be zero or one, depending on whether an incident is still open for that key.

For instance, let’s say a system called “Foo” has a value that uniquely identifies each currently-active alert, that this value is A38193 for the alert in question, and that our incident key naming convention is Foo-%s, where %s is that value. The API endpoint to obtain the incident is then:

https://api.pagerduty.com/incidents?incident_key=Foo-A38193&statuses%5B%5D=triggered&statuses%5B%5D=acknowledged
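
Rather than percent-encoding the brackets by hand, the query string can be assembled with URLSearchParams, which takes care of the encoding. In this sketch, openIncidentsUrl is a hypothetical helper name.

```javascript
const INCIDENTS_ENDPOINT = "https://api.pagerduty.com/incidents";

// Build the List Incidents URL for open incidents with the given key.
function openIncidentsUrl(dedupKey) {
  const params = new URLSearchParams();
  params.append("incident_key", dedupKey);
  // Appending the same name twice produces a multivalued parameter.
  params.append("statuses[]", "triggered");
  params.append("statuses[]", "acknowledged");
  return `${INCIDENTS_ENDPOINT}?${params.toString()}`;
}

console.log(openIncidentsUrl("Foo-A38193"));
```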

Let’s say we’re working in JavaScript and the decoded response to a GET to the above URL is contained in an object called response. To obtain the incident object:

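// Default to an empty object when no single open incident matches the key.
// The array length is checked rather than total, which may be null unless requested.
var incident = {};
if (response.incidents.length === 1) {
    incident = response.incidents[0];
}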
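
Putting the request and the lookup together might look like the following sketch. The function name and the injectable fetchImpl parameter are hypothetical; the Accept and Authorization header formats are the ones I understand the REST API to expect, so confirm them against the API reference. A global fetch is assumed (Node.js 18 or later) unless you pass your own.

```javascript
// Return the single open incident for a deduplication key, or null.
async function findOpenIncident(dedupKey, apiToken, fetchImpl = fetch) {
  const params = new URLSearchParams({ incident_key: dedupKey });
  params.append("statuses[]", "triggered");
  params.append("statuses[]", "acknowledged");

  const res = await fetchImpl(`https://api.pagerduty.com/incidents?${params}`, {
    headers: {
      Accept: "application/vnd.pagerduty+json;version=2",
      Authorization: `Token token=${apiToken}`,
    },
  });
  if (!res.ok) {
    throw new Error(`List Incidents failed with HTTP ${res.status}`);
  }

  const body = await res.json();
  const incidents = Array.isArray(body.incidents) ? body.incidents : [];
  // Zero matches, or an ambiguous key with several matches, both yield null.
  return incidents.length === 1 ? incidents[0] : null;
}
```

Returning null for both the empty and the ambiguous case forces the caller to handle a missing incident explicitly, instead of silently updating the wrong one.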

Security Hardening

If you are hosting your custom code on a generic HTTP service, it is strongly suggested that you take measures to ensure that it can only be used for its intended purpose.

You can protect the receiving web service using an ACL; see Whitelisting IPs: Webhooks.

It is also recommended that you employ HTTPS and include a unique, random, difficult-to-guess parameter in the query portion of the webhook URL.
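
One way to implement the random query parameter is sketched below, using Node's crypto module. The parameter name token and both function names are hypothetical. The comparison hashes both values first so that crypto.timingSafeEqual always receives buffers of equal length.

```javascript
const crypto = require("crypto");

// Generate a random, hard-to-guess value to append to the webhook URL,
// e.g. https://example.com/hook?token=<value> (hypothetical URL).
function generateToken() {
  return crypto.randomBytes(32).toString("hex");
}

// Check the token in a request URL (such as req.url) in constant time.
function hasValidToken(requestUrl, expectedToken) {
  // req.url is relative, so a placeholder base is needed for parsing.
  const url = new URL(requestUrl, "https://placeholder.invalid");
  const supplied = url.searchParams.get("token") || "";
  const a = crypto.createHash("sha256").update(supplied).digest();
  const b = crypto.createHash("sha256").update(expectedToken).digest();
  return crypto.timingSafeEqual(a, b);
}
```

A receiver would call hasValidToken(req.url, storedToken) before reading the body, and reject the request if it returns false.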

