Enhancing Postmortems

Postmortems, while a great feature of PagerDuty, are a bit lacking. Here are a few things that could make this a better feature. If anyone has a solution to any of these, please post it here; if you have other ideas on how to improve the feature, post a comment as well.

  • Assign Severity - Seeing the severity of the incident(s) the postmortem corresponds to.
  • Assign Team or Service - Assign a postmortem to a team or service.
  • Filtering Postmortems - With the two above items, the ability to filter postmortems by a team, severity or just any item in the table headers on the postmortem page would be a huge win. This should work the same way as other areas that already use Teams to limit what is seen across the platform.
  • Attachments - Would be nice to be able to attach files to a postmortem to avoid having people go to different tools to get information related to the incident the pm is for.
  • Tagging additional Responders and Stakeholders - The ability to assign others to a postmortem instead of just listing the incident commander (owner).
  • Transition Status - Together with tagging, the ability to transition a postmortem through the different status and being able to assign someone for each status.
  • Task Assignment - Having the ability to assign tasks with an expected completion date.
  • Support in the API for [at least] retrieving and aggregating Postmortems.
  • Rich text support.
  • Pagination on the Postmortems view so that large quantities don’t take 10+ seconds to load.

Francisco & Ryan,

Ajit from the Product team here at PagerDuty.

Thanks for your suggestions on how our Postmortems feature could be improved.

Let me address each point in turn:

  • Assign Severity - Seeing the severity of the incident(s) the postmortem corresponds to.

Have noted this gap for Severity.

For customers using Incident Priority, a separate Timeline entry is created when Priority is set, so this is somewhat visible now.

  • Assign Team or Service - Assign a postmortem to a team or service.

In the second half of the year, we’ll be investing in making Postmortems respect Teams and Permissions more comprehensively. Check in with us in September to see how that work is lining up and if there’s an opportunity for you to join our Preview (Beta) program.

  • Filtering Postmortems - With the two above items, the ability to filter postmortems by a team, severity or just any item in the table headers on the postmortem page would be a huge win. This should work the same way as other areas that already use Teams to limit what is seen across the platform.

Noted, this is an area we’re looking to improve while we tackle the Teams/Permissions item cited above.

  • Attachments - Would be nice to be able to attach files to a postmortem to avoid having people go to different tools to get information related to the incident the pm is for.

What attachment types would you expect Postmortems to support?

At present, inserting URLs into the relevant Postmortem sections could mitigate some of the different tool pain you mention.

  • Tagging additional Responders and Stakeholders - The ability to assign others to a postmortem instead of just listing the incident commander (owner).

Happy to hear more about this: can you clarify what use case you’re looking to achieve?

  • Transition Status - Together with tagging, the ability to transition a postmortem through the different status and being able to assign someone for each status.

You can change the ownership of the postmortem as it transitions through the various states (Draft, In Review, Reviewed, etc) so we’re already capable of supporting most of this ask.

  • Task Assignment - Having the ability to assign tasks with an expected completion date.

This is something we’ve heard, often in the context of exporting postmortem action items to ticketing tools like JIRA. Would exposing an export capability, instead of building a task manager, work for you?

  • Support in the API for [at least] retrieving and aggregating Postmortems.

Exposing an API for Postmortems is a longer-term roadmap item, unfortunately not something we’d get to in 2019.

  • Rich text support.

Noted.

  • Pagination on the Postmortems view so that large quantities don’t take 10+ seconds to load.

Noted and something we may be able to tackle in 2019 as part of other Postmortem work.

Thanks,
–Ajit


Ajit,

Thanks for the reply. Some notes:

While this is true, there is no way to tell from the postmortems page that a postmortem is for a high severity incident. Currently, we’d need to already know which Postmortems are for a high severity incident, manually open each postmortem to check, or put the severity in the name. The last option could work, but it would be better to have severity as a field that can then be used in a filter.
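To make the request concrete, here is a minimal sketch of the kind of filtering a dedicated field would allow. It is purely illustrative: the `Postmortem` record, its `severity` and `team` fields, the `filter_postmortems` helper and the sample values are all hypothetical and do not reflect any existing PagerDuty data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Postmortem:
    # Hypothetical record; "severity" and "team" are the fields requested above.
    title: str
    severity: str
    team: str


def filter_postmortems(
    items: List[Postmortem],
    severity: Optional[str] = None,
    team: Optional[str] = None,
) -> List[Postmortem]:
    """Keep postmortems matching every filter that was supplied.

    A filter left as None is ignored, so calling with no filters
    returns all items in their original order.
    """
    return [
        pm
        for pm in items
        if (severity is None or pm.severity == severity)
        and (team is None or pm.team == team)
    ]
```

With a field like this, the postmortems page could list only the high severity items without anyone opening each postmortem or encoding the severity in the title.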

This would be mainly for images, CSV or TXT files; that should suffice to start with. We can attach items in a Slack channel, but then the import only brings in the timestamp and nothing that links back to the attachment. That would be a viable alternative in the meantime if we can get that working.

In the Postmortem, it would be good to have the ability to tag additional responders that worked on the incident without having to either go through the entire postmortem or go back into the incident to see who all worked on the incident.

While this would be an intermediary way to handle this, we would like to keep the ownership with the “Incident Commander” and then have each stage assigned to different people.

I think exporting could work for this if, for example, a JIRA ticket gets created for an assignment and we can track it in the Postmortem.

Francisco,

Thanks for your follow up.

Have noted the importance of being able to distinguish postmortems by incident priority: agreed that there are some current workarounds–put priority into postmortem title–but a distinct, searchable and sortable field would be best.

Have noted your other comments as well, either for our team’s use or passing comments along to other PMs that are responsible for items outside our team’s scope.

Stay tuned!

Thanks,
–Ajit

Hello All, I think the postmortem feature is a good start but needs a lot of work.

I prefer to think of the postmortem document as just one part of a process to undertake after the resolution of an incident. To that end, the postmortem document must contain significant information gleaned from the incident. Here is the information I gather in our postmortem document, and I will indicate where I feel information should be pre-populated on creation. Ultimately, the goal is to automate as much as possible, whether that be the postmortem form, communications or metrics tracking (a whole other area that needs attention).

RCA #: Pre-Populated and overwritable
Occurrence Date: Pre-Populated with incident occurrence date
Ticket #: Pre-Populated with the incident ticket number from ticketing system (Jira, ServiceNow, etc.)
Outage Start and Stop Time: Pre-populated
Time Length of Incident: Calculated from previous field
Impacted Environment: A fillable field (potentially pre-populated). Example: the victim of the incident.

SUMMARY OF WHAT HAPPENED SECTION

  • Impact on Customer (High, Medium, Low)
  • Number of Customers Impacted (Fill-in Field)
  • Number of Customer Calls (Fill-in Field)
  • Assigned Severity (Pre-Populated from PagerDuty)
  • Culprit: (Based on the selected impacted environment, a list created by the customer can be provided)
  • What ultimately resolved the incident: (Text Field)

SHARED INFORMATION

  • External shared communications (Text Field)
  • Internal shared communications (Text field)
    (Must be able to add multiple entries to each of the above, for multiple communications to the indicated channel)

INCIDENT RESPONSE

How was the incident detected (pre-populated with information from PagerDuty and overwritable)
Did we have a monitor or metric that showed the incident (Yes/No) and ability to add images to show the monitor alert

Who was involved: Pre-Populated by PagerDuty
Incident Owner: Pre-Populated by PagerDuty
Incident Manager: Pre-Populated by PagerDuty

RESPONSE PERFORMANCE
Pre-populated by PagerDuty where data available and over-writable.

  • Time to identify
  • Time to escalation
  • Time of investigation
  • Time to resolution
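As a rough illustration of how these four values could be pre-populated, here is a minimal sketch that derives them from milestone timestamps. The `Milestones` record, its field names, and the exact definition of each metric (for example, measuring investigation from identification to resolution) are assumptions made for this example, not PagerDuty's definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict


@dataclass
class Milestones:
    # Hypothetical timestamps that would be taken from the incident timeline.
    started: datetime
    identified: datetime
    escalated: datetime
    resolved: datetime


def response_performance(m: Milestones) -> Dict[str, timedelta]:
    """Derive the four response metrics from the milestone timestamps.

    Assumed definitions: everything is measured from the incident start,
    except investigation, which runs from identification to resolution.
    """
    return {
        "time_to_identify": m.identified - m.started,
        "time_to_escalation": m.escalated - m.started,
        "time_of_investigation": m.resolved - m.identified,
        "time_to_resolution": m.resolved - m.started,
    }
```

Because the result is a plain dictionary, any of the values can still be overwritten by hand after pre-population, which matches the "over-writable" requirement.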

TIMELINE OF EVENTS
Populated with data from PagerDuty and can add/subtract and insert information

We also use these fields, so naturally they can just be added to the postmortem form

  • What went well
  • What didn’t go so well
  • What really concerned us or what we were afraid of during the incident
  • What did we learn of the nuances of the system behavior that isn’t documented AND easily retrievable.
  • Were there any sources of data about the systems (logs, graphs, etc.) that staff dismissed or were suspicious of.

HOW DID IT HAPPEN (5-Whys)
Why
Why
Why
Why
Why

HOW CAN WE IMPROVE

ACTION ITEMS #1,2,3,4

Agreed that PMs are only one piece of the puzzle. I was focusing more on the missing pieces as the PM feature currently works.

I do like that the sections are highly customizable and each customer can make the PM template into what they need from it and still have the ability to add sections if needed for a special case incident where more information would be needed.

We don’t use impact metrics at the moment, so I’m not sure how that works with PMs, but it would be awesome to be able to pull those metrics into a PostMortem if they are not accessible this way.

David & Francisco:

Appreciate the continued comments here.

I’ve noted the suggestions around having pre-populated yet overwritable fields. Certainly this approach strikes a good balance between PagerDuty providing initial content, while providing the user with the ability to use that content as-is, or modify or delete as desired.

Thanks,
–Ajit


a good balance between PagerDuty providing initial content, while providing the user with the ability to use that content as-is, or modify or delete as desired.

That’s where I’m going with this. We’d want PagerDuty to be the hub for the information regarding an incident as well as a reference section for who’s doing what to remediate the issue and ensure the cause is fixed. From there you can take that info or the pieces from the PM and put that into business reports and such. But it’s important that the PM platform in PD is capable of being that Incident repository.

Regarding the “HUB” comment: here is my post on a related topic, an Incident Commander Panel for use during Major Incidents. It is a feature I’m desperately wanting. Major Incident Commanders Panel

Thanks Francisco, having PagerDuty play a larger and larger role as the source of truth for incidents and their response is a key goal as we mature our offering.

Our challenge is deciding how much information needs to be contained within PagerDuty (including building out new areas of the product) versus integrated or linked with systems our customers have already invested in.

Appreciate the dialogue as always!

Thanks,
–Ajit

Appreciate this writeup David, will reply in that thread.

Thanks,
–Ajit

Another item to add to the Enhancements suggestions:

  • The ability to customize the “Status” field OR an addition to the “Status” field of “Ready for Review”

The reasoning behind this is that there’s no way to specify with this field that the “Draft” is complete and can be assigned to an employee for advancing to “In Review”. Our company is currently witnessing behavior of people setting Slack reminders to…remind them to remind people to review their PM. That’s a productivity gap!

Ideally this could then be generated in a report/monitor so we can enumerate which reports need a review completed, but that won’t be possible (without screen scraping I guess?) if PagerDuty doesn’t expose a PM API.
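For illustration only, here is a minimal sketch of the report described above, assuming postmortem records were available from some API. The `Status` enum includes the proposed “Ready for Review” value alongside the statuses mentioned earlier in this thread; the `Postmortem` record, the `needs_review` helper and the choice of which statuses count as "awaiting review" are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    DRAFT = "Draft"
    READY_FOR_REVIEW = "Ready for Review"  # proposed status, does not exist today
    IN_REVIEW = "In Review"
    REVIEWED = "Reviewed"


@dataclass
class Postmortem:
    # Hypothetical record shape, not an actual PagerDuty object.
    title: str
    status: Status
    owner: str


def needs_review(postmortems: List[Postmortem]) -> List[Postmortem]:
    """List postmortems whose review is still outstanding.

    Assumption: a review is outstanding while the postmortem is either
    waiting for a reviewer or currently being reviewed.
    """
    waiting = {Status.READY_FOR_REVIEW, Status.IN_REVIEW}
    return [pm for pm in postmortems if pm.status in waiting]
```

A scheduled job could run a query like this and notify the owners, replacing the chain of Slack reminders.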


Right, and in trying to map those pieces, I think it would help to be able to build proper options to import things from one tool into a PostMortem (images, as in my example).

I’m not sure whether PagerDuty has done this, but I think a study of how PostMortems are used across your customers, and of the challenges they face in using PagerDuty as more of an Incident Management Platform, would help.

=-=-=-=-==-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

As for attachments, it would be nice to at minimum be able to import the link to the attachment from the integrated tools. In Slack, for example, each uploaded attachment has an assigned URL that one can share.

Ps… If PagerDuty is working towards becoming THE Platform where businesses can have an actual Incident Management Platform; there will be more and more things that will need to be hosted in PagerDuty as well as things customers will expect to be able to get out of PagerDuty as well. Proper PostMortems being one of them.

Francisco & Ryan - thank you both for your thoughtful feedback about what you would take advantage of in an expanded postmortems capability. We will make sure you’re on the list for follow-up and discussion when we take on improvements to postmortems.

Cheers,
Paul


Thanks for the info and for keeping us in mind Paul.

Question: I’m out next week and will try this once I get back, but I figured I’d ask so I don’t waste time testing something that can’t be done.

Maybe this would be a good feature request… PM = PostMortem

When an incident is active that would need a PM write-up, can you have the PM open with the incidents and Slack channels tagged in it, and have a live view of those endpoints from within the PM as the incident is being worked on? Or do you have to wait until the incident is resolved to be able to see it all in the PM?

I think this would be an awesome feature, as the Incident Commander could tag the relevant endpoints to the PM, gather the relevant data and write up the postmortem as the incident is happening instead of trying to remember things later on.