Follow-the-sun embargo schedule


(Andrew Evans) #1

What we have:

Embargoed alerts fire during “business hours” for the Eastern time zone (UTC-5), meaning 7am -> 11pm. They ring L1, then L2 (both on weekly rotation), then the rest of the team.

Alerts 11pm-7am ET will wait until the next 7am ET to fire

For me, on Pacific time, that means low-priority alerts can fire any time from 4am to 8pm PT, so I will get woken up for them. This is not ideal.
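
To make the current behaviour concrete, here is a minimal Python sketch of the embargo rule described above. It is only an illustration, not how PagerDuty implements it: the function name `next_fire_time` is made up, and the fixed offsets (ET = UTC-5, PT = UTC-8) ignore daylight saving.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: fixed offsets with no daylight saving, matching the "UTC -5" above.
ET = timezone(timedelta(hours=-5), "ET")
PT = timezone(timedelta(hours=-8), "PT")

OPEN, CLOSE = time(7, 0), time(23, 0)  # 7am -> 11pm business hours


def next_fire_time(received: datetime, tz: timezone = ET) -> datetime:
    """Return when an embargoed alert fires under the current ET-only setup."""
    local = received.astimezone(tz)
    if OPEN <= local.time() < CLOSE:
        return local  # inside business hours: fires immediately
    # Outside business hours: hold until the next 7am in that zone.
    if local.time() < OPEN:
        fire_date = local.date()
    else:
        fire_date = local.date() + timedelta(days=1)
    return datetime.combine(fire_date, OPEN, tzinfo=tz)


# The edges of the ET window, as seen from Pacific time.
window_open_pt = datetime(2024, 1, 8, 7, 0, tzinfo=ET).astimezone(PT)
window_close_pt = datetime(2024, 1, 8, 23, 0, tzinfo=ET).astimezone(PT)
print(window_open_pt.strftime("%H:%M"), window_close_pt.strftime("%H:%M"))
```

The last line prints the window edges in Pacific time, which is where the 4am start comes from.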

What we want:

Embargoed alerts fire & escalate for L1, L2, and the rest of the team, but everyone who is not in business hours (7am - 11pm in their individual time zone) is filtered out of the list.

For example:

When an alert comes in, check if L1 is in business hours (in L1’s time zone). If so, alert L1

If L1 is not in business hours, but L2 is in business hours (for L2’s time zone), alert L2

If neither L1 nor L2 is in business hours, but someone on the team is (in their individual time zones), alert them

If the alert is not acked after 30 min, escalate to the next person who is in business hours (for their individual time zone)

If no one is in business hours, hold the alert until the next time L1 or L2 is in business hours (for their respective time zone) and then alert
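
The rules above can be sketched in Python as follows. This only restates the desired logic; it is not something PagerDuty schedules are known to support directly. `Responder`, `route` and `page_plan` are hypothetical names, the example people other than Andrew are invented, and time zones are modelled as fixed UTC offsets without daylight saving.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone
from typing import List, Optional, Tuple

OPEN, CLOSE = time(7, 0), time(23, 0)   # 7am -> 11pm local business hours
ACK_TIMEOUT = timedelta(minutes=30)     # escalate to the next person after this


@dataclass(frozen=True)
class Responder:
    name: str
    utc_offset_hours: int  # assumption: fixed offset, daylight saving ignored

    @property
    def tz(self) -> timezone:
        return timezone(timedelta(hours=self.utc_offset_hours))

    def in_business_hours(self, now: datetime) -> bool:
        return OPEN <= now.astimezone(self.tz).time() < CLOSE

    def next_open(self, now: datetime) -> datetime:
        """Next 7am local; only meaningful when outside business hours."""
        local = now.astimezone(self.tz)
        if local.time() < OPEN:
            day = local.date()
        else:
            day = local.date() + timedelta(days=1)
        return datetime.combine(day, OPEN, tzinfo=self.tz)


def route(now: datetime, l1: Responder, l2: Responder,
          team: List[Responder]) -> Tuple[List[Responder], Optional[datetime]]:
    """Return (people to page, in order) or ([], time to hold the alert until)."""
    ordered = [l1, l2] + [p for p in team if p not in (l1, l2)]
    available = [p for p in ordered if p.in_business_hours(now)]
    if available:
        return available, None
    # Nobody is in business hours: hold until L1 or L2 next reaches 7am local.
    return [], min(l1.next_open(now), l2.next_open(now))


def page_plan(now: datetime, available: List[Responder]) -> List[Tuple[str, datetime]]:
    """Simplification: availability is evaluated once, at alert time."""
    return [(p.name, now + i * ACK_TIMEOUT) for i, p in enumerate(available)]


alice = Responder("alice", -5)    # hypothetical ET teammate, this week's L1
andrew = Responder("andrew", -8)  # PT, this week's L2
bob = Responder("bob", -5)        # hypothetical ET teammate
team = [alice, andrew, bob]

late = datetime(2024, 1, 8, 5, 0, tzinfo=timezone.utc)   # midnight ET, 9pm PT
quiet = datetime(2024, 1, 8, 8, 0, tzinfo=timezone.utc)  # 3am ET, midnight PT
print([p.name for p in route(late, alice, andrew, team)[0]])
print(route(quiet, alice, andrew, team)[1])
```

With only ET and PT people on the team there is still a gap each night (2am to 7am ET) in which nobody is in business hours, so the hold branch is not just an edge case.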

Is there any way to do this? We’ve done a ton of tinkering with services, schedules & escalation policies, but I can’t figure out a way to manage this. Thanks!

(Chris Dryden) #2

Hey Andrew,

Do you want to notify the users who should be on-call?

Schedules should be able to do that for you.

Just to confirm two things: you are using schedules for users/groups of users and not just putting them in the escalation policy, correct?

The mentioned schedules are correctly configured to put users on-call during their business hours?

We do have a couple of knowledge base articles on Schedule concepts.

(Simon Fiddaman) #3

Hi @atevans-mapbox,

As @cdryden said - Schedules should be able to fix this. You can limit the Schedules for L1 and L2 to be timezone and time period aware.

Check out my post Follow-the-Sun Schedules - Option 2: I want consistent start-and-end times in all of my regions.

Note that if you don’t have anyone scheduled at that time the Alert/Incident will be suppressed - you should test that the Alert will be unsuppressed when someone wakes up (there’s a “Raise the Priority” flag which may help with this).

Hope that helps,

(Andrew Evans) #4

Hmm, I’m not sure I’m communicating this correctly. Here’s an example of what we have now:

This is the L1 business-hours schedule. There is a very similar schedule a week behind to handle L2.

I could add a new layer for Pacific Time users, with 10am -> 2am Eastern Time (ET) blocks. @simonfiddaman, I think that lines up with your regional-schedule suggestion? But then it would just be me in that layer, so I would be scheduled as on-call every week rather than following our 6-week rotation. What we want is to have me on-call during 10am-2am ET (7am-11pm PT), but only during my assigned week of rotation.

If we added the ET users to the new layer, so that it keeps the same rotation as the existing layer, then the ET folks would be on-call at 2am their time.

We could create a new schedule for the “shared” hours (10am -> 11pm ET), and then separate schedules for the “non-shared” hours (7am -> 10am ET, and 11pm -> 2am ET), but I’m not sure how to get the rotations to line up correctly. By necessity, it would be 6 users on the “shared office hours” schedule, 5 on the “ET morning” and just me on the “PT night” schedules.
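
For reference, the three blocks can be derived with a few lines of Python. This is just the arithmetic behind the split, assuming a constant 3-hour gap between ET and PT; the helper names are made up for illustration.

```python
from typing import Tuple

# (start, end) in hours after midnight ET; end may exceed 24 (e.g. 26 = 2am next day)
Window = Tuple[int, int]


def shift_to_et(local_window: Window, hours_behind_et: int) -> Window:
    """Express a local-time window in ET hours."""
    start, end = local_window
    return start + hours_behind_et, end + hours_behind_et


def split_windows(earlier: Window, later: Window) -> Tuple[Window, Window, Window]:
    """Split two overlapping windows into earlier-only, shared, later-only.

    Assumes `earlier` both starts and ends before `later`, and that they overlap.
    """
    shared = (max(earlier[0], later[0]), min(earlier[1], later[1]))
    return (earlier[0], shared[0]), shared, (shared[1], later[1])


def label(hour: int) -> str:
    h = hour % 24
    return f"{h % 12 or 12}{'am' if h < 12 else 'pm'}"


et_window = (7, 23)                                  # 7am -> 11pm ET
pt_window = shift_to_et((7, 23), hours_behind_et=3)  # 7am -> 11pm PT, in ET hours
et_only, shared, pt_only = split_windows(et_window, pt_window)
for name, (start, end) in [("ET morning", et_only), ("shared", shared), ("PT night", pt_only)]:
    print(f"{name}: {label(start)} -> {label(end)} ET")
```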

We could create 5 fake users, and put them on the non-shared schedules to pad them out and make the rotations line up between the 3 schedules. But I think that would cost us more money.

Is there a way to do this, or something simple I’m missing? Or maybe I’m overthinking this, and we should abandon this different-hours-same-week approach. I’d be grateful for any suggestions, best practices, or personal experiences. Thanks!

(Simon Fiddaman) #5

Hi @atevans-mapbox,

Yeah, based on what you’re saying I would abandon the different hours spread - I can’t think of any good way to make it work, or actually any way to make it work as you’d like (where the schedule hours are different when you are the On-call).

You wrote: “We could create 5 fake users, and put them on the non-shared schedules to pad them out and make the rotations line up between the 3 schedules. But I think that would cost us more money.”

You could likely create just one fake user (it would be a paid user, and of no other value) and repeat them in your schedule - and add them in your place in the “ET morning” schedule. Note that that fake user will receive the alerts (i.e. they will not be suppressed, and the Escalation Policy timer will be ticking down).

OK, the only other way I can think of making this work is much more convoluted - by adjusting the Service “Support Hours” - basically moving them back and forwards to match your ideal schedule hours, depending on who is on-call. I’d imagine a script which polls the active on-call, and if it’s someone determined as being in your TZ, move the Support Hours ah… backwards(?) to suit. Once it’s back to an ET person, shift them all forwards(?) again. Your Schedules should be expanded to fit all possibilities (or just 24x7) and you must have Support Hours configured on all Services which use those Schedules/EPs. Yeah, that is truly horrible, but it may work.
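
A rough sketch of the calculation such a script would need, in Python. It deliberately makes no PagerDuty API calls: `get_on_call_offset` and `set_support_hours` are hypothetical callables standing in for "look up the current on-call's UTC offset" and "update the Service's Support Hours", and the offsets are fixed (no daylight saving).

```python
from typing import Callable, Tuple


def support_hours_for(on_call_offset: int, service_offset: int = -5,
                      open_hour: int = 7, close_hour: int = 23) -> Tuple[int, int]:
    """Return (start, end) hours in the Service's zone (ET by default)
    that correspond to 7am -> 11pm for the person currently on call."""
    shift = service_offset - on_call_offset
    return (open_hour + shift) % 24, (close_hour + shift) % 24


def sync_support_hours(get_on_call_offset: Callable[[], int],
                       set_support_hours: Callable[[int, int], None]) -> Tuple[int, int]:
    """One polling pass: read who is on call, move the Support Hours to suit."""
    start, end = support_hours_for(get_on_call_offset())
    set_support_hours(start, end)
    return start, end


# Dry run with stand-ins: a PT on-call (UTC-8) and a setter that just records the call.
applied = []
sync_support_hours(lambda: -8, lambda start, end: applied.append((start, end)))
print(applied)
```

Run on a timer, this would shift the hours whenever the rotation hands over between an ET and a PT person. An end hour smaller than the start hour means the window wraps past midnight.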

We don’t really have the same issues - where we have a daytime “IRQ” or “Sheriff”, they only operate within their timezone, and we use different Services (bound to different Teams) to assign these so they aren’t NOC-visible during the daytime. Outside those hours (for IRQ teams) we assign alerts to a different Service (there’s a script which queries the two Schedules to determine who is On-call, and assigns to the active Service); for Sheriff teams we use the Support Hours to suppress the alerts or force them to be Low Urgency (re-raising in the morning).

Hope that helps,
