Zabbix events not reaching PagerDuty

(Akanksha Jain) #1

I am trying to integrate PagerDuty with Zabbix. I am able to send a test event to PagerDuty using pd-zabbix. In the action log I can see actions being triggered from the Zabbix server, and the status column shows "sent". However, they are not reaching PagerDuty. I have checked the service key multiple times, but I still cannot see the events in PagerDuty. I have gone through all the steps in the troubleshooting section. What else can I look at to debug this?


(Demitri Morgan) #2

Hi Akanksha,

It sounds like you may have to do some network troubleshooting if you’re running the integration script manually and the events still aren’t getting through to PagerDuty.

Also, FYI, an easy mistake to make is getting the case wrong in the event action: it must be all lowercase, so trigger and not Trigger or TRIGGER. Be sure to check /var/lib/pdagent/outqueue/err for events that the Events API rejected as malformed. You can examine the contents of the files in there to make sure they comply with the expected event schema (see: Events API (v1) Documentation).
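If you'd rather script that check than eyeball each file, here is a minimal Python sketch. It assumes each rejected event is a single JSON object using the Events API v1 field names (service_key, event_type, description, incident_key). The helper name validate_event and the EXAMPLE_KEY value are made up for illustration and are not part of pdagent.

```python
import json

# Event types accepted by the Events API v1; note they are all lowercase.
VALID_EVENT_TYPES = {"trigger", "acknowledge", "resolve"}


def validate_event(event):
    """Return a list of problems found in an event dict (empty if none).

    Hypothetical helper for illustration; not part of pdagent.
    """
    problems = []
    if not event.get("service_key"):
        problems.append("missing service_key")
    event_type = event.get("event_type")
    if not isinstance(event_type, str) or event_type not in VALID_EVENT_TYPES:
        problems.append("invalid event_type: %r" % (event_type,))
    elif event_type == "trigger" and not event.get("description"):
        problems.append("trigger events need a description")
    elif event_type != "trigger" and not event.get("incident_key"):
        problems.append("%s events need an incident_key" % event_type)
    return problems


if __name__ == "__main__":
    # Assumed file content: one JSON object per rejected event.
    raw = '{"service_key": "EXAMPLE_KEY", "event_type": "Trigger", "description": "disk full"}'
    for problem in validate_event(json.loads(raw)):
        print(problem)
```

An empty list means none of these basic checks failed; it does not guarantee that the Events API will accept the event.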

Just to be clear, do the alerts from Zabbix show up in the PagerDuty Agent’s logs? If so, please read on. Otherwise, you may have missed a step in troubleshooting or the Integration Guide.

If PagerDuty Agent is running in a network that requires all outbound traffic go through a proxy server, please have a look at the following post:

Finally, before getting into network troubleshooting, look in /var/log/pdagent/pdagentd-debug.log for hints as to the nature of the problem. That could save you a lot of time; verbose info about errors that the agent runs into when trying to send data to PagerDuty will be printed out in there.

Assuming you still haven’t found the cause of the issue at this point, let’s start at the lower layers of the OSI model and work our way up, assuming that the physical and data link layers are already taken care of.

First, as a sanity check, make sure the Events API hostname can be resolved:

nslookup events.pagerduty.com

If you see NXDOMAIN in the output, you’ve got a local DNS resolution issue.
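If you want to run the same DNS check from a script, this is a rough Python equivalent using only the standard library. The function name resolve_host is hypothetical, and port 443 is passed only because getaddrinfo expects a service argument.

```python
import socket


def resolve_host(hostname):
    """Return the sorted IP addresses for hostname, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Raised for NXDOMAIN and for other resolver failures.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_host("events.pagerduty.com") and getting an empty list back corresponds to the failed-lookup case above.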

Next, check network-layer reachability using this command:

ping -q -c 3 events.pagerduty.com

The output should look something like this (IP address and latency may vary):

$ ping -q -c 3 events.pagerduty.com
PING events.gslb.pagerduty.com (54.245.112.46): 56 data bytes

--- events.gslb.pagerduty.com ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 24.255/24.305/24.405/0.070 ms

If the summary says 100.0% packet loss or the like, then it’s most likely a network connectivity issue, although some networks drop ICMP while still allowing TCP traffic.

Next, test the transport layer using this command:

nc -v -w 1 events.pagerduty.com 443

That should print out:

Connection to events.pagerduty.com port 443 [tcp/https] succeeded!

Otherwise, you need to check your local network’s firewall/ACL to make sure it allows outbound connections to remote hosts on port 443 (TCP/HTTPS).
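On hosts where nc isn't installed, a small Python sketch can do the same TCP test. The function name can_connect is hypothetical, and the one-second default timeout simply mirrors the -w 1 flag used above.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

If can_connect("events.pagerduty.com", 443) returns False while the DNS lookup works, a firewall or ACL blocking outbound port 443 is a likely suspect.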

Next, check to be sure the Events API’s TLS certificate is trusted locally, by trying this:

openssl s_client -host events.pagerduty.com -port 443

That should print out certificate info preceded by a few lines that will look something like this:

CONNECTED(00000003)
depth=2 /C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
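The trust check can also be sketched in Python with the standard ssl module. This is only an approximation of the openssl test: it verifies against whatever CA bundle Python uses on your system, which may differ from the one the agent uses. The function name check_tls is hypothetical.

```python
import socket
import ssl


def check_tls(host, port=443, timeout=5.0):
    """Return (True, issuer organization) if the certificate verifies,
    otherwise (False, error text)."""
    # The default context verifies both the certificate chain and the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
    except OSError as exc:
        # ssl.SSLError, timeouts, and refused connections are all OSError subclasses.
        return False, str(exc)
    # The issuer is a sequence of RDNs, each holding one (name, value) pair here.
    issuer = dict(rdn[0] for rdn in cert.get("issuer", ()))
    return True, issuer.get("organizationName", "unknown issuer")
```

A result like (False, "... certificate verify failed ...") from check_tls("events.pagerduty.com") would point at a missing or outdated local CA bundle rather than at the firewall.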

Good luck!


(Akanksha Jain) #3

Thanks a lot, Demitri. The problem was that traffic on port 443 was not being allowed.


(Demitri Morgan) #4

Awesome! Glad that helped you narrow it down.

