
I found one topic on this site, ironically enough, that references an updated CA cert as a possible issue, but nothing else. AFAICT the cert is fine. When I add the repo (the GPG key in the documentation had expired, so I had to dig around for a current one) and install the pdagent and pdagent-integrations packages, the pdagent service will not start. Status is as follows:

 

[root@dc-devops-nagios-xi tmp]# systemctl status pdagent
× pdagent.service - PagerDuty Agent
     Loaded: loaded (/etc/systemd/system/pdagent.service; enabled; preset: disabled)
     Active: failed (Result: exit-code) since Wed 2025-01-22 21:59:55 UTC; 2s ago
   Duration: 83ms
    Process: 1496654 ExecStartPre=/bin/mkdir -p /var/run/pdagent (code=exited, status=0/SUCCESS)
    Process: 1496655 ExecStartPre=/bin/chown -R pdagent:pdagent /var/run/pdagent (code=exited, status=0/SUCCESS)
    Process: 1496656 ExecStart=/usr/share/pdagent/bin/pdagentd -f (code=exited, status=1/FAILURE)
   Main PID: 1496656 (code=exited, status=1/FAILURE)
        CPU: 63ms

Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]:   File "/usr/share/pdagent/bin/pdagentd.py", line 75, in <module>
Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]:     import pdagent.config
Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]: ModuleNotFoundError: No module named 'pdagent'
Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]: During handling of the above exception, another exception occurred:
Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]: Traceback (most recent call last):
Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]:   File "/usr/share/pdagent/bin/pdagentd.py", line 81, in <module>
Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]:     import pdagent.config
Jan 22 21:59:55 dc-devops-nagios-xi pdagentd[1496660]: ModuleNotFoundError: No module named 'pdagent'
Jan 22 21:59:55 dc-devops-nagios-xi systemd[1]: pdagent.service: Main process exited, code=exited, status=1/FAILURE
Jan 22 21:59:55 dc-devops-nagios-xi systemd[1]: pdagent.service: Failed with result 'exit-code'.

I did some more googling and it appears it may be related to the version of Python on my RHEL9 system. I have both python2 and python3 installed (long story), but even after editing the service file to force it to use python2 it still gives the same error. I tried installing via pip3, but that didn't work either.
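For anyone debugging the same `ModuleNotFoundError`, a quick way to see which interpreter (if any) can actually import the package is to ask each one directly. This is only a minimal sketch: it assumes the interpreters are on `PATH` under the names `python2` and `python3`, and the helper name `can_import` is made up for illustration.

```python
import shutil
import subprocess
import sys


def can_import(interpreter: str, module: str) -> bool:
    """Return True if `interpreter` is found and can import `module`."""
    path = shutil.which(interpreter)
    if path is None:
        # Interpreter is not installed or not on PATH.
        return False
    result = subprocess.run(
        [path, "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # "python2" and "python3" are assumed command names; adjust to your system.
    for interp in ("python2", "python3", sys.executable):
        print(interp, "can import pdagent:", can_import(interp, "pdagent"))
```

If every interpreter reports False, the package files are probably not on any interpreter's module search path, which would match the traceback above.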

I saw references to a transition to python3, but that was almost a year ago and it said the python3 module hadn’t been finished. I’m not married to python3, but I would like to get this to work if possible. Clearly the directions in the official two-way integration doc are incorrect.

Has anyone gotten it working? I can’t be the only one with this host OS/software combo.

Hi A3ch

 

I don't actually think pdagent will work on RHEL9, and the published bi-directional integration for PD probably doesn't work either, due to various deprecations over the past few years.

 

As far as I know, most customers using Nagios and PD only use a Nagios > PD integration and not the full bi-directional one.

 

I have a guide, which I will publish to my GitHub soon, for full bi-directional integration between Nagios XI and PD that should actually work. It requires Docker though, so hopefully that won't be an issue for you.

 

You can check out my bi-directional integration here:

https://github.com/Nozlaf/PD2Nagiosv3

 

 

I will try to remember to post an update when I have put my guide on there.

 

Cheers

 


Thanks for confirming I’m not crazy.

Funny thing was, I tried the older Perl-based webhook method in their guide for older OSes and I couldn’t get that to work either.


The Perl one should work for sure.

Do you get any error with it if you execute it manually?

The way the Perl script (and pdagent) send events in is really old, to be honest. It would not shock me if PD disabled it soon. The method I use, based on PDaltagent (https://github.com/martindstone/PDaltagent), is pretty robust, as it uses the Events API v2 format instead of the undocumented PD Nagios event format.
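To give a rough idea of what the Events API v2 format looks like, here is a minimal sketch that builds and posts an event with only the Python standard library. It is not how PDaltagent does it internally; the helper names `build_event` and `send_event` are hypothetical, and the routing key and dedup key are placeholders you would supply yourself.

```python
import json
import urllib.request

# Public Events API v2 endpoint.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_event(routing_key, action, dedup_key,
                summary=None, source=None, severity="critical"):
    """Build an Events API v2 body (hypothetical helper for illustration)."""
    if action not in ("trigger", "acknowledge", "resolve"):
        raise ValueError("action must be trigger, acknowledge or resolve")
    event = {
        "routing_key": routing_key,
        "event_action": action,
        "dedup_key": dedup_key,
    }
    if action == "trigger":
        # A trigger needs a payload with summary, source and severity.
        if not summary or not source:
            raise ValueError("trigger events need summary and source")
        event["payload"] = {
            "summary": summary,
            "source": source,
            "severity": severity,
        }
    return event


def send_event(event, timeout=10):
    """POST the event to PagerDuty and return the decoded JSON response."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A resolve only closes the alert whose `dedup_key` matches the one used on the trigger, so something like `build_event(key, "resolve", "myhost/PING")` has to reuse exactly the same key string that the trigger sent.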

 

 


I will fix up my documentation over the weekend. I'll be interested to see if mine works for you.

 


@A3ch I have done a first pass at writing the documentation. I haven’t tested it yet, but I will hopefully be testing it later today.

https://github.com/Nozlaf/PD2Nagiosv3/blob/main/updated_guide.md

 


In testing I found an issue with Martin's docker-compose file, which I have patched in my repo. My new documentation is far from finished, but it works up to the point where Nagios sends to PD. Sending back from PD to Nagios is already covered in my existing documentation:

 

https://github.com/Nozlaf/PD2Nagiosv3/blob/main/README.md


Hey guys. Sorry I dropped off the face of the planet (got the flu). One of my co-workers was able to get it to work. I’m asking him for the details now (he's a bit of a wizard). It didn't involve using your solution, @nozlaf. I’ll share as soon as I get the details.

 


“The python module was written for python2. pip installing it using python3 didn’t actually install anything. I moved the files from the python2 site-modules/ directory to the python3 site-modules/ directory and it works”
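For anyone wanting to repeat that, the copy could be sketched roughly as below. This is an assumption-heavy illustration, not my co-worker's exact steps: I am assuming "site-modules" means the interpreters' `site-packages` directories, the helper name `copy_package` is made up, and the python2 path in the comment is a guess you would need to check on your own box.

```python
import shutil
import sysconfig
from pathlib import Path


def copy_package(package: str, src_site: Path, dst_site: Path) -> Path:
    """Copy a package directory from one site-packages tree to another."""
    src = src_site / package
    if not src.is_dir():
        raise FileNotFoundError(f"{src} does not exist")
    dst = dst_site / package
    # dirs_exist_ok (Python 3.8+) lets the copy overwrite an earlier attempt.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


def python3_site_packages() -> Path:
    """Return the site-packages directory of the running interpreter."""
    return Path(sysconfig.get_paths()["purelib"])


# Example call (paths are assumptions, run as root with python3):
#   copy_package("pdagent",
#                Path("/usr/lib/python2.7/site-packages"),
#                python3_site_packages())
```

Copying rather than moving keeps the python2 copy in place in case you need to back out.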

The thing I’m struggling with now is getting it to resolve the PagerDuty alert once the alert clears in Nagios. It does so for the "host is down hard" alert, but some of our custom alerts behave differently: they create the PagerDuty alert, and when I acknowledge it in Nagios it gets acknowledged in PagerDuty, but when the alert clears in Nagios the PagerDuty alert isn't resolving. I'm unsure what's different between the host-down-hard alert and some of our other ones. It's been a few days, so I’m still trying to get my bearings.

Hope this helps anyone in our situation. Thanks for the responses, gents.

 

