lists.arthurdejong.org

Re: nslcd crashing and leaving a pid file

On Wed, 2016-08-24 at 18:36 +0000, Dan Finn wrote:
> Sending this again in hopes that someone may see it, unfortunately I
> didn't get any responses last time.

Sorry to not reply sooner.

> All servers are CentOS 6.8 and running nss-pam-ldapd-0.7.5-
> 32.el6.x86_64.

That is a pretty old version. Red Hat added a number of patches so it
should be reasonably up-to-date with the latest 0.7 release though.

> Here's the output from a recent crash on a server with 2 uri entries:
> 
> > [root@ps-prod-app27 ~]# grep -i nslcd /var/log/messages|grep -v puppet
> Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [d6e997] no available LDAP server 
> found
> Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [993096] no available LDAP server 
> found
> Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [87eb60] no available LDAP server 
> found
> Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [66d96c] no available LDAP server 
> found
> Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [b2b065] no available LDAP server 
> found
> Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [188e1b] no available LDAP server 
> found
> Aug  1 17:08:24 ps-prod-app27 nslcd[1649]: [c81bd5] failed to bind to LDAP 
> server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport 
> endpoint is not connected
> Aug  1 17:08:24 ps-prod-app27 nslcd[1649]: [c81bd5] no available LDAP server 
> found, sleeping 1 seconds
> Aug  1 21:02:25 ps-prod-app27 nslcd[1649]: [bf2217] failed to bind to LDAP 
> server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport 
> endpoint is not connected
> Aug  1 21:02:25 ps-prod-app27 nslcd[1649]: [bf2217] no available LDAP server 
> found, sleeping 1 seconds
> Aug  1 21:08:41 ps-prod-app27 nslcd[1649]: [46a918] failed to bind to LDAP 
> server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport 
> endpoint is not connected
> Aug  1 21:08:41 ps-prod-app27 nslcd[1649]: [46a918] no available LDAP server 
> found, sleeping 1 seconds
> Aug 10 17:04:03 ps-prod-app27 nslcd[1649]: caught signal SIGTERM (15), 
> shutting down

The last line is a normal shutdown, but there is not much information in
the other log lines. If it is a crash, can you run nslcd in debug mode (run
it as nslcd -d)? If you know what triggered it, that would probably
help. Some program crashes are also recorded in the kernel logs,
so those may provide more detailed information.

You could also run nslcd under valgrind or gdb to catch more detailed
crash information. That would help in tracking down the problem.

> And then shortly after the last log entry is when puppet starts
> complaining that it found nslcd dead but could not restart it.

If nslcd crashes the pid file will remain behind but it will not be
locked. This means that the simple presence of the pid file is not
sufficient to check whether nslcd is running or not.

The exit code of nslcd -c can be used in scripts to check whether nslcd
is running.
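
As a rough illustration of that check, here is a minimal Python sketch. It
assumes that nslcd is on the PATH and that, as the nslcd manual page
describes, nslcd -c exits with status 0 when the daemon is running and
non-zero otherwise. The function name nslcd_is_running and its command
parameter are made up for this example and are not part of nss-pam-ldapd.

```python
import subprocess


def nslcd_is_running(command=("nslcd", "-c")):
    """Return True if the check command exits with status 0.

    Assumption: `nslcd -c` exits with 0 when the daemon is running and
    with a non-zero status when it is not. The `command` parameter is
    hypothetical and only exists so another command can be substituted.
    """
    result = subprocess.run(
        list(command),
        stdout=subprocess.DEVNULL,  # only the exit status matters here
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A monitoring script could call nslcd_is_running() and only attempt a
restart when it returns False, instead of looking at whether the pid file
exists. If nslcd is not installed, subprocess.run raises FileNotFoundError,
which the sketch deliberately leaves unhandled.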

Hope this helps,

-- 
-- arthur - arthur@arthurdejong.org - https://arthurdejong.org/ --