lists.arthurdejong.org

Re: nslcd crashing and leaving a pid file




Sending this again in the hope that someone will see it; unfortunately, I didn't get any responses last time.


Shortly after sending the original email I made some changes that seem to have helped, but we have still had a couple of servers recently where nslcd crashed and left a pid file.  I added a second uri entry pointing to another one of our LDAP servers.  As best I can tell, though, nslcd is not failing over to this second server when it has issues; at least, it never mentions it in the logs.  It only reports errors about the first uri listed in the config file before it crashes.  Here's the output from a recent crash on a server with 2 uri entries:


[root@ps-prod-app27 ~]# grep -i nslcd /var/log/messages|grep -v puppet

Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [d6e997] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [993096] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [87eb60] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [66d96c] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [b2b065] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [188e1b] no available LDAP server found
Aug  1 17:08:24 ps-prod-app27 nslcd[1649]: [c81bd5] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport endpoint is not connected
Aug  1 17:08:24 ps-prod-app27 nslcd[1649]: [c81bd5] no available LDAP server found, sleeping 1 seconds
Aug  1 21:02:25 ps-prod-app27 nslcd[1649]: [bf2217] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport endpoint is not connected
Aug  1 21:02:25 ps-prod-app27 nslcd[1649]: [bf2217] no available LDAP server found, sleeping 1 seconds
Aug  1 21:08:41 ps-prod-app27 nslcd[1649]: [46a918] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport endpoint is not connected
Aug  1 21:08:41 ps-prod-app27 nslcd[1649]: [46a918] no available LDAP server found, sleeping 1 seconds
Aug 10 17:04:03 ps-prod-app27 nslcd[1649]: caught signal SIGTERM (15), shutting down


It does seem to die well after the error messages are reported, so maybe none of this is related, but I don't have much else to go on at this point.  Any help would be much appreciated.
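One way to check whether the second uri ever shows up in a bind failure is to tally the "failed to bind to LDAP server" lines by URI. The following is only a rough Python sketch: the function name count_bind_failures is made up for illustration, and the regular expression is an assumption based solely on the line format in the excerpt above, so other nslcd versions may log differently.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt in this thread:
#   ... nslcd[1649]: [c81bd5] failed to bind to LDAP server ldaps://host/: reason
BIND_FAILURE = re.compile(
    r"nslcd\[\d+\]: \[[0-9a-f]+\] failed to bind to LDAP server (\S+): "
)


def count_bind_failures(lines):
    """Return a Counter mapping each LDAP URI to its number of bind failures."""
    counts = Counter()
    for line in lines:
        match = BIND_FAILURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open handle on /var/log/messages to count_bind_failures() gives one count per URI. If only the first uri ever appears, that is consistent with the second server never being tried, although it would also be the result if the second server was tried and simply never failed.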


Thanks,

Dan


From: nss-pam-ldapd-users <nss-pam-ldapd-users-bounces+dfinn=plansource.com@lists.arthurdejong.org> on behalf of Dan Finn
Sent: Friday, August 5, 2016 11:47:58 AM
To: nss-pam-ldapd-users@lists.arthurdejong.org
Subject: nslcd crashing and leaving a pid file
 

We are seeing a fairly frequent issue on some of our servers where nslcd crashes and leaves its pid file behind, and puppet is then unable to restart it because of the existing pid file.  We have a mixed environment of CentOS and Ubuntu hosts, but as far as I can tell this is only happening on the CentOS hosts.


All servers are CentOS 6.8 and running nss-pam-ldapd-0.7.5-32.el6.x86_64.


Prior to the crash we see errors like these in the messages log:


Jul 29 03:06:01 ps-rc-util02 nslcd[17437]: [dfe7eb] no available LDAP server found
Jul 29 03:06:01 ps-rc-util02 nslcd[17437]: [bc31ad] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [3a6f48] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [875174] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [42afe5] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [d085f5] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [10ae59] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [9ab87e] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [6e3a1f] no available LDAP server found
Jul 29 03:06:12 ps-rc-util02 nslcd[17437]: [eafde2] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Connection timed out
Jul 29 03:06:12 ps-rc-util02 nslcd[17437]: [eafde2] no available LDAP server found
Aug  2 11:42:15 ps-rc-util02 nslcd[17437]: [db0739] ldap_result() failed: Can't contact LDAP server

Shortly after the last log entry, puppet starts complaining that it found nslcd dead but could not restart it.
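For reference, a leftover pid file can be told apart from a live daemon by checking whether the recorded PID still belongs to a running process. The following is a minimal Python sketch, not anything nslcd or puppet ships: the helper names pid_is_running and pidfile_is_stale are hypothetical, and the pid file path (for example /var/run/nslcd/nslcd.pid) is an assumption that should be checked against the init script on the host.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except OSError as exc:
        # ESRCH means no such process; EPERM means it exists under another user.
        return exc.errno == errno.EPERM
    return True


def pidfile_is_stale(path):
    """Return True if the pid file exists but names no running process."""
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except FileNotFoundError:
        return False  # no pid file at all, so nothing stale to clean up
    except ValueError:
        return True  # empty or non-numeric contents: treat as stale
    if pid <= 0:
        return True  # not a valid process ID
    return not pid_is_running(pid)
```

A caveat: this only confirms that some process has that PID, not that the process is nslcd, so a recycled PID would make a stale file look live.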


I just spot-checked some of our Ubuntu hosts and confirmed that they have no log entries like this at all.  All hosts are configured identically and use the same LDAP server, which makes me think the issue is not so much with the LDAP server as with something related to nslcd on the CentOS servers.


Based on some other issues I found while googling, I have tried tuning idle_timelimit down.  It's currently set to 180; however, lowering it seems to have helped only a little.


Here is our (sanitized) nslcd.conf file:


# /etc/nslcd.conf
# nslcd configuration file. See nslcd.conf(5)
# for details.

# The user and group nslcd should run as.
uid nslcd
gid nslcd

# disconnect after this amount of time (in seconds) of inactivity
idle_timelimit 180

# The location at which the LDAP server(s) should be reachable.
uri ldaps://ds-pdc.domain.local/

# The search base that will be used for all queries.
base dc=domain,dc=local
#base ou=People,dc=domain,dc=local

# The LDAP protocol version to use.
ldap_version 3

# The DN to bind with for normal lookups.
binddn CN=ldap,OU=Service Accounts,OU=IT,DC=domain,DC=local
bindpw *secret*

# The DN used for password modifications by root.
#rootpwmoddn cn=admin,dc=example,dc=com

# SSL options
ssl on
tls_reqcert never

# The search scope.
#scope sub

nss_initgroups_ignoreusers ALLLOCAL

filter passwd (&(&(objectClass=person)(uidNumber=*)))
#filter passwd (&(&(objectClass=person)(uidNumber=*))(unixHomeDirectory=*))
map    passwd uid              sAMAccountName
map    passwd homeDirectory    unixHomeDirectory
map    passwd gecos            displayName
# If you wish to override the shell given by LDAP, uncomment the next line
#map    passwd loginShell       "/bin/bash"
filter shadow (&(&(objectClass=person)(uidNumber=*)))
#filter shadow (&(&(objectClass=person)(uidNumber=*))(unixHomeDirectory=*))
map    shadow uid              sAMAccountName
map    shadow shadowLastChange pwdLastSet
filter group  (&(objectClass=group)(gidNumber=*))
#map    group  gid              member

Any help would be much appreciated.


Thanks,

Dan
