Re: nslcd crashing and leaving a pid file
- From: Dan Finn <Dan.Finn [at] plansource.com>
- To: "nss-pam-ldapd-users [at] lists.arthurdejong.org" <nss-pam-ldapd-users [at] lists.arthurdejong.org>
- Subject: Re: nslcd crashing and leaving a pid file
- Date: Wed, 24 Aug 2016 18:36:22 +0000
Sending this again in the hope that someone sees it; unfortunately, I didn't get any responses last time.
I did make some changes shortly after the original email was sent that seem to have helped, but we still had a couple of servers recently where nslcd crashed and left a pid file. I added a second uri entry pointing to another one of our LDAP servers. As best I can tell, though, nslcd is not failing over to this second server when it has issues; at least, it's not mentioning it in the logs. It only gives errors about the first uri listed in the config file before it crashes. Here's the output from a recent crash on a server with 2 uri entries:
[root@ps-prod-app27 ~]# grep -i nslcd /var/log/messages|grep -v puppet
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [d6e997] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [993096] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [87eb60] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [66d96c] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [b2b065] no available LDAP server found
Jul 29 03:05:57 ps-prod-app27 nslcd[1649]: [188e1b] no available LDAP server found
Aug 1 17:08:24 ps-prod-app27 nslcd[1649]: [c81bd5] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport endpoint is not connected
Aug 1 17:08:24 ps-prod-app27 nslcd[1649]: [c81bd5] no available LDAP server found, sleeping 1 seconds
Aug 1 21:02:25 ps-prod-app27 nslcd[1649]: [bf2217] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport endpoint is not connected
Aug 1 21:02:25 ps-prod-app27 nslcd[1649]: [bf2217] no available LDAP server found, sleeping 1 seconds
Aug 1 21:08:41 ps-prod-app27 nslcd[1649]: [46a918] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Transport endpoint is not connected
Aug 1 21:08:41 ps-prod-app27 nslcd[1649]: [46a918] no available LDAP server found, sleeping 1 seconds
Aug 10 17:04:03 ps-prod-app27 nslcd[1649]: caught signal SIGTERM (15), shutting down
It does seem like it dies well after the error messages are reported, so maybe none of this is related, but I don't have much else to go on at this point. Any help would be much appreciated.
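To check whether the second uri ever shows up in the logs, the bind failures can be tallied per server. This is only a rough sketch: the function name count_bind_failures is made up for illustration, the default log path /var/log/messages is taken from the grep above, and the pattern assumes the "failed to bind to LDAP server <uri>: ..." message format seen in these excerpts.

```shell
#!/bin/sh
# Tally nslcd "failed to bind" errors per LDAP URI in a syslog file.
# count_bind_failures is a hypothetical helper name, not part of nslcd.
count_bind_failures() {
    # Keep only nslcd lines, extract the URI that follows
    # "failed to bind to LDAP server", then count occurrences of each URI.
    grep 'nslcd\[' "$1" \
        | sed -n 's/.*failed to bind to LDAP server \([^ ]*\): .*/\1/p' \
        | sort | uniq -c
}

# Assumed default log location; pass another file as the first argument.
LOG="${1:-/var/log/messages}"
if [ -r "$LOG" ]; then
    count_bind_failures "$LOG"
fi
```

If only the first uri ever appears in the output, that would be consistent with the second server never being tried, although it could also mean that binds to the second server simply never failed.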
Thanks,
Dan

From: nss-pam-ldapd-users <nss-pam-ldapd-users-bounces+dfinn=plansource.com@lists.arthurdejong.org> on behalf of Dan Finn
Sent: Friday, August 5, 2016 11:47:58 AM
To: nss-pam-ldapd-users@lists.arthurdejong.org
Subject: nslcd crashing and leaving a pid file

We are seeing a somewhat frequent issue on some of our servers where nslcd crashes leaving a pid file and puppet is unable to restart it because of the existing pid file. We have a mixed environment with CentOS and Ubuntu but this is only happening on the CentOS hosts as far as I can tell.
All servers are CentOS 6.8 and running nss-pam-ldapd-0.7.5-32.el6.x86_64.
Prior to the crash we will see errors like so in the messages log:
Jul 29 03:06:01 ps-rc-util02 nslcd[17437]: [dfe7eb] no available LDAP server found
Jul 29 03:06:01 ps-rc-util02 nslcd[17437]: [bc31ad] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [3a6f48] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [875174] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [42afe5] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [d085f5] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [10ae59] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [9ab87e] no available LDAP server found
Jul 29 03:06:06 ps-rc-util02 nslcd[17437]: [6e3a1f] no available LDAP server found
Jul 29 03:06:12 ps-rc-util02 nslcd[17437]: [eafde2] failed to bind to LDAP server ldaps://ds-pdc.plansource.local/: Can't contact LDAP server: Connection timed out
Jul 29 03:06:12 ps-rc-util02 nslcd[17437]: [eafde2] no available LDAP server found
Aug 2 11:42:15 ps-rc-util02 nslcd[17437]: [db0739] ldap_result() failed: Can't contact LDAP server
Shortly after the last log entry, puppet starts complaining that it found nslcd dead but could not restart it.
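For anyone wanting to confirm the "dead but pid file exists" state by hand, a check along these lines might help. It is a minimal sketch under assumptions: the pid file path /var/run/nslcd/nslcd.pid is assumed rather than taken from our init script, the NSLCD_PIDFILE variable and the pidfile_is_stale function are hypothetical names, and the /proc fallback is Linux-specific.

```shell
#!/bin/sh
# Report whether a pid file is stale: the file exists but no process
# with the recorded pid is running.
# Exit status: 0 = stale, 1 = process alive or no pid file at all.
pidfile_is_stale() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 1        # no pid file, so nothing is stale
    pid=$(cat "$pidfile" 2>/dev/null)
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;         # empty or non-numeric content
    esac
    # kill -0 delivers no signal, it only tests that the process exists;
    # the /proc check covers processes we lack permission to signal.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        return 1
    fi
    return 0
}

# Assumed pid file location; override with NSLCD_PIDFILE if it differs.
PIDFILE="${NSLCD_PIDFILE:-/var/run/nslcd/nslcd.pid}"
if pidfile_is_stale "$PIDFILE"; then
    echo "stale pid file: $PIDFILE"
fi
```

The check only looks at whether some process has that pid, so it does not verify that the process is actually nslcd; a reused pid would make it report the file as not stale.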
I just spot-checked some of our Ubuntu hosts and confirmed that there are no log entries like this on them at all. All hosts are configured identically and use the same LDAP server, so this makes me think it's not so much an issue with the LDAP server but rather something related to nslcd on the CentOS servers.
Based on some other issues I found while googling, I have tried tuning idle_timelimit down. It's currently set at 180; however, lowering it has only seemed to help a little bit.
Here is our (sanitized) nslcd.conf file:
# /etc/nslcd.conf
# nslcd configuration file. See nslcd.conf(5)
# for details.
# The user and group nslcd should run as.
uid nslcd
gid nslcd
# disconnect after this amount of time (in seconds) of inactivity
idle_timelimit 180
# The location at which the LDAP server(s) should be reachable.
uri ldaps://ds-pdc.domain.local/
# The search base that will be used for all queries.
base dc=domain,dc=local
#base ou=People,dc=domain,dc=local
# The LDAP protocol version to use.
ldap_version 3
# The DN to bind with for normal lookups.
binddn CN=ldap,OU=Service Accounts,OU=IT,DC=domain,DC=local
bindpw *secret*
# The DN used for password modifications by root.
#rootpwmoddn cn=admin,dc=example,dc=com
# SSL options
ssl on
tls_reqcert never
# The search scope.
#scope sub
nss_initgroups_ignoreusers ALLLOCAL
filter passwd (&(&(objectClass=person)(uidNumber=*)))
#filter passwd (&(&(objectClass=person)(uidNumber=*))(unixHomeDirectory=*))
map passwd uid sAMAccountName
map passwd homeDirectory unixHomeDirectory
map passwd gecos displayName
# If you wish to override the shell given by LDAP, uncomment the next line
#map passwd loginShell "/bin/bash"
filter shadow (&(&(objectClass=person)(uidNumber=*)))
#filter shadow (&(&(objectClass=person)(uidNumber=*))(unixHomeDirectory=*))
map shadow uid sAMAccountName
map shadow shadowLastChange pwdLastSet
filter group (&(objectClass=group)(gidNumber=*))
#map group gid member
Any help would be much appreciated.
Thanks,
Dan