nslcd errors talking to IPVS cluster of LDAP servers


From: Ken Gaillot <kjgaillo [at] gleim.com>
To: nss-pam-ldapd-users [at] lists.arthurdejong.org
Subject: nslcd errors talking to IPVS cluster of LDAP servers
Date: Thu, 07 Oct 2010 11:02:00 -0400

Hi,

Our shop runs a bunch of Debian lenny servers, some with LDAP-based shell access using the libnss-ldap package. We decided to give libnss-ldapd a try on a new server. We ran into problems with our LDAP setup.

We have three LDAP servers that are hidden behind an IPVS/ldirectord/heartbeat cluster (for load-balancing and simpler client configuration). So the cluster presents a single IP address, and LDAP requests to it are handed off transparently to one of the real servers.

The first symptom we noticed was that our nightly osiris scan of the system would sometimes report that all of the LDAP user accounts were missing (and then report them as re-added on a later night).


We traced that issue to log messages like these:

Oct  3 12:42:51 adonis nslcd[1517]: [cdfac0] ldap_result() failed: Can't contact LDAP server
Oct  3 12:42:51 adonis nslcd[1517]: [cdfac0] ldap_abandon() failed to abandon search: Other (e.g., implementation specific) error
Oct  3 12:42:52 adonis nslcd[1517]: [cdfac0] connected to LDAP server ldap://ldap.teamgleim.com
Oct  3 13:30:20 adonis nslcd[1517]: [578454] ldap_search_ext() failed: Can't contact LDAP server
Oct  3 13:30:20 adonis nslcd[1517]: [578454] no available LDAP server found, sleeping 1 seconds
Oct  3 13:30:21 adonis nslcd[1517]: [578454] no available LDAP server found
Oct  3 13:30:21 adonis nslcd[1517]: [578454] no available LDAP server found, sleeping 29 seconds

It would eventually reconnect, but I'm guessing osiris had already timed out waiting for a response and considered the user accounts to be missing.


I tried several things:

* Setting an idle_timeout of 280 did not clear the errors.

* Restarting nslcd would clear the errors for more than an hour, but then they would start again.

* Having a cron job run "getent passwd" every four minutes (thus preventing nslcd from losing its connection to the LDAP server) *did* clear the errors.

* Finally, changing the nslcd LDAP URI from the cluster address to an explicit list of the three real LDAP servers *did* clear the errors.
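
For anyone who wants to reproduce the cron workaround without cron, here is a minimal Python sketch of the same idea. It assumes that enumerating the passwd database through NSS (which is what "getent passwd" does) is enough to keep nslcd's LDAP connection busy. The function names and the 240-second default are made up for illustration and are not part of nslcd or libnss-ldapd.

```python
import pwd
import time


def touch_nss_passwd():
    """Enumerate every passwd entry via NSS and return how many were seen.

    On a host whose nsswitch.conf sends passwd lookups to nslcd, this
    triggers a search against the LDAP server, much like `getent passwd`.
    """
    return len(pwd.getpwall())


def keepalive(rounds, interval_seconds=240, sleep=time.sleep):
    """Run the enumeration `rounds` times, pausing between runs.

    The 240-second default mirrors the four-minute cron interval; it is an
    illustrative value, not a recommendation. `sleep` is injectable so the
    loop can be exercised without actually waiting.
    """
    counts = []
    for i in range(rounds):
        counts.append(touch_nss_passwd())
        if i < rounds - 1:
            sleep(interval_seconds)
    return counts
```

Calling keepalive(3) performs three enumerations about four minutes apart and returns the number of accounts seen each time, so a sudden drop in the count would show the same symptom osiris reported.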

Not being an expert in the code, I can only guess that nslcd has problems if it tries "reconnecting" to an LDAP server and actually gets connected to a different server -- some sort of state information about the previous connection must be maintained somewhere.

For now, we'll probably stick with libnss-ldap since we're familiar with it, but I wanted to mention the issue in case there's something simple I'm missing.


-- Ken Gaillot <kjgaillo@gleim.com>
Network Operations Center, Gleim Publications