lists.arthurdejong.org
libnss-ldapd on Debian Lenny

Hi,

I'm experiencing some problems with 100+ Debian 5 (Lenny) machines running
nss-ldapd and connecting to a Debian 6 (Squeeze) LDAP server. I'm using TLS and
have the machines set up for system-wide PAM auth via LDAP.

A problem reported earlier on this mailing list looks similar to what I'm
experiencing:
http://lists.arthurdejong.org/nss-pam-ldapd-users/2010/msg00142.html

From what I'm seeing, it looks like "something" is leaving the LDAP connection
in a half-connected state. Another search is attempted and, while the client
thinks the connection is still current/open, the server does not and the
search fails/returns a negative "no such user" response. Going through both
the server and client logs, I was frequently unable to locate the connection
attempt on the server log side that matched the "no available LDAP server
found" or "Can't contact LDAP server" from the client.

The easiest way to witness the problem is to try to connect with SSH. The 
connection occurs something like this:

$ ssh host <anywhere from no to a lengthy delay>
Password: <enter known-good password, wait 10+ seconds>
Password: <enter known-good password, wait 10+ seconds>
Password: <enter known-good password, wait 10+ seconds> <ctl-c>
$ ssh host <normal/immediate response>
Password: <enter known-good password>
Host $

Sometimes the above process does not work at all (the host's logs show it
denying that the user exists on each subsequent 'ssh host' failure).

With gdm, the process to login for the first time almost always allows the user 
in; however, it takes a long time.

When the problem occurs, restarting the system fixes it (temporarily).

Logs of the problem (from the client side) are here: 

http://pastie.org/2827238
http://pastie.org/2832431

This is an enigmatic problem, and I've not been able to accurately/consistently 
reproduce it. Sometimes a machine is fine while sitting idle for a day, other 
times sitting idle for a couple minutes results in the above logs. Some 
machines seem to have more problems than others, and I've not been able to 
isolate the cause. 

For the most part, the problem does not seem to occur as long as there are
regular queries going to the server.
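
The dependence on idle time described above could be probed with something
like the following rough Python sketch. It times a user lookup through NSS
(via the standard pwd module) after a series of idle periods. The function
names and the idle periods are made up for illustration, and it assumes the
lookup really reaches nss-ldapd rather than being answered from a cache
such as nscd.

```python
import pwd
import time


def timed_lookup(username):
    """Look up a user through NSS and return (found, seconds_taken)."""
    start = time.monotonic()
    try:
        pwd.getpwnam(username)
        found = True
    except KeyError:
        # pwd.getpwnam raises KeyError when NSS reports "no such user".
        found = False
    return found, time.monotonic() - start


def probe(username, idle_periods):
    """Do one lookup after each idle period (in seconds); collect results."""
    results = []
    for idle in idle_periods:
        time.sleep(idle)
        found, elapsed = timed_lookup(username)
        results.append((idle, found, elapsed))
        print(f"idle={idle}s found={found} lookup_time={elapsed:.2f}s")
    return results
```

Calling probe("someldapuser", [0, 60, 300, 900]) with an account that exists
only in LDAP (the name here is a placeholder) would show whether slow or
failed lookups line up with longer idle periods.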

I've tried fiddling with 'reconnect_maxsleeptime' and 'idle_timelimit' in 
nss-ldapd.conf but it doesn't seem to have any effect upon the outcome. 
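
With 100+ clients, it may be worth confirming that the two options are
actually present in the file on each machine. The sketch below is a small,
hypothetical Python helper (not part of nss-ldapd) that pulls named options
out of "keyword value" style configuration text. The sample values are
illustrative only, and letting the last occurrence of an option win is an
assumption about how duplicates are treated.

```python
def read_options(text, wanted):
    """Return {option: value} for the wanted option names found in text."""
    found = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if parts[0] in wanted:
            # Assumption: a later occurrence replaces an earlier one.
            found[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return found


# Illustrative configuration text; the values are not recommendations.
sample = """\
# sample only
uri ldap://ldap.example.com/
idle_timelimit 60
reconnect_maxsleeptime 10
"""
options = read_options(sample, {"idle_timelimit", "reconnect_maxsleeptime"})
print(options)
```

In practice the text would be read from the client's nss-ldapd.conf instead
of the sample string.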

Is there an option I'm missing here, and if so, what might I attempt? I need to 
try to verify that the problem is not the server. I've been banging my head 
against this for entirely too long. 
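
One rough way to start separating server or network trouble from client
trouble is to time a plain TCP connection to the server from an affected
machine. The Python sketch below does only that: it does not negotiate TLS
or perform an LDAP bind, so a successful connect says nothing about the
directory service itself. The function name is made up, and the host and
port in the usage note are placeholders.

```python
import socket
import time


def tcp_connect_time(host, port, timeout=5.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return None
```

For example, tcp_connect_time("ldap.example.com", 389) run at the moment a
login is failing would show whether the server is reachable at the TCP
level at all (389 being the standard LDAP port).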

Thanks,
Ben Hodgens

