lists.arthurdejong.org

[nssldap] RE: nss_ldap: reconnected to LDAP server ldap://localhost after 1 attempt


On Mon, 2009-11-23 at 10:43 +0000, Howard Wilkinson wrote: 
> Brian,

Hi Howard, 

> some version numbers might help along with the OS details.

Yeah, somewhat.  But the overriding issue is that this has been stable
(and not doing what it's doing now) for many, many months (probably over
a year's worth of them) since it was installed, and it's Ubuntu Hardy
(LTS), so it's quite established.  In fact, LTS releases come out every
two years, with the next one due in the spring, so this install is
nearly two years established, with this new behaviour only a few weeks
old.

But for versions... O/S is Ubuntu Hardy (8.04), which includes ntp
4.2.4p4+dfsg-3ubuntu2.2 and libnss-ldap 258-1ubuntu3.

> At a guess you have a resource exhaustion somewhere,

Yeah.  That's the sort of thing that occurred to me too, but this is a
pretty light-duty server with fewer clients than I have fingers, and the
load has not changed at all in years.

> have you restarted the box to check the problem still exists.

Funnily enough, I had to restart it this morning for unrelated reasons,
and yeah, it's still doing it.

> Are any of your filing systems filling up (/tmp, /var/tmp, /var/log, ...)

Nope. 

> You could look at my patches to the latest nss_ldap - they include a rewrite
> of the reconnection logic which is more robust than that available in the
> current mainstream.

Yeah.  I'm aware of your patches, but as I said, this has been stable
for nearly two years until a couple of weeks ago.  There really should
be no need to add new code to get it back to stability.  What's needed
is an analysis of why the connection has suddenly started failing so
frequently.  And we do know that it's nss_ldap that is closing the
connection, not the LDAP server.

> Finally, you might want to check that you do not have bad data in the LDAP 
> environment.

Hrm.  Perhaps.

I guess what I was really looking for was some debug output that could
be enabled in nss_ldap to tell me about connection events.  And even if
there is no debug output of that nature, I'm happy to insert some.  I
first need to be able to replicate the connection-dropping problem in a
test harness, though, and for that I need some understanding of how
nss_ldap is supposed to work.

If I were to write a little program to fetch some NSS information, is
the nss_ldap library supposed to create a new connection to the LDAP
server on a first query of NSS info (gethostbyname(), say) and then
maintain that connection for any further queries until the process dies?
That I can test quite easily I'd say.
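
A minimal sketch of the sort of little test program meant here, in
Python rather than C just to keep it short.  It assumes that passwd
lookups are routed to LDAP in /etc/nsswitch.conf, so it uses
pwd.getpwnam() where the paragraph above says gethostbyname(); the
probe() helper, its parameters and the "root" lookup name are all made
up for illustration.  It does not show what nss_ldap does with its
connection; it only gives a long-lived process to watch.

```python
import os
import pwd
import time


def probe(names, rounds=3, delay=1.0, verbose=False):
    """Look up each name through NSS `rounds` times.

    Sleeps `delay` seconds between rounds and returns a list of
    (round_number, name, found, seconds_taken) tuples.
    """
    results = []
    for round_number in range(1, rounds + 1):
        for name in names:
            start = time.monotonic()
            try:
                # pwd.getpwnam() calls into libc, so the lookup follows
                # the "passwd" line in /etc/nsswitch.conf.
                pwd.getpwnam(name)
                found = True
            except KeyError:
                # Raised when no configured NSS source knows the name.
                found = False
            elapsed = time.monotonic() - start
            results.append((round_number, name, found, elapsed))
            if verbose:
                print(f"round {round_number}: {name} found={found} "
                      f"in {elapsed * 1000:.1f} ms", flush=True)
        if round_number < rounds:
            time.sleep(delay)
    return results


if __name__ == "__main__":
    # Print the PID so the process can be inspected from another shell.
    print("pid", os.getpid(), flush=True)
    probe(["root"], verbose=True)
```

With larger values for rounds and delay, there is time to look at the
process's open sockets from another shell (lsof -p with the printed PID,
say) and to watch syslog for the "reconnected to LDAP server" message
between rounds.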

b.