Re: error writing to client: broken pipe nslcd

On Fri, 2013-01-25 at 13:32 +0100, Marcus Moeller wrote:
> >>> From what I have read the error 'writing to client: broken pipe
> >>> nslcd' could appear in normal operations with huge groups which
> >>> cannot be read in one go.
> >>>
> >>> Is there a way to work around this problem, because it creates a large
> >>> load on our Domain Controllers and leads to lags on the client side.

The message "error writing to client: broken pipe" can occur if the NSS
module closes the connection before nslcd has finished writing
everything. The most common case is when nslcd is writing group
information with a lot of members and the buffer that the NSS module got
from Glibc is too small.
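
To illustrate the buffer situation described above, here is a minimal sketch of how a caller of the reentrant Glibc interface typically reacts to a buffer that is too small: getgrgid_r() reports ERANGE, the caller enlarges the buffer and asks again. The helper name count_group_members, the doubling strategy and the 64 MiB cap are assumptions made for this example; this is not code from nss-pam-ldapd or from Glibc itself.

```c
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: look up a group by gid, growing the caller-side
   buffer each time getgrgid_r() reports ERANGE. Returns 0 on success,
   ENOENT if no such group exists, or another errno value on failure.
   *attempts receives the number of getgrgid_r() calls made; every call
   after the first is a retry that goes through the NSS module again. */
int count_group_members(gid_t gid, size_t buflen, size_t *members, int *attempts)
{
    const size_t max_buflen = (size_t)64 * 1024 * 1024; /* arbitrary cap */
    struct group grp, *result = NULL;
    char *buf = NULL;
    int rc;

    *members = 0;
    *attempts = 0;
    if (buflen == 0)
        buflen = 1;
    for (;;) {
        char *tmp = realloc(buf, buflen);
        if (tmp == NULL) {
            free(buf);
            return ENOMEM;
        }
        buf = tmp;
        (*attempts)++;
        rc = getgrgid_r(gid, &grp, buf, buflen, &result);
        if (rc != ERANGE || buflen >= max_buflen)
            break;
        buflen *= 2; /* buffer too small: enlarge and retry */
    }
    /* A zero return with a NULL result means "no such group". */
    if (rc == 0 && result == NULL)
        rc = ENOENT;
    if (rc == 0 && grp.gr_mem != NULL)
        for (char **m = grp.gr_mem; *m != NULL; m++)
            (*members)++;
    free(buf);
    return rc;
}
```

Calling this with a deliberately small starting size, for example count_group_members(gid, 1024, &members, &attempts), and looking at attempts afterwards gives a rough idea of how many times a large group would be requested before the buffer is big enough.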

> >> I was under the impression it had been fixed upstream already? We've
> >> backported a fix for this issue into RHEL6.4 anyway.
> >
> > At least not in nss-pam-ldapd-0.8.12-2.fc19.x86_64.rpm

The fix that went into 0.8.7 tries to avoid this situation in most cases
by emptying the data stream before closing the connection. This means
that the message should only be shown with misbehaving applications.
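
As a rough illustration of what "emptying the data stream before closing the connection" can look like on the reading side, the sketch below discards any remaining data and only then closes the descriptor, so the writer does not hit a closed socket halfway through. The function name drain_and_close and the max_bytes limit are made up for this example; the actual code in nss-pam-ldapd is not shown here and may well differ (for instance, a real implementation would likely also want a timeout so a slow writer cannot block the caller).

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read and discard whatever the peer still sends,
   up to max_bytes, then close the descriptor. Returns the number of
   bytes discarded, or -1 if read() failed. The descriptor is closed
   in either case. */
ssize_t drain_and_close(int fd, size_t max_bytes)
{
    char scratch[4096];
    size_t total = 0;
    ssize_t n = 0;

    while (total < max_bytes) {
        size_t want = max_bytes - total;
        if (want > sizeof(scratch))
            want = sizeof(scratch);
        n = read(fd, scratch, want);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: try again */
        if (n <= 0)
            break;              /* end of stream or a real error */
        total += (size_t)n;
    }
    close(fd);
    return n < 0 ? -1 : (ssize_t)total;
}
```

The limit keeps a reader from consuming an unbounded amount of data it is going to throw away anyway; what a sensible limit is depends on the protocol.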

> Btw. I was talking about a group with 45717 members. Not sure if that 
> matters.

The change in 0.8.7 only fixes the communication problem and the error
messages on the nslcd side. If a Glibc buffer overflows, the request is
still retried. This could account for more LDAP lookups than strictly
necessary.

There is some code in place to not do a complete new lookup if we are
listing all groups in the system but this can't be easily done when
looking up one entry.

Another problem could be that the 45717 members do not fit into the
internal buffer in the NSS module. This buffer is currently 2 MByte max
which should hold that many members unless they have very long
usernames.
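
The estimate above can be checked with a quick calculation. The sketch below assumes that 2 MByte means 2 * 1024 * 1024 bytes and ignores any per-entry overhead in the buffer (terminators, pointers and so on), which is not specified here; the function name bytes_per_member is made up for the example.

```c
#include <stddef.h>

/* Average number of bytes available per member if a buffer of
   buffer_bytes had to hold member_count members. Integer division
   rounds down. Returns 0 for an empty group. */
size_t bytes_per_member(size_t buffer_bytes, size_t member_count)
{
    if (member_count == 0)
        return 0;
    return buffer_bytes / member_count;
}
```

Under those assumptions, bytes_per_member((size_t)2 * 1024 * 1024, 45717) gives 45, so on average roughly 45 bytes are available per member before the buffer would be full.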

Do you know for which requests these messages are happening often (run
nslcd in debug mode to find out)?

A way to (perhaps) mitigate this is to run nscd (or unscd). I think the
buffers that Glibc uses are then not shrunk too often, so the initial
buffer would be large enough quickly. At the very least it provides
caching for some lookups (which should also decrease the load on your
LDAP server).

Hope this helps. If you can provide more information I would be very
interested.

Thanks.

-- 
-- arthur - arthur@arthurdejong.org - http://arthurdejong.org --