
Re: Preventing NSS from querying LDAP for system users

Hi Arthur,

Thanks again for being so responsive. It's nice to be able to maintain such a pleasant and open line of communication with a package maintainer.

Arthur de Jong wrote:
On Fri, 2010-03-12 at 16:51 -0500, Ryan Steele wrote:
The problem, however, is that now things like sudo don't work when
LDAP is unavailable.

I have done some tests and if you lower the reconnect_maxsleeptime
option the delay is limited.

I stopped nscd and used sudo as a local user with the LDAP server
unavailable. With reconnect_maxsleeptime set to 3 seconds the delay is 2
to 4 seconds (even lower in some cases).

Hm.... I wonder what the difference is, then, between your setup and my own. In my case, that 2-4 second delay applies to every single non-commented line in the sudoers file, because a bind is attempted each time to check LDAP for group memberships and authentication/authorization. So I end up with (2-4 seconds) * (20 or so lines) = a 40-80 second delay. That is far longer than the sleep/timeout thresholds I have set, which are there to reduce the delay local users see while they wait for NSS connections to fail and authentication to fall back to the /etc flat files (e.g., for things like SSH).
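The arithmetic above can be sketched as follows. This is only an illustration of the estimate in the message, assuming one failed lookup per active sudoers line; the function name `total_delay` is hypothetical, and the figures are the ones quoted above rather than new measurements.

```python
# Hypothetical model: each non-comment sudoers line triggers one lookup
# that has to wait for the LDAP connection attempt to fail.
def total_delay(lines, per_lookup_seconds):
    """Return the cumulative wait in seconds for `lines` failed lookups."""
    return lines * per_lookup_seconds


sudoers_lines = 20       # "20 or so" active lines, as stated in the message
best, worst = 2, 4       # per-lookup delay range in seconds, as stated

low = total_delay(sudoers_lines, best)
high = total_delay(sudoers_lines, worst)
print(f"{low}-{high} seconds")
```

The point of the sketch is that the per-lookup timeout only bounds a single request; the total wait scales with the number of lookups sudo performs.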

The reason for this is that nslcd keeps internal state on the
availability of the LDAP server and will only try once for every request
if the LDAP server was down before (and the retry-mechanism has already
timed out there). If there are multiple requests (sudo seems to want to
know the groups the root user is in a couple of times for some reason)
some requests fail faster than the first request.
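A toy model of the behaviour described above might look like the sketch below. It is not nslcd's code: the class name `AvailabilityTracker`, the `max_attempts` parameter, and the use of attempt counts instead of the real time-based retry settings are all assumptions made for illustration.

```python
class AvailabilityTracker:
    """Toy model: remember whether the server was unreachable last time."""

    def __init__(self, max_attempts):
        # Attempts allowed while the server is still believed to be up
        # (hypothetical stand-in for the time-based retry settings).
        self.max_attempts = max_attempts
        self.known_down = False

    def request(self, connect):
        """Try to connect; return (ok, attempts_made)."""
        # If the server was already down, try only once instead of
        # going through the full retry cycle again.
        budget = 1 if self.known_down else self.max_attempts
        for attempt in range(1, budget + 1):
            if connect():
                self.known_down = False
                return True, attempt
        self.known_down = True
        return False, budget


def always_down():
    """Stand-in for a connection attempt to an unreachable server."""
    return False


tracker = AvailabilityTracker(max_attempts=3)
first = tracker.request(always_down)    # full retry cycle
second = tracker.request(always_down)   # single attempt, fails faster
print(first, second)
```

In this model the first request pays for the whole retry cycle and later requests fail after one attempt, which matches the observation that some requests fail faster than the first.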

nscd also seems to cache username to groups lookups in this case so it
should also help when the LDAP server is unavailable.

FWIW, if I understand what you're saying correctly, the memberof slapd overlay can do this as well.

Unfortunately, nscd is not a good solution.  It is fraught with many
problems, and in addition was clearly not designed with security in
mind (doesn't work with TLS/SSL).

With nss-pam-ldapd there should be no security implications whether nscd
is used or not (as far as I'm aware). There are some problems with nscd
(see e.g. [0] and [1]) but it shouldn't affect your connection to the
LDAP server.

Also, nscd is a good solution to lower the load on your LDAP server. It
is a cache for performance, not for off-line operation.

There may be a problem in unreliable setups though: nscd also caches
negative hits which means that if LDAP is unavailable it may cache that
some user (which could exist in LDAP) does not exist.

Well, there are a lot of documented issues with nscd: it corrupts its cache and crashes, it causes service outages through inefficiencies that max out the number of permitted open file descriptors, it has a litany of independent historical issues that make it consume CPU cores and dominate CPU cycles, its serial processing of DNS requests is dog-slow, and the list goes on and on. That is why the OpenLDAP developers recommend against its use. My biggest gripe with it, though, is that it does not seem to work with TLS and SSL when the LDAP server is unavailable. Specifically, if the LDAP server becomes unavailable, you cannot use TLS/SSL interchangeably with nscd. In fact, with nscd, every time you change security strength factors, you have to blow away the local nscd database for it to start working again. For folks who allow both plaintext and TLS/SSL connections (I require a minimum SSF of 128 bits), the same issue applies: if nscd caches the plaintext auth credentials and you then lose the connection to the LDAP server, you are *forced* to use plaintext auth when dealing with the nscd cache until you delete it, which is just not acceptable. I hate to rail against nscd so much, but there are a bunch of little quirks like that which make it a real headache to deal with. Given the choice between that and slapd/back-ldap/pcache[/nssov] (which were designed explicitly to remove the need for nscd and tend to work much better), I definitely have to lean toward the latter. And as far as caching negative hits goes, the proxycache overlay gives you the option of doing that too if you so choose, instead of forcing the issue, which is something nscd tends to do a lot, as mentioned above.

Once you start doing this, you're getting into the realm of "it's
probably just easier not to use LDAP at all"; [...] can you imagine
what people would say if you told them you had to install a
fully-fledged Active Directory Server on every single Windows client?
People would look at you like you were smoking something.

Apparently slapd can be set up to be really lightweight but I have yet
to do some experimenting with such a set-up.

If you want reliability and the connection to your LDAP server is not
reliable, you need to cache or replicate the data to a point from where
you do have reliable access.

Yeah, it is very lightweight (Howard Chu even reports having it run on his Google G1 smartphone with something like a scant 1.5MB footprint). My objection was more to the principle of having to set up a server, complete with cn=config hierarchy and overlays, on every single client instead of being able to just use something *really* lightweight and client-oriented like ldap.conf (i.e., with PADL's nss-ldap/pam-ldap). It's not that my connection is unreliable; I just want to cover edge cases that could have major effects on my user/customer base. Maybe it's just years of being a sysadmin and being concerned with uptime (for which I've heard terms like "practical paranoia" thrown around), but things like that always jump to the forefront of my mind when configuring core pieces of an infrastructure like directory services.

So again, unless you have some deep philosophical objection to
re-adding those nss_initgroups_* options, is there a reason that it
can't be re-added to reduce the sheer volume of requests being made to
the LDAP servers?

An nss_initgroups_ignoreusers option could be implemented but I don't
think it solves the underlying problem here (it basically hides a

I agree, but without a cleaner, more effective alternative, it seems like the best "band-aid" to me. IMHO, the real problem is the lack of an easier mechanism for interrupting or aborting the chain of evaluation for NSS resources (e.g. files, ldap, nis) once a result has been successfully obtained. Even if one understands that doing so might not give a "complete picture", it would still give one the option of solving this issue without all these so-called "band-aids" that mask the symptoms without treating the root cause. I understand that the RFCs state that to be "truly comprehensive", NSS must collate the results from every name service specified in nsswitch.conf. But given the number of people with this problem, and the amount of effort that has been spent developing solutions that operate on the periphery and simply mask the behavior (nss_initgroups_*, bind_policy, various individual and intertwined timeout and sleep parameters), I'm surprised that such a facility has never been seriously discussed (AFAIK) in the context of nsswitch.conf. Something simple like a [found=stop] lookup reaction would be so trivial in comparison...
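The difference between collating every source and stopping at the first hit can be sketched like this. It is a conceptual model only, not glibc's NSS implementation: `lookup_groups`, the `stop_on_found` flag, and the two stand-in backends are hypothetical names, and [found=stop] is the syntax proposed above, not an existing nsswitch.conf action.

```python
def lookup_groups(user, sources, stop_on_found=False):
    """Collect groups for `user` from each source in order.

    `sources` is a list of (name, callable) pairs. Each callable returns a
    list of groups, or raises ConnectionError if the backend is unreachable.
    Returns (groups, names_of_sources_queried).
    """
    groups, queried = [], []
    for name, source in sources:
        queried.append(name)
        try:
            found = source(user)
        except ConnectionError:
            continue  # unreachable backend: move on to the next source
        groups.extend(found)
        if found and stop_on_found:
            break     # hypothetical "found=stop" behaviour
    return groups, queried


def files(user):
    """Stand-in for the local flat files."""
    return {"root": ["root"], "backup": ["backup"]}.get(user, [])


def ldap(user):
    """Stand-in for an LDAP backend that is currently down."""
    raise ConnectionError("LDAP server unreachable")


chain = [("files", files), ("ldap", ldap)]
merged = lookup_groups("root", chain)
short = lookup_groups("root", chain, stop_on_found=True)
print(merged, short)
```

With collation, the unreachable LDAP backend is still contacted for a purely local user; with the stop-on-found variant it is never queried, at the cost of possibly missing groups that exist only in LDAP.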

Anyway, attached is a patch (against svn but not yet in svn) that
implements this option. Testing and feedback are welcome. One known
issue (that I'm going to ignore) is that username comparison is
case insensitive. So if you add a joe to nss_initgroups_ignoreusers and
have a Joe LDAP user, lookups for Joe would not return any LDAP groups.
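The known issue can be illustrated with a small sketch. It does not reproduce the patch: the function `ldap_groups_for`, the use of `str.lower()` for the comparison, and the sample directory are assumptions chosen only to show the effect of a case-insensitive match.

```python
def ldap_groups_for(user, ignore_users, ldap_lookup):
    """Return LDAP groups for `user`, skipping ignored users entirely.

    The ignore list is matched case-insensitively, mirroring the known
    issue described above (modelled here with lower(), an assumption).
    """
    ignored = {name.lower() for name in ignore_users}
    if user.lower() in ignored:
        return []                 # no LDAP query is made for ignored users
    return ldap_lookup(user)


# Hypothetical directory contents: "Joe" exists in LDAP, "joe" is local.
directory = {"Joe": ["staff"], "ann": ["dev"]}


def fake_ldap(user):
    return directory.get(user, [])


joe_groups = ldap_groups_for("Joe", ["joe"], fake_ldap)
ann_groups = ldap_groups_for("ann", ["joe"], fake_ldap)
print(joe_groups, ann_groups)
```

Here the LDAP user Joe loses his LDAP groups because the local name joe is on the ignore list, which is exactly the side effect being accepted.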

Duly noted, thanks for making mention of it. Just out of curiosity, why the decision to ignore it? I'm fine with that (and could always patch it locally if I decided otherwise), just a little inquisitive is all. :)

Note that a special value ALLLOCAL was introduced. This adds all
non-LDAP users to this list (suggestions for a better name are welcome).

Works for me. FWIW, on Ubuntu (and this may or may not be true of Debian and its other derivatives as well), there is a utility called nss-update-ignoreusers which will add all users below an arbitrary uid (specified by the nss_initgroups_minimum_uid option) to the file. But your option is probably better, since it a) is more portable and b) does not force your local users to fall within a specific uid range (the 'nobody' user in particular comes to mind, which is typically 65534).
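The two ways of building the ignore list can be compared with a short sketch. It models the idea only: the sample passwd data, the threshold of 1000, and the helper names are hypothetical, and neither nss-update-ignoreusers nor the ALLLOCAL patch is assumed to work exactly this way.

```python
# Hypothetical passwd-format data for illustration only.
PASSWD = """\
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
"""


def local_users(passwd_text):
    """Parse passwd-format text into (name, uid) pairs."""
    users = []
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        users.append((fields[0], int(fields[2])))
    return users


def below_uid(users, minimum_uid):
    """uid-threshold approach: ignore only users under `minimum_uid`."""
    return [name for name, uid in users if uid < minimum_uid]


def all_local(users):
    """ALLLOCAL-style approach: ignore every locally defined user."""
    return [name for name, _ in users]


users = local_users(PASSWD)
by_uid = below_uid(users, 1000)   # 1000 is an assumed threshold
everyone = all_local(users)
print(by_uid, everyone)
```

The threshold approach misses nobody because its uid sits far above the cutoff, while the all-local approach covers it without depending on any uid range.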

Cheers, and again, thanks for making yourself available for dialogue and input!
