Re: Preventing NSS from querying LDAP for system users

From: Ryan Steele <ryans [at] aweber.com>
To: Arthur de Jong <arthur [at] arthurdejong.org>
Cc: nss-pam-ldapd-users [at] lists.arthurdejong.org
Subject: Re: Preventing NSS from querying LDAP for system users
Date: Fri, 12 Mar 2010 16:51:28 -0500

Hi Arthur,

Thanks for responding!  :)

Arthur de Jong wrote:
> On Fri, 2010-03-12 at 12:27 -0500, Ryan Steele wrote:
>> I should preface this post by stating that I originally sent a similar
>> line of questioning to Launchpad
> [...]
> 
> No problem. I sometimes get the feeling no one looks after the stuff on
> launchpad. :(
> 
>> I see that in version 0.3 and 0.5 respectively, support for the
>> nss_initgroups/nss_initgroups_ignoreusers and bind_policy options were
>> removed from libnss-ldapd.
> 
> This is correct. For bind_policy a simpler retry mechanism was put in
> place using the bind_timelimit, timelimit, reconnect_sleeptime and
> reconnect_maxsleeptime options.
> 
> The reason the nss_initgroups options were in the original nss_ldap was
> mainly due to the complex searches needed to determine group membership
> for users and the problems during booting. nss-pam-ldapd implements
> simpler and faster searches for username to group membership lookups and
> has simpler semantics during boot.
> 
> I think the nss_initgroups option is also ugly because for it to work
> correctly you have to list all local users there.

The problem, however, is that now things like sudo don't work when LDAP
is unavailable.  Because the root user is not in LDAP (and rightly so),
proxycache cannot cache any of its information.  So when LDAP is
unavailable, sudo will never work, unless you set a really lengthy
timeout so that you have enough time to parse each line, make bind
attempts for each line, and wait for each line to fail by timing out.
Sure, technically it's possible to do that, but it's pretty unusable
that way.  And of course, there's then the huge downside of having to
wait for a looong time to log in and get a shell as a non-LDAP user if
the system cannot reach the LDAP server.

It just seems to me like this simple replacement mechanism might be
prettier, but at the cost of functionality.  The
nss_initgroups_ignoreusers option didn't just help you avoid boot-time
deadlock; it also made interacting with non-LDAP accounts much easier
when LDAP was unavailable.
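
To illustrate what I mean, here is a rough Python model of the
short-circuit that nss_initgroups_ignoreusers gave us.  It is only a
sketch, not nss_ldap's actual code: the function names, the sample
group data and the fake "LDAP is down" lookup are all made up for the
example.

```python
from typing import Callable, Iterable, List, Set


def supplementary_groups(
    user: str,
    ignore_users: Set[str],
    local_lookup: Callable[[str], Iterable[str]],
    ldap_lookup: Callable[[str], Iterable[str]],
) -> List[str]:
    """Return a user's groups, skipping LDAP for users on the ignore list."""
    groups = list(local_lookup(user))
    if user in ignore_users:
        # Hypothetical equivalent of the module answering "not found"
        # right away, without ever contacting the LDAP server.
        return groups
    groups.extend(ldap_lookup(user))
    return groups


def unreachable_ldap(user: str) -> Iterable[str]:
    # Stand-in for an LDAP server that is down.
    raise TimeoutError("LDAP server unreachable")


# Illustrative data only.
local_groups = {"root": ["root"], "www-data": ["www-data"]}
ignored = {"root", "daemon", "www-data"}

root_groups = supplementary_groups(
    "root", ignored, lambda u: local_groups.get(u, []), unreachable_ldap
)
print(root_groups)
```

In this model root still gets its local groups even though the LDAP
lookup would time out, while a user who is not on the list would hit
the timeout.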

> 
>> However, without those options, I'm not sure how to prevent NSS from
>> querying LDAP for users which aren't in LDAP, which can cause a lot of
>> trouble both when the LDAP server is available, by inundating it with
>> requests for which it will issue only negative responses, and when the
>> LDAP server is unavailable, where at best, you have brief
>> interruptions while you wait for one of the various and sundry timeout
>> option thresholds to be reached. The obvious result is that system
>> users lose the ability to operate without being hindered by
>> unsuccessful NSS lookups, which can drive the load up as processes
>> stack up in wait time.
> 
> You cannot prevent NSS lookups from hitting LDAP with just the NSS
> module (even with nss_initgroup options). The simplest solution is to
> use nscd for caching. You can combine this with lowering the timing
> options mentioned before.
> 

Unfortunately, nscd is not a good solution.  It is fraught with many
problems, and in addition was clearly not designed with security in
mind (doesn't work with TLS/SSL).

> If you want a more reliable caching solution that should also work
> completely off-line you can use the nssov slapd overlay to set up a copy
> of your LDAP server on your workstation (using replication).
> 

Once you start doing this, you're getting into the realm of "it's
probably just easier not to use LDAP at all"; the whole point of using
LDAP is to be able to centralize user management.  Once you start
setting up complete replicas on every single one of your clients,
you're decentralizing the architecture to some extent.  This may not
seem like a big deal because OpenLDAP is so efficiently designed, but
as an analogy, can you imagine what people would say if you told them
you had to install a fully-fledged Active Directory server on every
single Windows client?  People would look at you like you were smoking
something.

>> Given that these options are no longer available to those of us who
>> wish to use libnss-ldapd instead of libnss-ldap or nssov, what do the
>> package authors/maintainers/other users suggest to circumvent or
>> otherwise prevent lookups from being made for users who exist locally
>> (root, daemon, www-data, et al.)?
> 
> The problem here is that this is hard to do. There will always be
> processes that request all users or groups in the system. One of these
> occasions is when determining which groups a user is in (e.g. on login).
> Since it is perfectly legal to have local users in LDAP groups and vice
> versa excluding this always is a bad idea.
> 
> Also if you want to test for that condition (for the username to groups
> lookup) you first have to determine whether a user is a local user or an
> LDAP user. This also requires a lookup.
> 

But this is exactly the thing nss_initgroups_ignoreusers solved.  It
is, in my opinion, far uglier and far more work to have to put a bunch
of system users in LDAP just so they're usable if LDAP is unavailable
than it is to have a one-line configuration option to prevent this
problem entirely.  On Ubuntu, there's even a utility
(nss-update-ignoreusers, iirc) that does this for you based on a UID
threshold (nss_initgroups_minimum_uid or something to that effect).
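
For what it's worth, the idea behind that kind of utility can be
sketched in a few lines of Python.  This is not the Ubuntu tool itself;
the function name, the default threshold of 1000 and the sample passwd
entries (including the "ryan" account) are assumptions for
illustration.

```python
from typing import Iterable, List


def ignoreusers_line(passwd_lines: Iterable[str], minimum_uid: int = 1000) -> str:
    """Build an nss_initgroups_ignoreusers line from passwd(5)-style entries."""
    names: List[str] = []
    for line in passwd_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue  # skip malformed entries rather than guess
        if int(fields[2]) < minimum_uid:
            names.append(fields[0])
    return "nss_initgroups_ignoreusers " + ",".join(names)


# Sample entries in /etc/passwd format (illustrative only).
sample = [
    "root:x:0:0:root:/root:/bin/bash",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin",
    "www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin",
    "ryan:x:1000:1000:Ryan:/home/ryan:/bin/bash",
]
print(ignoreusers_line(sample))
```

On a real system you would feed it the lines of /etc/passwd and put the
resulting line into the nss_ldap configuration file.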

> Again, nscd will cache some lookups (simple username->uid and
> groupname->gid lookups, etc) but not all (list all users or determine
> the groups this user is a member of if I remember correctly).

True, but again, nscd has too many problems and too little security for
me to be comfortable with it, and it is not recommended by the OpenLDAP
developers.

> [...]
>> Given those two options, I would regrettably have to choose the
>> former, because I would rather have my system services (webserver,
>> root cronjobs, databases) available so that public-facing services
>> like websites and databases aren't negatively affected if my LDAP
>> servers become unavailable for some reason.  I hate having this
>> ultimatum - has anybody else found an answer to this problem?
> 
> Again, if you want a very reliable system with NSS lookups error-free
> even when the network is down, nssov is probably the best route. I have
> also experimented with nss_db and nss-updatedb in the past (which is a
> small improvement over nscd I believe) but I don't have up-to-date
> experience with that.

I agree, but I was hoping to use libnss-ldapd/libpam-ldapd because I
feel it is more of a "client-side solution".  Despite its size and
efficiency, having to install slapd with a bunch of overlays everywhere
is kind of like putting in a thumbtack with a sledgehammer.

> Personally, I would use nscd and lower the values for bind_timelimit and
> reconnect_maxsleeptime so that small outages are handled gracefully
> (information from cache is served) and longer outages do not cause too
> long delays in lookups (btw with version 0.7.0 the default timeout
> values were lowered but they are still pretty high).

I would probably rather use slapd + back-ldap + pcache + nssov (geez,
that's a lot for a client...) than subject myself to dealing with nscd
- it just detracts too much from security and adds a slew of its own
problems.  Quoting an OpenLDAP developer, "it just doesn't work."

> However, I would welcome a patch that adds an option to not perform
> username to groups lookups for local users (this is not trivial though).

Is it that hard or objectionable to add the nss_initgroups_* options
back in?  It certainly solved the problem for me, and had a lot of
positive side effects.  Before I used that option, each of my ~220
servers was making TONS of requests to my LDAP server: every single
user, every single cronjob, every single ssh command, every single ps,
every single rsync, and so on, was making individual connections and
requests.  That overwhelmed the LDAP server until its load shot through
the roof and it stopped responding, and then all my systems started
hanging because they couldn't get answers from NSS.  Once I stopped all
the lookups from the non-LDAP users with the nss_initgroups_ignoreusers
option, the problem disappeared.  And yes, I had tried things like
bind_policy soft, but that was only a band-aid for when the LDAP server
stopped responding because of the load.  Although I have increased the
number of LDAP servers, I am (understandably, I think) nervous about
subjecting my LDAP server to that kind of load again.  So again, unless
you have some deep philosophical objection to re-adding those
nss_initgroups_* options, is there a reason they can't be re-added to
reduce the sheer volume of requests being made to the LDAP servers?

> Anyway, thanks for raising this issue.

No problem, thanks for taking the time to write the code!
