lists.arthurdejong.org

nslcd and syslog issues



Just to have this recorded somewhere, perhaps it is useful to someone.

At work we ran into a deadlock issue when combining nss-pam-ldapd
configured to do hostname lookups via LDAP and syslogd which accepts
remote logging requests (using the -r option).

A deadlock can occur when the LDAP server is (temporarily) unavailable
(and possibly also in other circumstances). If a remote syslog message
comes in (perhaps from another host that also noticed the LDAP server is
unavailable) syslogd tries to do a reverse hostname lookup for the
incoming request. This hostname lookup is directed to nslcd.

If nslcd then tries to log a message to syslog (perhaps that the LDAP
server is down) this results in a deadlock condition. The reason for
this is that glibc's syslog() is using blocking I/O without a time-out
mechanism and syslogd is single-threaded and is already servicing a log
message. Now nslcd is waiting for syslogd (cannot write to /dev/log) and
syslogd is waiting for nslcd. Eventually the nslcd request times out (60
seconds in the default configuration) and syslogd continues.
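
As a rough illustration of the circular wait described above (a toy
model, not code from nslcd or sysklogd), the situation can be written
down as a "waits for" table and checked for a cycle. The function name
and the table are made up for this sketch.

```python
def find_wait_cycle(waits_for, start):
    """Follow 'X waits for Y' edges from start.

    Returns the cycle as a list if the chain loops back on itself,
    or None if the chain ends at a process that is free to run.
    """
    path = []
    current = start
    while current is not None and current not in path:
        path.append(current)
        current = waits_for.get(current)
    if current is None:
        return None
    return path[path.index(current):] + [current]


# Hypothetical model of the scenario in this message.
waits_for = {
    "syslogd": "nslcd",   # blocked in a reverse hostname lookup
    "nslcd": "syslogd",   # blocked in syslog(), writing to /dev/log
    "slapd": "syslogd",   # any other logging daemon queues up behind syslogd
}

print(find_wait_cycle(waits_for, "slapd"))
```

Starting from slapd the chain runs into the syslogd/nslcd loop, which is
why unrelated daemons hang as well. In the real system the loop is only
broken when the nslcd request times out.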

The result is that any process which logs through syslog (almost any
daemon) hangs until syslogd is available again. We've seen quite a lot
of things stop working (e.g. slapd). The only way out is to restart
syslogd.

Some more information (not specific to nss-pam-ldapd) can be found here:
  http://lkml.org/lkml/2005/3/26/37

The above was observed using syslogd from the sysklogd Debian package.
Other single-threaded syslog implementations may also be affected by
this. Using a multi-threaded syslogd (like rsyslog) works around this
issue.

Suggestions for a better solution than using a multi-threaded syslogd
are more than welcome. Making syslog() not block indefinitely will not
help much because it would still cause significant slowdowns and/or loss
of log messages. Lowering the time-out between the NSS module and nslcd
could be an option but some normal lookups can actually take quite some
time.
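
To make the trade-off of a non-blocking syslog() concrete, here is a
minimal sketch of a logging call that gives up after a time-out. It is
not how glibc's syslog() works; the function name try_log, the 0.5
second time-out and the assumption that the log socket is a datagram
socket are all mine.

```python
import socket


def try_log(message, path="/dev/log", timeout=0.5):
    """Send one datagram to a syslog socket, giving up after `timeout` seconds.

    Returns True if the message was handed to the socket and False if it
    was dropped.
    """
    # <14> is facility "user" (1) * 8 + severity "info" (6).
    data = ("<14>" + message).encode("utf-8")
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.settimeout(timeout)
        sock.sendto(data, path)
        return True
    except OSError:
        # Covers a time-out, a missing socket and a socket of another type.
        return False
    finally:
        sock.close()
```

The caller no longer hangs, but every False return is a lost log
message, and each attempt can still cost up to the time-out. That is the
slowdown and message loss mentioned above.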

Having magic in nslcd to not log to syslog when a request from syslogd
comes in is not a good approach because it is very hard to detect the
requesting process in a platform-independent way and it special-cases
syslogd (there are probably plenty of different syslogd implementations
out there).

From glancing through the source it seems that sysklogd's syslogd only
does hostname lookups (forward and reverse but not username, group or
otherwise) so this should only be a problem if you have ldap for hosts
in /etc/nsswitch.conf.
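
A quick way to check whether a machine is exposed is to look at the
hosts line. The following is a small sketch that parses the text of
nsswitch.conf; the helper names are hypothetical and it only handles
the common "hosts:" layout.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the sources listed on the hosts: line, in order.

    Comments and [STATUS=action] items are ignored.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line.startswith("hosts:"):
            rest = re.sub(r"\[[^\]]*\]", " ", line[len("hosts:"):])
            return rest.split()
    return []


def hosts_use_ldap(nsswitch_text):
    """True if hostname lookups are configured to go through ldap."""
    return "ldap" in hosts_sources(nsswitch_text)
```

It can be fed the real file with something like
hosts_use_ldap(open("/etc/nsswitch.conf").read()).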

-- 
-- arthur - arthur@arthurdejong.org - http://arthurdejong.org/ --