lists.arthurdejong.org
Re: very slow initialization after reboot

On Mon, 2019-11-11 at 14:03 -0500, Manhong Dai wrote:
> After reboot, the first 'id <USER>' took about two minutes and
> then failed. Then all following 'id' command work fine. During the
> two minutes of waiting period, I tcpdump-ed the packets on both the
> LDAP client and LDAP server,  but didn't detect any packets until the
> first 'id' command failed.

Hi Manhong,

The logs show that the initial connection seems to be set up but the
BIND operation takes a very long time. It is unclear to me why this
takes so long.

In any case the maximum time to wait for a response can be set with the
timelimit option. This should ensure that the process does not block
for too long. Then the reconnect logic of nslcd will kick in (see the
reconnect_sleeptime and reconnect_retrytime options).
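As a rough sketch, the relevant part of /etc/nslcd.conf could look like
the fragment below. The option names are the ones mentioned above; the
values are only illustrative assumptions and should be tuned to your
environment.

```
# /etc/nslcd.conf (fragment)
# All values below are hypothetical examples, not recommendations.

# Give up waiting for a response from the LDAP server after 30 seconds.
timelimit 30

# Seconds to sleep between connection attempts once the servers
# have been found to be unavailable.
reconnect_sleeptime 1

# Seconds after which the servers are considered permanently
# unavailable and connections are only retried once per this period.
reconnect_retrytime 10
```

nslcd has to be restarted to pick up changes to this file.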

If this can be traced to a networking or firewall issue, one way to
reset the reconnect timers is to send a SIGUSR1 signal to nslcd
(assuming you use a recent version of nss-pam-ldapd). On Debian-based
systems, for example, the /etc/network/if-up.d/nslcd file ensures that
the timers are reset every time networking is restored.
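
Sending the signal by hand could look roughly like the sketch below.
The function name is made up for illustration and the default pidfile
path is an assumption based on Debian packaging, so check where your
nslcd writes its pidfile. This is not a copy of the Debian if-up.d
script.

```shell
#!/bin/sh
# Sketch: make nslcd retry its LDAP connections right away via SIGUSR1.
# ASSUMPTION: the default pidfile path below is a guess based on Debian
# packaging; pass the real path as the first argument if yours differs.
reset_nslcd_timers() {
    pidfile="${1:-/var/run/nslcd/nslcd.pid}"
    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile; is nslcd running?" >&2
        return 1
    fi
    # kill only delivers the signal; nslcd itself resets its timers.
    kill -s USR1 "$(cat "$pidfile")"
}
```

After sourcing this, running "reset_nslcd_timers" as root (optionally
with a pidfile path as argument) should be enough.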

More ideas for debugging this further: run nslcd under strace (start it
as "strace -t -f -o /var/log/nslcd.trace nslcd -d") to see which
operation is blocking for so long, check whether any network traffic is
actually being sent, see whether ldapsearch is able to perform search
queries during the blocking time, try to connect with netcat to port
389 of the LDAP server to see if there is a networking issue, and look
at the LDAP server logs.
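
The netcat and ldapsearch checks could be scripted along these lines.
This is only a sketch: the function names are hypothetical, the host
name is whatever your LDAP server is called, the netcat flags assume a
variant that supports -z and -w, and the search assumes the server
allows anonymous reads of the root DSE.

```shell
#!/bin/sh
# Sketch: two quick checks to run while nslcd appears to be blocked.
# ASSUMPTION: the host you pass in is your own LDAP server; 389 is the
# plain LDAP port mentioned above.

# Succeeds if a TCP connection to the LDAP port opens within 5 seconds.
check_ldap_port() {
    host="$1"
    port="${2:-389}"
    nc -z -w 5 "$host" "$port"
}

# Anonymous read of the root DSE: a cheap query that needs no base DN.
# ASSUMPTION: the server permits anonymous access to the root DSE.
check_ldap_search() {
    host="$1"
    ldapsearch -x -H "ldap://$host" -b "" -s base -o nettimeout=5 \
        '(objectClass=*)' namingContexts
}
```

For example, "check_ldap_port ldap.example.com && check_ldap_search
ldap.example.com" (with ldap.example.com as a placeholder) run right
after a reboot would show whether the port is reachable and whether a
search completes while the first 'id' is still hanging.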

Hope this helps,

-- 
-- arthur - arthur@arthurdejong.org - https://arthurdejong.org/ --