Re: Requests block when ldap server is not available
- From: duncan-lists [at] uniqfeed.com
- To: Arthur de Jong <arthur [at] arthurdejong.org>
- Cc: "nss-pam-ldapd-users [at] lists.arthurdejong.org" <nss-pam-ldapd-users [at] lists.arthurdejong.org>
- Reply-to: duncan-lists [at] uniqfeed.com
- Subject: Re: Requests block when ldap server is not available
- Date: Wed, 6 May 2020 06:59:55 +0200 (CEST)
Hi Arthur,
Thank you very much for your prompt reply.
May I include the strace of what is happening for the command id <localuser>:
07:20:37 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nslcd/socket"}, 23) = 0
07:20:37 poll([{fd=3, events=POLLOUT}], 1, 10000) = 1 ([{fd=3, revents=POLLOUT}])
07:20:37 sendto(3, "\0\0\0\2\0\4\0\2\0\0\1\366", 12, MSG_NOSIGNAL, NULL, 0) = 12
07:20:37 poll([{fd=3, events=POLLIN}], 1, 60000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
07:20:57 read(3, "\0\0\0\2\0\4\0\2", 1024) = 8
07:20:57 poll([{fd=3, events=POLLIN}], 1, 60000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
What is happening during the 20 seconds of the poll? It looks like the process
is waiting for the attempt to reach the LDAP server to time out, which takes
20 seconds.
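To make the request in the trace easier to read, the 12 bytes passed to sendto can be split into three big-endian 32-bit integers. This is only a minimal sketch: the labels version, action and argument are assumed names for the three fields, not something the trace itself states. Reading the second value as a group lookup by numeric gid is likewise an assumption that has not been checked against the nslcd sources.

```python
import struct

# Payload copied from the sendto() call in the trace (strace prints octal escapes).
request = b"\0\0\0\2\0\4\0\2\0\0\1\366"


def decode_request(payload: bytes) -> tuple:
    """Split a 12-byte payload into three big-endian signed 32-bit integers."""
    return struct.unpack(">iii", payload)


# The three names below are assumed labels for the fields, not confirmed ones.
version, action, argument = decode_request(request)
print(f"version={version} action=0x{action:08x} argument={argument}")
```

The 8 bytes returned by the later read repeat the first two values, so the reply appears to echo the request header before any result data follows.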
A bit of history that may help: we have grown and ran out of IP addresses on
the main subnet, so I moved the hosts that travel out of the office to a new
subnet. Before the new subnet, when the hosts were not connected to the LAN, we
saw unanswered ARP requests for the LDAP server; these did not lead to a
timeout, and id ran fast. With the new subnet, the traffic goes to the default
gateway and times out. If this happened just once it wouldn't be noticed, but
it happens all the time.
> You can tweak the various timeout settings in nslcd.conf:
> https://arthurdejong.org/nss-pam-ldapd/nslcd.conf.5#timing_reconnect_options
>
> The defaults should already be reasonably sane. Decreasing
> bind_timelimit (and perhaps timelimit) should ensure that errors are
> reported quicker.
Looking at the man page, 20 seconds seems correct for 2 LDAP servers. I will try
reducing bind_timelimit and timelimit, but I don't understand
reconnect_retrytime. Should this be set?
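The 20-second figure can be checked with some rough arithmetic. The sketch below assumes that each configured server is tried in turn and that every attempt waits the full bind_timelimit before giving up. It also assumes a default bind_timelimit of 10 seconds. The function name worst_case_delay is made up for illustration and is not part of nslcd.

```python
def worst_case_delay(num_servers: int, bind_timelimit: int) -> int:
    """Upper bound in seconds, assuming servers are tried one after another
    and each attempt blocks for the full bind_timelimit."""
    return num_servers * bind_timelimit


# Two servers, as in this setup; 10 is the assumed default bind_timelimit.
for limit in (10, 5, 2):
    print(f"bind_timelimit={limit}: up to {worst_case_delay(2, limit)} s")
```

Under these assumptions, lowering bind_timelimit to 2 would cap the wait at about 4 seconds for two unreachable servers.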
Thank you and kind regards,
Duncan