lists.arthurdejong.org

Re: Requests block when ldap server is not available


Hi Arthur,

Thank you very much for your prompt reply. 

Here is the strace output for the command id <localuser>:

07:20:37 connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nslcd/socket"}, 23) = 0
07:20:37 poll([{fd=3, events=POLLOUT}], 1, 10000) = 1 ([{fd=3, revents=POLLOUT}])
07:20:37 sendto(3, "\0\0\0\2\0\4\0\2\0\0\1\366", 12, MSG_NOSIGNAL, NULL, 0) = 12
07:20:37 poll([{fd=3, events=POLLIN}], 1, 60000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
07:20:57 read(3, "\0\0\0\2\0\4\0\2", 1024) = 8
07:20:57 poll([{fd=3, events=POLLIN}], 1, 60000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])

What is happening during the 20 seconds of the poll? It looks like the request 
is waiting on an attempt to reach the LDAP server, which times out after 20 
seconds.
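
For reference, the 12 bytes passed to sendto in the trace can be unpacked as 
three big-endian 32-bit integers. This is only a minimal sketch that decodes 
the bytes as they appear above. Reading the three values as a protocol 
version, an action code and a numeric ID is an assumption about the nslcd 
wire format and is not confirmed anywhere in this thread.

```python
import struct

# Request bytes copied from the sendto() line in the strace output.
# strace prints non-printable bytes as octal escapes, so "\366" is 0xF6.
request = b"\0\0\0\2\0\4\0\2\0\0\1\366"

# Assumption: the payload is three big-endian signed 32-bit integers.
first, second, third = struct.unpack(">iii", request)

print(first, hex(second), third)  # prints: 2 0x40002 502
```

The 8 bytes returned by read are the same as the first 8 bytes of the 
request, which would fit a reply that echoes the first two fields back before 
sending any result data.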

A bit of history that may help. We have grown and run out of IP addresses on 
the main subnet, so I moved the hosts that travel out of the office to a new 
subnet. Before the new subnet, when the hosts were not connected to the LAN, 
we saw unanswered ARP requests for the LDAP server, which did not lead to a 
timeout, and id ran fast. With the new subnet the traffic goes to the default 
gateway and times out. If this happened just once it wouldn't be noticed, but 
it happens all the time.

> You can tweak the various timeout settings in nslcd.conf:
> https://arthurdejong.org/nss-pam-ldapd/nslcd.conf.5#timing_reconnect_options
> 
> The defaults should already be reasonably sane. Decreasing
> bind_timelimit (and perhaps timelimit) should ensure that errors are
> reported quicker.

Looking at the man page, 20 seconds seems correct for 2 LDAP servers. I will 
try to reduce bind_timelimit and timelimit, but I don't understand 
reconnect_retrytime. Should this be set?
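
The arithmetic behind the 20 seconds can be sketched as follows. This assumes 
that each configured server is tried in turn and that each attempt waits up 
to bind_timelimit seconds before failing. The 10-second value is my reading 
of the documented default, and the function name is hypothetical.

```python
def worst_case_connect_delay(num_servers: int, bind_timelimit: int) -> int:
    """Upper bound in seconds, assuming servers are tried one after another
    and every attempt runs into the full bind_timelimit."""
    return num_servers * bind_timelimit


# Two servers with an assumed default bind_timelimit of 10 seconds.
print(worst_case_connect_delay(2, 10))  # prints: 20

# The same two servers with bind_timelimit reduced to 3 seconds.
print(worst_case_connect_delay(2, 3))   # prints: 6
```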

Thank you and kind regards,
Duncan