Re: [nssldap] Reconnect logic in nss_ldap


From: Howard Chu <hyc [at] highlandsun.com>
To: Howard Wilkinson <howard [at] cohtech.com>
Cc: nssldap [at] padl.com
Subject: Re: [nssldap] Reconnect logic in nss_ldap
Date: Thu, 03 Dec 2009 12:32:03 -0800

Howard Wilkinson wrote:

There are a number of timeouts and a backoff policy in the current
implementation of nss_ldap (265) which I have reimplemented in my set of
patches that have been posted as part of bug number 412. These are driven by
timers measured in whole seconds. The relevant configuration items are:


* bind_policy - which can take 4 values: hard, hard_init, hard_open and soft.
  Currently all of the hard values are treated the same.

* nss_reconnect_tries - which defaults to 5 and limits the number of times a
  connection attempt will be made before the code gives up.

* nss_reconnect_sleeptime - which defaults to 4 and is the minimum amount of
  time the code will sleep between connection attempts. This is a number of
  seconds.

* nss_reconnect_maxsleeptime - which defaults to 64 and is the maximum amount
  of time the code should sleep between connection attempts. The actual sleep
  time starts at nss_reconnect_sleeptime and doubles each time the connections
  have failed until it exceeds the nss_reconnect_maxsleeptime. So setting this
  to 65 will allow the last sleep to be 128 seconds.

* nss_reconnect_maxconntries - which defaults to 2 and is misnamed. This is
  the maximum number of connection tries that will happen before the code
  starts to use the backoff algorithm. While the try count is below this
  number the code will retry immediately.
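
The doubling rule described above can be sketched as follows. This is only an
illustration, not the nss_ldap source: the function name sleep_schedule and
the "double while the last sleep is below the maximum" comparison are
assumptions, chosen because they reproduce the 65 -> 128 behaviour mentioned
in the list.

```python
def sleep_schedule(sleeptime=4, maxsleeptime=64, sleeps=7):
    """Return the successive sleep durations in seconds.

    Assumed rule: the first sleep is `sleeptime`; afterwards the value is
    doubled as long as the previous sleep was below `maxsleeptime`.
    """
    schedule = []
    backoff = 0
    for _ in range(sleeps):
        if backoff == 0:
            backoff = sleeptime          # first sleep uses the minimum
        elif backoff < maxsleeptime:
            backoff *= 2                 # may overshoot the maximum
        schedule.append(backoff)
    return schedule


print(sleep_schedule(4, 64))  # [4, 8, 16, 32, 64, 64, 64]
print(sleep_schedule(4, 65))  # [4, 8, 16, 32, 64, 128, 128]
```

With the default maximum of 64 the sequence happens to stop at 64, but any
maximum that is not itself on the doubling sequence is overshot, which is why
the limit does not behave as a true ceiling.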


With a soft bind_policy the code will give up immediately if a connection to
all of the servers provisioned fails. With a hard bind_policy the code will
enter into the retry loops.


The exponential backoff algorithm is clunky and probably should be
configurable as one of: exponential, linear, constant, progressive, where:

* exponential is as currently implemented and doubles the timeout on each
  loop.

* linear is where the timeout is equal to the number of tries times the
  initial sleep time.

* constant just sleeps the same sleep time every time around the loop.

* progressive uses a second increment which is added onto the last sleep
  time to produce the next one every time round the loop.
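
As a rough sketch of how the four policies differ, the function below lists
the delays each one would produce. The name backoff_delays and its parameters
are hypothetical, and "progressive" is read here as using a separate
increment parameter, which is an assumption about the proposal.

```python
def backoff_delays(policy, initial, count, increment=0):
    """Return `count` successive delays in seconds for the given policy."""
    delays = []
    for attempt in range(1, count + 1):
        if policy == "exponential":
            delay = initial * 2 ** (attempt - 1)      # doubles each loop
        elif policy == "linear":
            delay = initial * attempt                 # tries * initial
        elif policy == "constant":
            delay = initial                           # same every loop
        elif policy == "progressive":
            # assumed: a separate increment added to the previous delay
            delay = initial + increment * (attempt - 1)
        else:
            raise ValueError(f"unknown policy: {policy}")
        delays.append(delay)
    return delays


print(backoff_delays("exponential", 4, 5))               # [4, 8, 16, 32, 64]
print(backoff_delays("linear", 4, 5))                    # [4, 8, 12, 16, 20]
print(backoff_delays("constant", 4, 5))                  # [4, 4, 4, 4, 4]
print(backoff_delays("progressive", 4, 5, increment=3))  # [4, 7, 10, 13, 16]
```

Under this reading, progressive with an increment equal to the initial sleep
time gives the same delays as linear.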


The logic around the maxsleeptime should be changed so that it does what it
says and limits the backoff to this maximum.
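
One way to get that behaviour, shown only as a sketch with a hypothetical
function name, is to clamp each doubled value to the maximum instead of
testing the previous value before doubling:

```python
def capped_schedule(sleeptime=4, maxsleeptime=64, sleeps=7):
    """Doubling backoff in seconds that never exceeds `maxsleeptime`."""
    return [min(sleeptime * 2 ** i, maxsleeptime) for i in range(sleeps)]


print(capped_schedule(4, 64))  # [4, 8, 16, 32, 64, 64, 64]
print(capped_schedule(4, 65))  # [4, 8, 16, 32, 64, 65, 65]
```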


The maxconntries variable should be aliased to another name which is more
meaningful (suggestions welcomed) such as nss_reconnect_nosleeptries.


Also, with modern processors and communications networks it probably makes
sense to allow the sleep times to be expressed as fractions of a second. Given
that, I would propose changing the code to use nanosleep if available, usleep
if this is not available and sleep as a last resort.


Does anybody have any comments or additional suggestions to make around this
subject? I will probably implement patches in this area this weekend, so
responses ASAP please.

Too many nss_reconnect parameters, too much to document/remember. You might
try implementing something similar to the OpenLDAP syncrepl retry parameter,
which is a list of <interval> <number> pairs. This obviates the need to have
different algorithms embedded in the code.

E.g., "10 1 20 1 40 1 80 1 160 +" would be a simple exponential backoff, with
a maximum delay of 160 seconds repeated indefinitely.

"20 10" would be a constant 20 second retry, repeated 10 times, and then
stopping. And so on.
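
A minimal sketch of how such a retry list could be interpreted, assuming
whitespace-separated <interval> <number> pairs with "+" meaning "repeat
indefinitely". The function names are made up for illustration and this is
not OpenLDAP's parser.

```python
import itertools


def parse_retry(spec):
    """Parse '<interval> <number> ...' into (interval, count) pairs.

    A count of None stands for '+', i.e. repeat indefinitely.
    """
    tokens = spec.split()
    if not tokens or len(tokens) % 2 != 0:
        raise ValueError("retry spec must be <interval> <number> pairs")
    pairs = []
    for interval, number in zip(tokens[0::2], tokens[1::2]):
        count = None if number == "+" else int(number)
        pairs.append((int(interval), count))
    return pairs


def retry_delays(spec):
    """Yield the delay in seconds before each retry; ends when retries stop."""
    for interval, count in parse_retry(spec):
        if count is None:
            yield from itertools.repeat(interval)   # never finishes
        else:
            yield from itertools.repeat(interval, count)


backoff = retry_delays("10 1 20 1 40 1 80 1 160 +")
print(list(itertools.islice(backoff, 7)))  # [10, 20, 40, 80, 160, 160, 160]
print(list(retry_delays("20 10")))         # ten delays of 20, then stop
```

The retry loop then only needs to pull the next delay from the generator and
give up when it is exhausted, so no backoff algorithm is hard-coded.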

I don't believe sub-second resolution is useful here, and it certainly isn't
worth the additional effort in finding all of the system-dependent variations
on the theme.


--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
