lists.arthurdejong.org

[nssldap] Reconnect logic in nss_ldap




There are a number of timeouts and a backoff policy in the current 
implementation of nss_ldap (265), which I have reimplemented in my set of 
patches posted as part of bug number 412. These are driven by timers measured 
in whole seconds. The relevant configuration items are:
 

*       bind_policy - which can take four values: hard, hard_init, hard_open 
and soft. Currently all of the hard values are treated the same.
*       nss_reconnect_tries - which defaults to 5 and limits the number of 
times a connection attempt will be made before the code gives up.
*       nss_reconnect_sleeptime - which defaults to 4 and is the minimum amount 
of time, in seconds, the code will sleep between connection attempts.
*       nss_reconnect_maxsleeptime - which defaults to 64 and is the maximum 
amount of time the code should sleep between connection attempts. The actual 
sleep time starts at nss_reconnect_sleeptime and doubles each time the 
connections fail, until it exceeds nss_reconnect_maxsleeptime. So setting 
this to 65 will allow the last sleep to be 128 seconds.
*       nss_reconnect_maxconntries - which defaults to 2 and is misnamed. This 
is the maximum number of connection tries that will happen before the code 
starts to use the backoff algorithm. While the try count is below this number 
the code will retry immediately.
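
To make the overshoot concrete, here is a minimal sketch in C of the doubling step as I read the description above. It is not lifted from the nss_ldap source: the function names are hypothetical, and the strict `<` comparison is my assumption, chosen because it reproduces the "65 gives 128" behaviour described in the list.

```c
#include <assert.h>

/* One step of the doubling backoff (hypothetical helper, not nss_ldap code).
 * current is the previous sleep in seconds, or 0 before the first sleep. */
static int next_backoff(int current, int sleeptime, int maxsleeptime)
{
    assert(sleeptime > 0);
    if (current == 0)
        return sleeptime;          /* first sleep: nss_reconnect_sleeptime */
    if (current < maxsleeptime)    /* assumed comparison; the limit is only */
        return current * 2;        /* checked before doubling, so the       */
    return current;                /* result can overshoot it               */
}

/* Sleep time reached after `steps` consecutive failed rounds. */
static int backoff_after(int steps, int sleeptime, int maxsleeptime)
{
    int delay = 0;
    for (int i = 0; i < steps; i++)
        delay = next_backoff(delay, sleeptime, maxsleeptime);
    return delay;
}
```

With the defaults (4 and 64) the sequence is 4, 8, 16, 32, 64 and then stays at 64. With a maximum of 65 the value 64 is still below the limit, so it doubles once more to 128 and stays there.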

With a soft bind_policy the code will give up immediately if a connection to 
all of the provisioned servers fails. With a hard bind_policy the code will 
enter the retry loops.
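
The difference between the two policies can be sketched as a single loop. This is an illustration under stated assumptions, not the nss_ldap code: the struct, the callback types and the probe helpers are all hypothetical, and the exact point at which the real code counts a try or sleeps may differ.

```c
#include <assert.h>
#include <stdbool.h>

enum bind_policy { BIND_SOFT, BIND_HARD };   /* hard_init/hard_open act as hard */

struct reconnect_cfg {                       /* hypothetical container */
    enum bind_policy policy;
    int tries;          /* nss_reconnect_tries */
    int sleeptime;      /* nss_reconnect_sleeptime, seconds */
    int maxsleeptime;   /* nss_reconnect_maxsleeptime, seconds */
    int maxconntries;   /* nss_reconnect_maxconntries */
};

typedef bool (*connect_fn)(void *ctx);       /* one pass over all servers */
typedef void (*sleep_fn)(int seconds, void *ctx);

static bool connect_with_policy(const struct reconnect_cfg *cfg,
                                connect_fn try_all_servers,
                                sleep_fn do_sleep, void *ctx)
{
    int backoff = 0;
    assert(cfg != 0);
    for (int attempt = 1; attempt <= cfg->tries; attempt++) {
        if (try_all_servers(ctx))
            return true;
        if (cfg->policy == BIND_SOFT)
            return false;                    /* soft: give up at once */
        if (attempt >= cfg->tries)
            break;                           /* no sleep after the last try */
        if (attempt < cfg->maxconntries)
            continue;                        /* early tries: retry immediately */
        if (backoff == 0)
            backoff = cfg->sleeptime;
        else if (backoff < cfg->maxsleeptime)
            backoff *= 2;                    /* doubling, as described above */
        do_sleep(backoff, ctx);
    }
    return false;
}

/* Stand-ins that record what happened instead of touching the network. */
struct probe { int attempts; int slept; int succeed_on; };

static bool probe_connect(void *ctx)
{
    struct probe *p = ctx;
    p->attempts++;
    return p->succeed_on != 0 && p->attempts >= p->succeed_on;
}

static void probe_sleep(int seconds, void *ctx)
{
    struct probe *p = ctx;
    p->slept += seconds;
}
```

Under these assumptions and the default values, a hard policy against servers that never answer makes five attempts and sleeps 4, 8 and 16 seconds between the later ones, while a soft policy makes one attempt and never sleeps.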

The exponential backoff algorithm is clunky and probably should be configurable 
as one of: exponential, linear, constant, progressive, where:

*       exponential is as currently implemented and doubles the timeout on each 
loop.
*       linear is where the timeout is equal to the number of tries times the 
initial sleep time.
*       constant just sleeps for the same time every time around the loop.
*       progressive uses a second increment which is added onto the last sleep 
time to produce the next one every time round the loop.
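
As a rough illustration of the four proposed policies, the sketch below computes the uncapped sleep before a given retry. The enum, the function name and the `increment` parameter are hypothetical; none of them exist in nss_ldap today, and a real patch would also need to guard against overflow for large try counts.

```c
#include <assert.h>

enum backoff_policy {          /* hypothetical names for the proposal */
    BACKOFF_EXPONENTIAL,
    BACKOFF_LINEAR,
    BACKOFF_CONSTANT,
    BACKOFF_PROGRESSIVE
};

/* Uncapped sleep before retry number `attempt` (1-based).
 * initial   - the initial sleep time (nss_reconnect_sleeptime)
 * increment - extra value used only by the progressive policy (assumed) */
static long backoff_delay(enum backoff_policy policy, int attempt,
                          long initial, long increment)
{
    long delay = initial;

    assert(attempt >= 1);
    switch (policy) {
    case BACKOFF_EXPONENTIAL:
        for (int i = 1; i < attempt; i++)
            delay *= 2;                       /* doubles on each loop */
        return delay;
    case BACKOFF_LINEAR:
        return initial * attempt;             /* tries times initial sleep */
    case BACKOFF_CONSTANT:
        return initial;                       /* same sleep every time */
    case BACKOFF_PROGRESSIVE:
        return initial + increment * (attempt - 1);  /* last sleep + increment */
    }
    return initial;
}
```

For an initial sleep of 4 and an increment of 3, the fourth retry would wait 32 (exponential), 16 (linear), 4 (constant) or 13 (progressive) seconds.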

The logic around the maxsleeptime should be changed so that it does what it 
says and limits the backoff to this maximum.
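
One possible shape for that change, again as a hypothetical sketch rather than a patch: compute the doubled value and clamp it, so the result never goes above the configured maximum.

```c
#include <assert.h>

/* Doubling backoff that is clamped to maxsleeptime (hypothetical helper).
 * current is the previous sleep in seconds, or 0 before the first sleep. */
static int capped_backoff(int current, int sleeptime, int maxsleeptime)
{
    int next;

    assert(sleeptime > 0 && maxsleeptime > 0);
    if (current == 0)
        next = sleeptime;
    else if (current > maxsleeptime / 2)
        next = maxsleeptime;       /* doubling would pass the limit */
    else
        next = current * 2;

    return next > maxsleeptime ? maxsleeptime : next;
}
```

With a maximum of 65 this yields 4, 8, 16, 32, 64, 65, 65 and so on, instead of jumping to 128.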

The maxconntries variable should be aliased to another name which is more 
meaningful (suggestions welcomed) such as nss_reconnect_nosleeptries.

Also, with modern processors and communications networks it probably makes 
sense to allow the sleep times to be expressed as fractions of a second. Given 
that, I would propose changing the code to use nanosleep if available, usleep 
if it is not, and sleep as a last resort.
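
A sketch of what that fallback chain could look like, taking the sleep time in milliseconds. The HAVE_NANOSLEEP and HAVE_USLEEP macros are assumed to come from a configure-style check, and the function names are hypothetical. When neither macro is defined the code falls back to sleep() and rounds up to whole seconds.

```c
#include <assert.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>

/* Split a millisecond count into seconds and nanoseconds. */
static struct timespec ms_to_timespec(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    return ts;
}

/* Sleep for ms milliseconds using the best primitive available.
 * Returns 0 on success, -1 on an unexpected nanosleep error. */
static int sleep_ms(long ms)
{
    assert(ms >= 0);
#if defined(HAVE_NANOSLEEP)
    struct timespec req = ms_to_timespec(ms);
    struct timespec rem;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;                 /* interrupted: sleep the remainder */
    }
    return 0;
#elif defined(HAVE_USLEEP)
    unsigned int left = (unsigned int)(ms / 1000);
    while (left > 0)
        left = sleep(left);        /* whole seconds; usleep needs < 1000000 */
    if (ms % 1000 != 0)
        usleep((useconds_t)(ms % 1000) * 1000);
    return 0;
#else
    unsigned int left = (unsigned int)((ms + 999) / 1000);  /* round up */
    while (left > 0)
        left = sleep(left);
    return 0;
#endif
}
```

Keeping the unit conversion in its own helper means the configuration parser could accept fractional values regardless of which sleep primitive ends up being used.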

Does anybody have any comments or additional suggestions to make around this 
subject? I will probably implement patches in this area this weekend, so 
responses ASAP please.

Howard.

Coherent Technology Limited, 23 Northampton Square, Finsbury, London EC1V 0HL, 
United Kingdom
Telephone: +44 20 7690 7075 Mobile: +44 7980 639379
Company Email: coherent@cohtech.com Website: http://www.cohtech.com