RE: [nssldap] nss_ldap: reconnected to LDAP server ldap://localhost after 1 attempt
[
Date Prev][
Date Next]
[
Thread Prev][
Thread Next]
RE: [nssldap] nss_ldap: reconnected to LDAP server ldap://localhost after 1 attempt
- From: "Howard Wilkinson" <howard [at] cohtech.com>
- To: <nssldap [at] padl.com>
- Subject: RE: [nssldap] nss_ldap: reconnected to LDAP server ldap://localhost after 1 attempt
- Date: Mon, 23 Nov 2009 10:43:48 -0000
Brian,
some version numbers might help along with the OS details.
At a guess you have a resource exhaustion somewhere, have you restarted the box
to check the problem still exists. Are any of your filing systems filling up
(/tmp, /var/tmp, /var/log, ...)
You could look at my patches to the latest nss_ldap - they include a rewrite of
the reconnection logic which is more robust that that available in the current
mainstream.
Finally, you might want to check that you do not have bad data in the LDAP
environment.
Howard.
Coherent Technology Limited, 23 Northampton Square, Finsbury, London EC1V 0HL,
United Kingdom
Telephone: +44 20 7690 7075 Mobile: +44 7980 639379
Company Email: coherent@cohtech.com Website: http://www.cohtech.com
<http://www.cohtech.com/>
________________________________
From: owner-nssldap@padl.com on behalf of Brian J. Murrell
Sent: Sun 2009-11-22 16:07
To: nssldap@padl.com
Subject: [nssldap] nss_ldap: reconnected to LDAP server ldap://localhost after
1 attempt
On my "super-server" which is running both my LDAP server and my mail
server (so they communicate on loopback) as of late, my mail server has
been complaining as such:
Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
Nov 22 10:47:13 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:47:14 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
Nov 22 10:47:14 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:47:14 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
Nov 22 10:50:04 linux cyrus/lmtpunix[11749]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:50:05 linux cyrus/lmtpunix[11749]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
Nov 22 10:50:05 linux cyrus/lmtpunix[11749]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:50:05 linux cyrus/lmtpunix[11749]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnecting to LDAP
server...
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnected to LDAP
server ldap://localhost after 1 attempt
This is new behaviour on existing software (i.e. no version upgrades in
quite a long time -- certainly no in the time frame that this has
started).
I'm at a loss as to how to start debugging it though.
I have straced the lmtpd process and can see that it's closing the
connection to the ldap server and reopening it frequently:
getsockname(16, {sa_family=AF_INET, sin_port=htons(53883),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
getpeername(16, {sa_family=AF_INET, sin_port=htons(389),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
stat64("/etc/ldap.conf", {st_mode=S_IFREG|0644, st_size=9210, ...}) = 0
geteuid32() = 76
getsockname(16, {sa_family=AF_INET, sin_port=htons(53883),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
getpeername(16, {sa_family=AF_INET, sin_port=htons(389),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
write(16, "0\5\2\1\4B\0", 7) = 7
shutdown(16, 2 /* send and receive */) = 0
close(16) = 0
shutdown(-1, 2 /* send and receive */) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
...
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 16
setsockopt(16, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(16, SOL_TCP, TCP_NODELAY, [1], 4) = 0
fcntl64(16, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(16, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(16, {sa_family=AF_INET, sin_port=htons(389),
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in
progress)
poll([{fd=16, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 30000) = 1
getpeername(16, {sa_family=AF_INET, sin_port=htons(389),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
fcntl64(16, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(16, F_SETFL, O_RDWR) = 0
time(NULL) = 1258904423
time(NULL) = 1258904423
write(16, "0\f\2\1\1`\7\2\1\3\4\0\200\0", 14) = 14
gettimeofday({1258904423, 677944}, NULL) = 0
poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP, revents=POLLIN}], 1,
30000) = 1
read(16, "0\f\2\1\1a\7\n", 8) = 8
read(16, "\1\0\4\0\4\0", 6) = 6
and tshark confirms very short lived connections as well with the client
being the one to terminate the connection, not the LDAP server:
2.978355 127.0.0.1 -> 127.0.0.1 TCP 54633 > 389 [SYN] Seq=0 Win=32792
Len=0 MSS=16396 TSV=1617515729 TSER=0 WS=6
2.978558 127.0.0.1 -> 127.0.0.1 TCP 389 > 54633 [SYN, ACK] Seq=0 Ack=1
Win=32768 Len=0 MSS=16396 TSV=1617515729 TSER=1617515729 WS=6
2.978666 127.0.0.1 -> 127.0.0.1 TCP 54633 > 389 [ACK] Seq=1 Ack=1
Win=32832 Len=0 TSV=1617515729 TSER=1617515729
2.979937 127.0.0.1 -> 127.0.0.1 LDAP bindRequest(1) "<ROOT>" simple
2.980035 127.0.0.1 -> 127.0.0.1 TCP 389 > 54633 [ACK] Seq=1 Ack=15
Win=32768 Len=0 TSV=1617515729 TSER=1617515729
2.980356 127.0.0.1 -> 127.0.0.1 LDAP bindResponse(1) success
2.981019 127.0.0.1 -> 127.0.0.1 TCP 54633 > 389 [ACK] Seq=15 Ack=15
Win=32832 Len=0 TSV=1617515729 TSER=1617515729
2.981369 127.0.0.1 -> 127.0.0.1 LDAP searchRequest(2)
"dc=interlinx,dc=bc,dc=ca" wholeSubtree
...
2.983428 127.0.0.1 -> 127.0.0.1 LDAP searchResEntry(2)
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca"
2.984569 127.0.0.1 -> 127.0.0.1 LDAP searchResEntry(2)
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca"
2.999450 127.0.0.1 -> 127.0.0.1 TCP 54633 > 389 [ACK] Seq=187 Ack=229
Win=32832 Len=0 TSV=1617515734 TSER=1617515730
2.999556 127.0.0.1 -> 127.0.0.1 LDAP searchResEntry(2)
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca" | searchResEntry(2)
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca" | searchResEntry(2) ... [Packet
size limited during capture]
3.017154 127.0.0.1 -> 127.0.0.1 TCP 54623 > 389 [FIN, ACK] Seq=3064
Ack=46603 Win=49536 Len=0 TSV=1617515738 TSER=1617515016
3.018338 127.0.0.1 -> 127.0.0.1 TCP 389 > 54623 [FIN, ACK] Seq=46603
Ack=3065 Win=48896 Len=0 TSV=1617515739 TSER=1617515738
3.018437 127.0.0.1 -> 127.0.0.1 TCP 54623 > 389 [ACK] Seq=3065
Ack=46604 Win=49536 Len=0 TSV=1617515739 TSER=1617515739
3.038089 127.0.0.1 -> 127.0.0.1 TCP 54633 > 389 [ACK] Seq=187 Ack=2308
Win=36992 Len=0 TSV=1617515744 TSER=1617515734
So the question is, why would nss_ldap be behaving this way, all of a
sudden?
Cheers,
b.