lists.arthurdejong.org
RSS feed

[nssldap] nss_ldap: reconnected to LDAP server ldap://localhost after 1 attempt

[Date Prev][Date Next] [Thread Prev][Thread Next]

[nssldap] nss_ldap: reconnected to LDAP server ldap://localhost after 1 attempt



On my "super-server" which is running both my LDAP server and my mail
server (so they communicate on loopback) as of late, my mail server has
been complaining as such:

Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt
Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:47:01 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt
Nov 22 10:47:13 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:47:14 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt
Nov 22 10:47:14 linux cyrus/lmtpunix[11640]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:47:14 linux cyrus/lmtpunix[11640]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt
Nov 22 10:50:04 linux cyrus/lmtpunix[11749]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:50:05 linux cyrus/lmtpunix[11749]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt
Nov 22 10:50:05 linux cyrus/lmtpunix[11749]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:50:05 linux cyrus/lmtpunix[11749]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnecting to LDAP 
server...
Nov 22 10:51:07 linux cyrus/lmtpunix[11789]: nss_ldap: reconnected to LDAP 
server ldap://localhost after 1 attempt

This is new behaviour on existing software (i.e. no version upgrades in
quite a long time -- certainly no in the time frame that this has
started).

I'm at a loss as to how to start debugging it though.

I have straced the lmtpd process and can see that it's closing the
connection to the ldap server and reopening it frequently:

getsockname(16, {sa_family=AF_INET, sin_port=htons(53883), 
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
getpeername(16, {sa_family=AF_INET, sin_port=htons(389), 
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
stat64("/etc/ldap.conf", {st_mode=S_IFREG|0644, st_size=9210, ...}) = 0
geteuid32()                             = 76
getsockname(16, {sa_family=AF_INET, sin_port=htons(53883), 
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
getpeername(16, {sa_family=AF_INET, sin_port=htons(389), 
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
write(16, "0\5\2\1\4B\0", 7)            = 7
shutdown(16, 2 /* send and receive */)  = 0
close(16)                               = 0
shutdown(-1, 2 /* send and receive */)  = -1 EBADF (Bad file descriptor)
close(-1)                               = -1 EBADF (Bad file descriptor)
...
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 16
setsockopt(16, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(16, SOL_TCP, TCP_NODELAY, [1], 4) = 0
fcntl64(16, F_GETFL)                    = 0x2 (flags O_RDWR)
fcntl64(16, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(16, {sa_family=AF_INET, sin_port=htons(389), 
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in 
progress)
poll([{fd=16, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 30000) = 1
getpeername(16, {sa_family=AF_INET, sin_port=htons(389), 
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
fcntl64(16, F_GETFL)                    = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(16, F_SETFL, O_RDWR)            = 0
time(NULL)                              = 1258904423
time(NULL)                              = 1258904423
write(16, "0\f\2\1\1`\7\2\1\3\4\0\200\0", 14) = 14
gettimeofday({1258904423, 677944}, NULL) = 0
poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP, revents=POLLIN}], 1, 
30000) = 1
read(16, "0\f\2\1\1a\7\n", 8)           = 8
read(16, "\1\0\4\0\4\0", 6)             = 6

and tshark confirms very short lived connections as well with the client
being the one to terminate the connection, not the LDAP server:

  2.978355    127.0.0.1 -> 127.0.0.1    TCP 54633 > 389 [SYN] Seq=0 Win=32792 
Len=0 MSS=16396 TSV=1617515729 TSER=0 WS=6
  2.978558    127.0.0.1 -> 127.0.0.1    TCP 389 > 54633 [SYN, ACK] Seq=0 Ack=1 
Win=32768 Len=0 MSS=16396 TSV=1617515729 TSER=1617515729 WS=6
  2.978666    127.0.0.1 -> 127.0.0.1    TCP 54633 > 389 [ACK] Seq=1 Ack=1 
Win=32832 Len=0 TSV=1617515729 TSER=1617515729
  2.979937    127.0.0.1 -> 127.0.0.1    LDAP bindRequest(1) "<ROOT>" simple 
  2.980035    127.0.0.1 -> 127.0.0.1    TCP 389 > 54633 [ACK] Seq=1 Ack=15 
Win=32768 Len=0 TSV=1617515729 TSER=1617515729
  2.980356    127.0.0.1 -> 127.0.0.1    LDAP bindResponse(1) success 
  2.981019    127.0.0.1 -> 127.0.0.1    TCP 54633 > 389 [ACK] Seq=15 Ack=15 
Win=32832 Len=0 TSV=1617515729 TSER=1617515729
  2.981369    127.0.0.1 -> 127.0.0.1    LDAP searchRequest(2) 
"dc=interlinx,dc=bc,dc=ca" wholeSubtree 
...
  2.983428    127.0.0.1 -> 127.0.0.1    LDAP searchResEntry(2) 
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca" 
  2.984569    127.0.0.1 -> 127.0.0.1    LDAP searchResEntry(2) 
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca" 
  2.999450    127.0.0.1 -> 127.0.0.1    TCP 54633 > 389 [ACK] Seq=187 Ack=229 
Win=32832 Len=0 TSV=1617515734 TSER=1617515730
  2.999556    127.0.0.1 -> 127.0.0.1    LDAP searchResEntry(2) 
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca" | searchResEntry(2) 
"cn=xxx,ou=Group,dc=interlinx,dc=bc,dc=ca" | searchResEntry(2) ...  [Packet 
size limited during capture]
  3.017154    127.0.0.1 -> 127.0.0.1    TCP 54623 > 389 [FIN, ACK] Seq=3064 
Ack=46603 Win=49536 Len=0 TSV=1617515738 TSER=1617515016
  3.018338    127.0.0.1 -> 127.0.0.1    TCP 389 > 54623 [FIN, ACK] Seq=46603 
Ack=3065 Win=48896 Len=0 TSV=1617515739 TSER=1617515738
  3.018437    127.0.0.1 -> 127.0.0.1    TCP 54623 > 389 [ACK] Seq=3065 
Ack=46604 Win=49536 Len=0 TSV=1617515739 TSER=1617515739
  3.038089    127.0.0.1 -> 127.0.0.1    TCP 54633 > 389 [ACK] Seq=187 Ack=2308 
Win=36992 Len=0 TSV=1617515744 TSER=1617515734

So the question is, why would nss_ldap be behaving this way, all of a
sudden?

Cheers,
b.