lists.arthurdejong.org

Re: [nssldap] Solaris 10 update 5 - nss_ldap makes nscd dump core

Hi Howard,

On Jan 12, 2010, at 2:51 AM, Howard Wilkinson wrote:

Having got the Solaris stuff built, I am still seeing some stability issues. A major one seems to be that the new Solaris code does not support the NSS_TRYAGAIN/ERANGE interface feature that signals a buffer that is too small. The code seems to just get into a TRYAGAIN loop. Do you have any information that would suggest whether the interface change has preserved this behaviour?
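
For readers unfamiliar with the convention mentioned above, here is a minimal sketch of a TRYAGAIN/ERANGE retry loop. It does not use the real NSS or nss_ldap interfaces: the names demo_status, demo_lookup and demo_lookup_retry are hypothetical, and the backend is a stand-in that only copies a string. The point it illustrates is that the caller enlarges the buffer only when TRYAGAIN is paired with ERANGE, and gives up at a size cap, so a backend that never reports ERANGE cannot keep the caller looping.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes, loosely modelled on the NSS convention. */
typedef enum { DEMO_SUCCESS, DEMO_NOTFOUND, DEMO_TRYAGAIN, DEMO_UNAVAIL } demo_status;

/* Hypothetical backend: copies `value` into the caller's buffer, or
 * reports that the buffer is too small via TRYAGAIN plus ERANGE. */
static demo_status demo_lookup(const char *value, char *buf, size_t buflen,
                               int *errnop)
{
    size_t need = strlen(value) + 1;
    if (need > buflen) {
        *errnop = ERANGE;           /* caller should enlarge and retry */
        return DEMO_TRYAGAIN;
    }
    memcpy(buf, value, need);
    return DEMO_SUCCESS;
}

/* Caller side: grow the buffer only on TRYAGAIN + ERANGE, up to `max`.
 * Returns a heap buffer the caller must free, or NULL on failure. */
static char *demo_lookup_retry(const char *value, size_t initial, size_t max,
                               demo_status *status_out)
{
    size_t buflen = initial ? initial : 1;
    char *buf = malloc(buflen);
    if (buf == NULL) {
        *status_out = DEMO_UNAVAIL;
        return NULL;
    }
    for (;;) {
        int err = 0;
        demo_status st = demo_lookup(value, buf, buflen, &err);
        if (st == DEMO_SUCCESS) {
            *status_out = st;
            return buf;
        }
        /* Any other failure, or hitting the cap, ends the loop. */
        if (st != DEMO_TRYAGAIN || err != ERANGE || buflen >= max) {
            free(buf);
            *status_out = st;
            return NULL;
        }
        size_t next = (buflen > max / 2) ? max : buflen * 2;
        char *bigger = realloc(buf, next);
        if (bigger == NULL) {
            free(buf);
            *status_out = DEMO_UNAVAIL;
            return NULL;
        }
        buf = bigger;
        buflen = next;
    }
}
```

A caller would pass a small initial size and a sensible cap, then free the returned buffer when done.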

Not really, but I'm going to involve the engineer that did this work. He may be able to shed some light on your question. His name's Ted Cheng and I think he'll be joining the discussion soon. Who's our friend Bernhard at Sun, and can he help us? We'd like to crack this one too.


I would still like to see your code to compare what you have done with mine. It is likely that I have missed things.

Yeah... sorry about that <looks embarrassed>. I got pulled off into another direction and it dropped off my radar. I'll get that posted ASAP.

Take care,

-Matt

Matthew Hardin
Symas Corporation - The LDAP Guys
http://www.symas.com


Coherent Technology Limited, 23 Northampton Square, Finsbury, London EC1V 0HL, United Kingdom
Telephone: +44 20 3355 6467 Mobile: +44 7980 639379
Company Email: coherent@cohtech.com Website: http://www.cohtech.com

________________________________

From: Howard Wilkinson
Sent: Fri 2010-01-08 16:59
To: Howard Wilkinson; Matthew Hardin
Cc: Thomas Glanzmann; Luke Howard; nssldap@padl.com; Bernhard.Thalmayr@Sun.COM
Subject: RE: [nssldap] Solaris 10 update 5 - nss_ldap makes nscd dump core


I have another version of the patch to the bugzilla. This now uses heap allocation rather than stack based. The stack was failing when large groups were being used.
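
As an illustration of why heap allocation copes better with large groups, here is a minimal sketch; it is not taken from the patch. The helper demo_join_members is hypothetical. It sizes its scratch space from the actual member list at runtime and returns NULL if the allocation fails, whereas a fixed or very large stack buffer would overflow without any error being reported.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: joins `n` member names with commas into a
 * heap-allocated string sized to fit. The caller frees the result.
 * Returns NULL if the size overflows or the allocation fails. */
static char *demo_join_members(const char *const *members, size_t n)
{
    size_t total = 1;                               /* terminating NUL */
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(members[i]) + (i ? 1 : 0);  /* comma */
        if (len > SIZE_MAX - total)
            return NULL;                            /* size overflow */
        total += len;
    }

    char *out = malloc(total);
    if (out == NULL)
        return NULL;                                /* reportable failure */

    char *p = out;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(members[i]);
        if (i)
            *p++ = ',';
        memcpy(p, members[i], len);
        p += len;
    }
    *p = '\0';
    return out;
}
```

The trade-off is an extra malloc/free per call in exchange for a failure mode the caller can detect.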

I have also added a piece of conditional compilation to remove the workaround for a bug in the OpenLDAP library when not being compiled against OpenLDAP. This was making the Sun native library fail.

________________________________

From: owner-nssldap@padl.com on behalf of Howard Wilkinson
Sent: Wed 2010-01-06 16:34
To: Matthew Hardin
Cc: Thomas Glanzmann; Luke Howard; nssldap@padl.com; Bernhard.Thalmayr@Sun.COM
Subject: RE: [nssldap] Solaris 10 update 5 - nss_ldap makes nscd dump core



I have pushed another patch out to the bugzilla which now seems to work on Solaris 10 and on Linux. I have run some extensive tests on Linux and some limited tests on Solaris and it all seems to be functioning fine so far.

I have made the decision to use stack allocation for some buffer space that is needed when called from the Solaris nscd (this code is done at runtime so will also happen on Linux if the interface ever goes that way) and have made room for a stack checking function (which I have yet to work out how to do).

I may choose to replace this with heap allocated data but will wait for experience reports before deciding this.

Any experiences using this code would be most gratefully received.

Luke, any chance of getting this integrated into the mainstream release?

________________________________

From: Matthew Hardin [mhardin [at] symas.com]
Sent: Tue 2010-01-05 21:09
To: Howard Wilkinson
Cc: Thomas Glanzmann; Luke Howard; nssldap@padl.com; Bernhard.Thalmayr@Sun.COM
Subject: Re: [nssldap] Solaris 10 update 5 - nss_ldap makes nscd dump core




On Jan 5, 2010, at 2:05 AM, Howard Wilkinson wrote:

Matthew,

I have a partially working implementation. NSCD is calling into
nss_ldap and getting the results back but is not returning the
result back to the getent call. So any pointers would be gratefully
received. I am trying to get this working for a deployment in the
next few weeks so if you have patches I can try that would be very
helpful.

We're going to post the source code today or tomorrow without
encumbrances and you are free to use it as-is or extract and use
whatever information you find useful (well, attribution would be
nice). I'll follow up with a download url when the code is available.


Do you have any idea which hat dropping makes NSCD stop working?
Reading the OpenSolaris code, it does look as though there are a lot
of dependencies on a stateful interface in the NSS2 facilities.

Unfortunately not. I do know that merely updating the time stamp on
the nsswitch.conf file will cause NSCD to start working again (until
it gets tired again). We are as puzzled as you are.

Cheers,

-Matt


Howard.


________________________________

From: Matthew Hardin [mhardin [at] symas.com]
Sent: Mon 2010-01-04 18:00
To: Howard Wilkinson
Cc: Thomas Glanzmann; lukeh@padl.com; nssldap@padl.com; Bernhard.Thalmayr@Sun.COM
Subject: Re: [nssldap] Solaris 10 update 5 - nss_ldap makes nscd dump core



We've developed the SPARKS changes needed for Sol10u5 and later (this
is why nscd was dumping core), but nscd on Sol10u5 and later seems
very fragile and stops working at the drop of a hat. We've been
sitting on the code until we've worked this out, but that's been going
very slowly. We would be happy to share what we have if anyone is
interested.

-Matt

On Jan 4, 2010, at 9:52 AM, Howard Wilkinson wrote:

You will need to apply this after all of my other patches. See 
http://bugzilla.padl.com/show_bug.cgi?id=412

This is very much a work in progress - I am focussing on getting
getent passwd working with nscd running. If I can crack that then
the other changes are 'obvious'.

This has been compiled on both Sparc and x86 Solaris 10, but only
tested on Sparc so far.

Let me know how you get on!


________________________________

From: Thomas Glanzmann [thomas [at] glanzmann.de]
Sent: Mon 2010-01-04 16:04
To: Howard Wilkinson
Subject: Re: [nssldap] Solaris 10 update 5 - nss_ldap makes nscd dump core



Hello Howard,

Any help that you, or anybody else can give to fix this problem will be gratefully received. This has been a problem for over a week now.

Could you please send me your build instructions and patches? I'll be
happy to help track down the getent bug. I'm also interested in
providing a backport.

     Thomas


<nss_ldap-265-solarisfixes.patch>