[SERVER-10125] stepdown.js failing on buildbot-special V2.4 Linux 64-bit Subscription SUSE 11 Created: 08/Jul/13 Updated: 11/Jul/16 Resolved: 24/Jul/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 2.5.2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matt Kangas | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | buildbot |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | buildbot-special: V2.4 Linux 64-bit Subscription SUSE 11 build #144 |
| Operating System: | ALL |
| Participants: |
| Description |
|
http://buildbot-special.10gen.com/builders/V2.4%20Linux%2064-bit%20Subscription%20SUSE%2011/builds/144
Preceding this are repeated authentication failures. Possibly the cause?
|
| Comments |
| Comment by auto [ 18/Jul/13 ] |
|
Author: Randolph Tan (renctan) <randolph@10gen.com>
Message: replace jsTest.attempt to assert.soon |
| Comment by Randolph Tan [ 16/Jul/13 ] |
|
If my diagnosis is correct, I believe this is a separate issue. |
| Comment by Matt Kangas [ 15/Jul/13 ] |
|
stepdown2.js failed on Nightly Linux 64-bit SSL SUSE 11, build #497 (July 15). Is this a different issue?
http://buildbot-special.10gen.com/builders/Nightly%20Linux%2064-bit%20SSL%20SUSE%2011/builds/497/steps/test_3/logs/stdio
m31001 says it "will terminate after current cmd ends", but it never actually exits? |
| Comment by Spencer Brody (Inactive) [ 11/Jul/13 ] |
|
I have also not been able to reproduce this, and it has passed in subsequent runs on the same builder... |
| Comment by Spencer Brody (Inactive) [ 11/Jul/13 ] |
|
This failure is confusing me. It looks a lot like a timing bug; notice how
prints moments before
but the weird thing is that the interval from the time the old primary steps down:
to the time the test exits after timing out waiting for the new primary:
looks to be just 36 seconds, even though the test says that it waited 60 seconds before aborting. I have no idea why the timing of the timeout waiting for the primary seems so far off... |
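To make the timing question above concrete, here is a minimal sketch in plain JavaScript of a polling loop that measures its timeout against a clock and reports the measured elapsed time. It is not the mongo shell's `jsTest.attempt` or `assert.soon`; the names `waitUntil` and `makeFakeClock` and all parameters are hypothetical and exist only for illustration. The clock and sleep functions are passed in so the loop can be exercised without real waiting.

```javascript
// Hypothetical helper for illustration only; NOT the mongo shell implementation
// of jsTest.attempt or assert.soon.
// Polls `condition` until it returns true or `timeoutMs` has passed on the
// supplied clock. `now` and `sleep` are injected so the loop is testable.
function waitUntil(condition, timeoutMs, intervalMs, now, sleep) {
    const start = now();
    let tries = 0;
    while (true) {
        tries += 1;
        if (condition()) {
            return { ok: true, tries: tries, elapsedMs: now() - start };
        }
        const elapsedMs = now() - start;
        if (elapsedMs >= timeoutMs) {
            // Report the measured elapsed time rather than tries * intervalMs,
            // so the reported wait matches what the clock actually shows.
            return { ok: false, tries: tries, elapsedMs: elapsedMs };
        }
        sleep(intervalMs);
    }
}

// Simulated clock (an assumption for the demo): every sleep advances time by
// the requested interval plus 100 ms of overhead.
function makeFakeClock() {
    let t = 0;
    return {
        now: function () { return t; },
        sleep: function (ms) { t += ms + 100; }
    };
}

const clock = makeFakeClock();
const result = waitUntil(function () { return false; }, 60000, 500, clock.now, clock.sleep);
console.log(result.ok, result.tries, result.elapsedMs);
```

In this simulation a wait derived from the retry count (`tries * intervalMs`) would differ from the measured `elapsedMs`, which is one general way a reported timeout and the log timestamps can disagree. Whether that applies to this failure is not established by the ticket.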
| Comment by Matt Kangas [ 09/Jul/13 ] |
|
Spencer, does this look like a new auth-related issue? Or just a timeout that we should quarantine? Note that it's on V2.4. |