[SERVER-10383] read_pref_cmd.js failed on Windows 64-bit Created: 31/Jul/13 Updated: 11/Jul/16 Resolved: 25/Oct/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 2.5.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matt Kangas | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | buildbot | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
buildbot Windows 64-bit Build #5626 |
||
| Issue Links: |
|
||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
Ren, what happened here? Windows 64-bit Build #5626 July 30 rev f6a77ea http://buildbot.mongodb.org/builders/Windows%2064-bit/builds/5626/steps/test_11/logs/stdio
It looks like exactly the same failure occurred on July 22 Windows 64-bit Build #5622 |
| Comments |
| Comment by auto [ 25/Oct/13 ] |
|
Author: Randolph Tan (renctan, randolph@10gen.com) Message: |
| Comment by Matt Kangas [ 22/Aug/13 ] |
|
test disabled in master as part of |
| Comment by Matt Kangas [ 21/Aug/13 ] |
|
Ren, what's the status on this? Resolved? Unresolved but diagnosed as a test issue? Other? |
| Comment by Randolph Tan [ 15/Aug/13 ] |
|
Just realized one thing. If the RSM got socket exceptions because the socket was closed after the reconfig, then that means the RSM had not yet refreshed itself to the config after the replSetReconfig command. Therefore, if we use P.S. This test is currently disabled temporarily. |
| Comment by Randolph Tan [ 07/Aug/13 ] |
|
Was able to reproduce this more reliably with a debug build on Linux and by shortening the ReplicaSetMonitorWatcher interval to 2 sec. This is what happened in the test: 1. ShardingTest sets up the sharded cluster with a replica set member, then creates a connection to the replica set for internal use. This activates the ReplicaSetMonitorWatcher. Conclusion: |
| Comment by Randolph Tan [ 01/Aug/13 ] |
|
All the errors follow the same pattern: the connection from the monitor (in the shell) to the primary was closed for some unknown reason: [ReplicaSetMonitorWatcher] Socket recv() errno:10053 An established connection was aborted by the software in your host machine. 10.28.48.89:31100 This makes the monitor label the primary as not ok to use. The next test expects the command to be routed to the primary and fails because the monitor thinks the primary is down. No part of the test kills any server or does any replset reconfig (which closes all connections) during the test, so I can't think of why it would get this error. Not sure if it is pure coincidence that this happens only on one machine and always at the same part of the test. |
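The failure pattern described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyMonitor` and its methods are hypothetical names invented for illustration and are not the actual ReplicaSetMonitor code or API; the secondary's address is also made up. It shows how a single socket error on the monitor's own connection is enough to make a primary-only selection fail until a later refresh succeeds.

```javascript
// Toy model only: ToyMonitor is a hypothetical class, not MongoDB's ReplicaSetMonitor.
class ToyMonitor {
  constructor(hosts) {
    // hosts: [{ addr, isPrimary }]; every node starts out marked as usable.
    this.nodes = hosts.map((h) => ({ addr: h.addr, isPrimary: h.isPrimary, ok: true }));
  }

  // The monitor's connection to a host failed (e.g. a recv() error):
  // mark that host as not ok to use.
  onSocketError(addr) {
    const node = this.nodes.find((n) => n.addr === addr);
    if (node) node.ok = false;
  }

  // A later periodic refresh reached the host again: mark it usable.
  onRefreshSuccess(addr) {
    const node = this.nodes.find((n) => n.addr === addr);
    if (node) node.ok = true;
  }

  // Primary-only selection: returns the primary's address, or null if the
  // monitor currently believes the primary is down.
  selectPrimary() {
    const primary = this.nodes.find((n) => n.isPrimary && n.ok);
    return primary ? primary.addr : null;
  }
}

const monitor = new ToyMonitor([
  { addr: "10.28.48.89:31100", isPrimary: true },
  { addr: "secondary.example:31101", isPrimary: false }, // hypothetical secondary
]);

const beforeError = monitor.selectPrimary(); // primary is selectable
monitor.onSocketError("10.28.48.89:31100");
const afterError = monitor.selectPrimary(); // null: a primary-only command cannot be routed
monitor.onRefreshSuccess("10.28.48.89:31100");
const afterRefresh = monitor.selectPrimary(); // primary is selectable again
```

In this model the server itself never went down; only the monitor's view of it changed, which matches the observation that the test fails at the step that expects routing to the primary.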
| Comment by Matt Kangas [ 01/Aug/13 ] |
|
Again: Windows 64-bit Build #5628 July 31 http://buildbot.mongodb.org/builders/Windows%2064-bit/builds/5628/steps/test_11/logs/stdio |
| Comment by Matt Kangas [ 31/Jul/13 ] |
|
Other recent examples on this builder:
Windows 64-bit Build #5618 July 23 rev a36c34b
Windows 64-bit Build #5613 July 19
Windows 64-bit Build #5612 July 18
Windows 64-bit Build #5604 July 10 rev 8512fdd
I don't see any earlier failures on the Windows 64-bit builder within the last 100 builds (since May 17). |