[SERVER-14690] no_chaining.js is flakey Created: 06/Jun/14 Updated: 29/Jul/14 Resolved: 25/Jul/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 2.7.4 |
| Type: | Improvement | Priority: | Critical - P2 |
| Reporter: | Mark Benvenuto | Assignee: | Matt Dannenberg |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Linked BF Score: | 0 | ||||
| Description |
|
The problem was a race in the forceSync() assert.soon: the node did not have a chance to begin syncing before it was instructed to change sync source again. The failure was only observed on linux32 and linux32-debug, whose builders are slower. The fix checks for replication progress in an assert.soon nested inside the original assert.soon before retrying.
Here are the failures. It seems to be some sort of startup failure, and it is likely 32-bit specific given this set of failures.
http://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_43d2ae25b1872273cb227ada251315cbaf817534_14_06_06_15_55_07_replicasets_linux_32
http://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_debug_adcf6601f49b8afbe5b0c2b23d0f83ceeeca1fa1_14_06_05_20_14_08_replicasets_linux_32_debug |
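The nested-wait idea described above can be sketched roughly as follows. This is not the committed patch, only a self-contained simulation under stated assumptions: `soon` is a simplified, poll-count-based stand-in for the shell's assert.soon, `makeFakeSecondary` is a hypothetical fake node whose sync wait restarts every time syncFrom is called (modelling the race), and the poll limits are arbitrary.

```javascript
// Simplified, poll-count-based stand-in for the mongo shell's assert.soon
// (assumption: the real helper polls against a timeout, not a poll count).
function soon(check, msg, maxPolls) {
    for (let i = 0; i < maxPolls; i++) {
        if (check()) {
            return;
        }
    }
    throw new Error("assert.soon failed: " + msg);
}

// Hypothetical fake secondary. After syncFrom() it needs `lag` ticks before
// it starts applying ops; calling syncFrom() again restarts that wait.
function makeFakeSecondary(lag) {
    let source = null;
    let countdown = 0;
    let applied = 0;
    return {
        syncFrom: function(target) { source = target; countdown = lag; },
        tick: function() {
            if (source === null) { return; }
            if (countdown > 0) { countdown--; } else { applied++; }
        },
        appliedOps: function() { return applied; },
        syncingFrom: function() { return countdown === 0 ? source : null; }
    };
}

// The outer loop retries the sync-source change. The inner loop waits for
// replication progress before the outer loop is allowed to retry.
function forceSync(node, target) {
    let syncFromCalls = 0;
    soon(function() {
        const before = node.appliedOps();
        node.syncFrom(target);
        syncFromCalls++;
        try {
            soon(function() {
                node.tick();
                return node.appliedOps() > before;
            }, "no replication progress", 10);
        } catch (e) {
            return false; // no progress yet: let the outer loop retry
        }
        return node.syncingFrom() === target;
    }, "failed to change sync source", 5);
    return syncFromCalls;
}

const node = makeFakeSecondary(3);
console.log(forceSync(node, "host1:27017"), node.syncingFrom());
```

In this model, a retry loop that re-issued syncFrom on every poll would keep resetting the node's wait and never observe progress on a slow node. Waiting for progress first gives the node time to start syncing before any retry.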
| Comments |
| Comment by Githook User [ 25/Jul/14 ] |
|
Author: matt dannenberg (dannenberg) <matt.dannenberg@10gen.com> Message: |
| Comment by Spencer Brody (Inactive) [ 25/Jul/14 ] |
| Comment by Spencer Brody (Inactive) [ 24/Jul/14 ] |
| Comment by Spencer Brody (Inactive) [ 23/Jul/14 ] |
|
Again: |
| Comment by Randolph Tan [ 09/Jul/14 ] |
|
https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_a61874a2de78a6f3b928b73c16f2b46f60ba2c57_14_07_09_17_49_07_replicasets_linux_32 |
| Comment by Matt Dannenberg [ 23/Jun/14 ] |
|
Haven't solved this one, but I did notice a difference in the result object from the syncFrom command. In the failure cases it has a prevSyncTarget field containing the address of the secondary we would like to target; in the successful cases the first few results do not contain this field. Also, in the failures I see attempts to connect to the secondary before the printout about forcing the node to do so. I do not think the node should be trying to connect to the secondary before we tell it to, since that is the point of noChainingAllowed. |
| Comment by David Storch [ 18/Jun/14 ] |
| Comment by Shaun Verch [ 11/Jun/14 ] |
| Comment by Mark Benvenuto [ 09/Jun/14 ] |
|
http://buildlogs.mongodb.org/mci_0.9_linux-32/builds/50392/test/replicasets_0/no_chaining.js |