[SERVER-14690] no_chaining.js is flakey Created: 06/Jun/14  Updated: 29/Jul/14  Resolved: 25/Jul/14

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 2.7.4

Type: Improvement Priority: Critical - P2
Reporter: Mark Benvenuto Assignee: Matt Dannenberg
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Participants:
Linked BF Score: 0

 Description   

The problem was a race in the forceSync() assert.soon: the node would not have a chance to begin syncing before being instructed to change sync source again. The failure was only observed on linux32 and linux32-debug, since they have slower builders. The fix checks for replication progress in an assert.soon nested inside the original assert.soon before retrying.
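
As a rough illustration of the nested-polling idea described above (not the code from the actual commit), the sketch below runs in plain JavaScript with no server. Everything in it is hypothetical: `soon` is a simplified stand-in for the shell's assert.soon that counts attempts instead of waiting on a timeout, and `makeFakeNode` models a node that needs a few polls after a sync source change before it starts replicating. The sketch also assumes that a failed inner wait is caught so the outer loop can retry.

```javascript
// Hypothetical stand-in for assert.soon: poll `check` until it returns true
// or `maxTries` attempts are used up, then throw.
function soon(check, msg, maxTries) {
    for (let i = 0; i < maxTries; i++) {
        if (check()) {
            return;
        }
    }
    throw new Error("soon failed: " + msg);
}

// Hypothetical fake node. After a sync source change it needs
// `ticksToStart` polls before its optime advances; every new
// sync source change resets that countdown (this models the race).
function makeFakeNode(ticksToStart) {
    let target = null;
    let ticks = 0;
    let optime = 0;
    let calls = 0;
    return {
        syncFrom: function (host) { target = host; ticks = 0; calls++; },
        tick: function () {
            if (target !== null && ++ticks >= ticksToStart) {
                optime++;
            }
        },
        optime: function () { return optime; },
        syncFromCalls: function () { return calls; }
    };
}

// Outer loop: change the sync source. Inner loop: wait for replication
// progress before the outer loop is allowed to change it again.
function forceSync(node, host, innerTries) {
    soon(function () {
        const before = node.optime();
        node.syncFrom(host);
        try {
            soon(function () {
                node.tick();
                return node.optime() > before;
            }, "no replication progress yet", innerTries);
        } catch (e) {
            return false; // no progress observed, let the outer loop retry
        }
        return true;
    }, "could not force sync from " + host, 5);
}

const slowNode = makeFakeNode(3);
forceSync(slowNode, "secondary:27017", 10);
console.log(slowNode.syncFromCalls()); // prints 1
```

With `innerTries` set to 1 the same slow node never makes progress, because each retry resets it before it can start syncing. That mirrors the failure mode reported on the slower builders.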

Here are the failures. It seems to be some sort of startup failure. It is likely 32-bit specific given this set of failures.

http://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_43d2ae25b1872273cb227ada251315cbaf817534_14_06_06_15_55_07_replicasets_linux_32
http://buildlogs.mongodb.org/build/5391f913d2a60f4011000429/test/5391ff19d2a60f4566000838/

http://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_debug_adcf6601f49b8afbe5b0c2b23d0f83ceeeca1fa1_14_06_05_20_14_08_replicasets_linux_32_debug
http://buildlogs.mongodb.org/build/5390dcb1d2a60f3104000245/test/5390e483d2a60f3481000af7/



 Comments   
Comment by Githook User [ 25/Jul/14 ]

Author: matt dannenberg (dannenberg) <matt.dannenberg@10gen.com>

Message: SERVER-14690 fix no_chaining.js flakiness
Branch: master
https://github.com/mongodb/mongo/commit/a09163ade822ce97e9bbf570bccea9781c37603d

Comment by Spencer Brody (Inactive) [ 25/Jul/14 ]

https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_debug_e622765f1105b849c746acfce0c2ccc2eaed92e7_14_07_25_05_50_10_replicasets_linux_32_debug

Comment by Spencer Brody (Inactive) [ 24/Jul/14 ]

https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_4f01ffd9cbbb0dcb4ee2f9167ad2337178259ba9_14_07_23_22_52_07_replicasets_linux_32

Comment by Spencer Brody (Inactive) [ 23/Jul/14 ]

Again:
https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_c8f3b90326a4a958ccd1ca90a2b4e03c33476e7f_14_07_23_18_25_06_replicasets_linux_32
http://buildlogs.mongodb.org/mci_0.9_linux-32/builds/57809/test/replicasets_0/no_chaining.js

Comment by Randolph Tan [ 09/Jul/14 ]

https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_a61874a2de78a6f3b928b73c16f2b46f60ba2c57_14_07_09_17_49_07_replicasets_linux_32
http://buildlogs.mongodb.org/mci_0.9_linux-32/builds/55441/test/replicasets_0/no_chaining.js

Comment by Matt Dannenberg [ 23/Jun/14 ]

Haven't solved this one, but I did notice that in the failure cases the result object from the syncFrom command has a prevSyncTarget field containing the address of the secondary we would like to target, whereas in the successful runs the first few results do not contain this field.

Also, in the failures I see attempts to connect to the secondary before the printout about forcing the node to do so. I do not think the node should be trying to connect to the secondary before we tell it to (since that's the point of noChainingAllowed).

Comment by David Storch [ 18/Jun/14 ]

https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_3947048e1f5fc567022af15c050d1b80160b25ce_14_06_17_21_59_06_replicasets_linux_32

Comment by Shaun Verch [ 11/Jun/14 ]

https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_3e39c96ebbd90ebeb91d46f9dace6988a0152763_14_06_11_15_58_05_replicasets_linux_32

Comment by Mark Benvenuto [ 09/Jun/14 ]

http://buildlogs.mongodb.org/mci_0.9_linux-32/builds/50392/test/replicasets_0/no_chaining.js
http://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_baf952e06f3288dc9bd1e5dd7b2fb683195feff2_14_06_09_20_37_06_replicasets_linux_32

Generated at Thu Feb 08 03:35:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.