[SERVER-21282] server6733.js failing on Linux noPassthrough Created: 30/Oct/15  Updated: 17/Nov/15  Resolved: 05/Nov/15

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 3.2.0-rc3

Type: Bug Priority: Major - P3
Reporter: Geert Bosch Assignee: Matt Dannenberg
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl B (10/30/15), Repl C (11/20/15)
Participants:

 Description   

https://evergreen.mongodb.com/task/mongodb_mongo_master_linux_64_noPassthrough_1ba041ab5d54d361fd04c0957493354caa99097b_15_10_29_23_01_52



 Comments   
Comment by Matt Dannenberg [ 05/Nov/15 ]

Fixed by removing the test. It will be replaced with a simple unittest once the DataReplicator exists.

Comment by Githook User [ 05/Nov/15 ]

Author:

{u'username': u'dannenberg', u'name': u'matt dannenberg', u'email': u'matt.dannenberg@10gen.com'}

Message: SERVER-21282 remove noPassthrough/server6733.js as it is an unstable and roundabout way to test socket timeout and sync source reevaluation
Branch: master
https://github.com/mongodb/mongo/commit/ec9bc201e2a309aadecbab27a868c2c05c24f18c

Comment by Spencer Brody (Inactive) [ 05/Nov/15 ]

Still failing quite regularly. Here's a random recent failure: https://evergreen.mongodb.com/task/mongodb_mongo_master_linux_64_debug_noPassthrough_d70512ab5f655868b409bcbf92d1e68af80e68d8_15_11_04_23_49_12

Comment by Spencer Brody (Inactive) [ 04/Nov/15 ]

Still failing
https://evergreen.mongodb.com/task/mongodb_mongo_master_linux_64_noPassthrough_e29401fe31d95322d7b2ab58dfc94ce04f27d5a1_15_11_04_15_25_20

	
[js_test:server6733] 2015-11-04T16:14:11.243+0000 Force A to sync from B
[js_test:server6733] 2015-11-04T16:14:11.269+0000 d20762| 2015-11-04T16:14:11.244+0000 I INDEX    [repl writer worker 0] allocating new ns file /data/db/job3/mongorunner/testSet-2/foo.ns, filling with zeroes...
[js_test:server6733] 2015-11-04T16:14:11.270+0000 d20761| 2015-11-04T16:14:11.257+0000 I REPL     [ReplicationExecutor] syncing from: ip-10-65-72-184:20762 by request
[js_test:server6733] 2015-11-04T16:14:11.295+0000 d20761| 2015-11-04T16:14:11.280+0000 I INDEX    [repl writer worker 0] allocating new ns file /data/db/job3/mongorunner/testSet-1/foo.ns, filling with zeroes...
[js_test:server6733] 2015-11-04T16:14:11.297+0000 d20762| 2015-11-04T16:14:11.273+0000 I NETWORK  [initandlisten] connection accepted from 10.65.72.184:52469 #6 (4 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.359+0000 d20762| 2015-11-04T16:14:11.356+0000 I STORAGE  [FileAllocator] allocating new datafile /data/db/job3/mongorunner/testSet-2/foo.0, filling with zeroes...
[js_test:server6733] 2015-11-04T16:14:11.418+0000 d20762| 2015-11-04T16:14:11.393+0000 I STORAGE  [FileAllocator] done allocating datafile /data/db/job3/mongorunner/testSet-2/foo.0, size: 16MB,  took 0.036 secs
[js_test:server6733] 2015-11-04T16:14:11.463+0000 d20761| 2015-11-04T16:14:11.416+0000 I STORAGE  [FileAllocator] allocating new datafile /data/db/job3/mongorunner/testSet-1/foo.0, filling with zeroes...
[js_test:server6733] 2015-11-04T16:14:11.463+0000 d20761| 2015-11-04T16:14:11.418+0000 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to ip-10-65-72-184:20762
[js_test:server6733] 2015-11-04T16:14:11.464+0000 d20761| 2015-11-04T16:14:11.419+0000 I ASIO     [NetworkInterfaceASIO] Successfully connected to ip-10-65-72-184:20762
[js_test:server6733] 2015-11-04T16:14:11.464+0000 d20761| 2015-11-04T16:14:11.420+0000 I REPL     [rsBackgroundSync] starting rollback: RemoteOplogStale we are ahead of the sync source, will try to roll back
[js_test:server6733] 2015-11-04T16:14:11.464+0000 d20762| 2015-11-04T16:14:11.418+0000 I NETWORK  [conn6] end connection 10.65.72.184:52469 (3 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.465+0000 d20762| 2015-11-04T16:14:11.419+0000 I NETWORK  [initandlisten] connection accepted from 10.65.72.184:52470 #7 (4 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.465+0000 d20762| 2015-11-04T16:14:11.419+0000 I NETWORK  [initandlisten] connection accepted from 10.65.72.184:52471 #8 (5 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.465+0000 d20760| 2015-11-04T16:14:11.418+0000 I NETWORK  [conn12] end connection 10.65.72.184:51461 (6 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.466+0000 d20761| 2015-11-04T16:14:11.447+0000 I STORAGE  [FileAllocator] done allocating datafile /data/db/job3/mongorunner/testSet-1/foo.0, size: 16MB,  took 0.03 secs
[js_test:server6733] 2015-11-04T16:14:11.475+0000 d20761| 2015-11-04T16:14:11.474+0000 F REPL     [rsBackgroundSync] need to rollback, but in inconsistent state
[js_test:server6733] 2015-11-04T16:14:11.475+0000 d20761| 2015-11-04T16:14:11.474+0000 I -        [rsBackgroundSync] Fatal assertion 28723 UnrecoverableRollbackError need to rollback, but in inconsistent state. minvalid: (term: 1, timestamp: Nov  4 16:14:11:1) > our last optime: (term: 1, timestamp: Nov  4 16:14:08:2) @ 18750
[js_test:server6733] 2015-11-04T16:14:11.476+0000 d20761| 2015-11-04T16:14:11.474+0000 I -        [rsBackgroundSync]
[js_test:server6733] 2015-11-04T16:14:11.476+0000 d20761|
[js_test:server6733] 2015-11-04T16:14:11.476+0000 d20761| ***aborting after fassert() failure
[js_test:server6733] 2015-11-04T16:14:11.476+0000 d20761|
[js_test:server6733] 2015-11-04T16:14:11.477+0000 d20761|
[js_test:server6733] 2015-11-04T16:14:11.495+0000 d20760| 2015-11-04T16:14:11.495+0000 I NETWORK  [conn13] end connection 10.65.72.184:51462 (5 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.496+0000 d20762| 2015-11-04T16:14:11.495+0000 I NETWORK  [conn5] end connection 10.65.72.184:52454 (4 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.496+0000 d20762| 2015-11-04T16:14:11.495+0000 I NETWORK  [conn8] end connection 10.65.72.184:52471 (3 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.496+0000 d20762| 2015-11-04T16:14:11.495+0000 I NETWORK  [conn7] end connection 10.65.72.184:52470 (2 connections now open)
[js_test:server6733] 2015-11-04T16:14:11.497+0000 d20760| 2015-11-04T16:14:11.497+0000 I NETWORK  [conn2] end connection 10.65.72.184:51443 (4 connections now open)
[js_test:server6733] 2015-11-04T16:14:12.119+0000 d20762| 2015-11-04T16:14:12.119+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to ip-10-65-72-184:20761; HostUnreachable End of file
[js_test:server6733] 2015-11-04T16:14:12.120+0000 d20762| 2015-11-04T16:14:12.119+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to ip-10-65-72-184:20761; HostUnreachable End of file
[js_test:server6733] 2015-11-04T16:14:12.125+0000 d20762| 2015-11-04T16:14:12.125+0000 I ASIO     [NetworkInterfaceASIO] Failed to connect to ip-10-65-72-184:20761 - HostUnreachable Connection refused
[js_test:server6733] 2015-11-04T16:14:12.126+0000 d20762| 2015-11-04T16:14:12.125+0000 I REPL     [ReplicationExecutor] Error in heartbeat request to ip-10-65-72-184:20761; HostUnreachable Connection refused
[js_test:server6733] 2015-11-04T16:14:12.245+0000 2015-11-04T16:14:12.244+0000 E QUERY    [thread1] Error: error doing query: failed :
[js_test:server6733] 2015-11-04T16:14:12.245+0000 DB.prototype.runCommand@src/mongo/shell/db.js:132:1
[js_test:server6733] 2015-11-04T16:14:12.245+0000 @jstests/noPassthrough/server6733.js:34:13
[js_test:server6733] 2015-11-04T16:14:12.246+0000 assert.soon@src/mongo/shell/assert.js:194:17
[js_test:server6733] 2015-11-04T16:14:12.246+0000 @jstests/noPassthrough/server6733.js:32:1
[js_test:server6733] 2015-11-04T16:14:12.246+0000 
[js_test:server6733] 2015-11-04T16:14:12.246+0000 failed to load: jstests/noPassthrough/server6733.js

Comment by Githook User [ 04/Nov/15 ]

Author:

{u'username': u'dannenberg', u'name': u'matt dannenberg', u'email': u'matt.dannenberg@10gen.com'}

Message: SERVER-21282 insert after the spanning tree is established in noPassthrough/server6733.js
Branch: master
https://github.com/mongodb/mongo/commit/12bb61d523692bc0fee0d8f9b3e99889dd13a81e

Comment by Eric Milkie [ 03/Nov/15 ]

It seems to me like this could be fixed if we made sure we did "an initial write" after the secondary nodes have done initial sync. Seems like the write may happen before initial sync finishes, in which case node B doesn't get a sync source later on because there is no new data to sync. Makes it hard for A to sync from B then.

Comment by Matt Dannenberg [ 03/Nov/15 ]

milkie I believe this is fallout from "SERVER-20804 make shouldChangeSyncSource return true if our sync source has no sync source and is not ahead of us"

We patched the other tests that used syncFrom so that the test would check for the change on the node who was supposed to change sync source. Here that is already the case. I'm not sure what we should do, but I fear we may want to re-work how syncFrom works?

Comment by Geert Bosch [ 03/Nov/15 ]

assert.soon failed, msg:A refused to sync from B
New occurrence today: https://evergreen.mongodb.com/task/mongodb_mongo_master_linux_64_noPassthrough_29fcc4fc6aff21aa1ef7df1585effb050f764f29_15_11_03_19_10_56

Generated at Thu Feb 08 03:56:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.