|
There is a known failure that can happen when running this passthrough which will be fixed after SERVER-32767.
The failure:
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.663+0000 assert failed : checkOplogs, non-matching oplog entries for the following nodes:
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.670+0000 localhost:20250: { "ts" : Timestamp(1523310782, 1), "t" : NumberLong(19), "h" : NumberLong("-1287453526515208947"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "ui" : UUID("1090b5b8-17e1-486e-92d4-da3c23278d29"), "wall" : ISODate("2018-04-09T21:53:02.291Z"), "o" : { "create" : "basic5", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.basic5" } } }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.679+0000 localhost:20251: { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20), "h" : NumberLong("3072770154973711623"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "ui" : UUID("ac796c39-3fd2-480c-8a51-be5dc075038b"), "o2" : { "collectionOptions_old" : { "uuid" : UUID("ac796c39-3fd2-480c-8a51-be5dc075038b"), "flags" : 1 } }, "wall" : ISODate("2018-04-09T21:53:05.520Z"), "o" : { "collMod" : "ne1", "usePowerOf2Sizes" : true } }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.696+0000 doassert@src/mongo/shell/assert.js:18:14
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.704+0000 assert@src/mongo/shell/assert.js:146:9
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.705+0000 checkOplogs@src/mongo/shell/replsettest.js:1794:25
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.705+0000 ReplSetTest/this.checkReplicaSet@src/mongo/shell/replsettest.js:1378:13
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.705+0000 ReplSetTest/this.checkOplogs@src/mongo/shell/replsettest.js:1700:9
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.706+0000 @jstests/hooks/run_check_repl_oplogs.js:17:5
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.706+0000 @jstests/hooks/run_check_repl_oplogs.js:5:2
|
It's caused by:
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.613+0000 ReplSetTest Could not call ismaster on node connection to localhost:20250: Error: error doing query: failed: network error while attempting to run command 'ismaster' on host 'localhost:20250'
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.618+0000 ReplSetTest awaitReplication: starting: optime for primary, localhost:20252, is { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20) }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.618+0000 ReplSetTest awaitReplication: checking secondaries against latest primary optime { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20) }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.623+0000 ReplSetTest awaitReplication: checking secondary #1: localhost:20251
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.625+0000 ReplSetTest awaitReplication: optime for secondary #1, localhost:20251, is { "ts" : Timestamp(1523310785, 1), "t" : NumberLong(20) } but latest is { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20) }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.625+0000 ReplSetTest awaitReplication: secondary #1, localhost:20251, is NOT synced
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.826+0000 ReplSetTest awaitReplication: checking secondaries against latest primary optime { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20) }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.831+0000 ReplSetTest awaitReplication: checking secondary #1: localhost:20251
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.833+0000 ReplSetTest awaitReplication: optime for secondary #1, localhost:20251, is { "ts" : Timestamp(1523310785, 7), "t" : NumberLong(20) } but latest is { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20) }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:05.833+0000 ReplSetTest awaitReplication: secondary #1, localhost:20251, is NOT synced
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.041+0000 ReplSetTest awaitReplication: checking secondaries against latest primary optime { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20) }
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.049+0000 ReplSetTest awaitReplication: checking secondary #1: localhost:20251
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.050+0000 ReplSetTest awaitReplication: secondary #1, localhost:20251, is synced
|
[CheckReplOplogs:job1:basic5:CheckReplOplogs] 2018-04-09T21:53:06.055+0000 ReplSetTest awaitReplication: finished: all 1 secondaries synced at optime { "ts" : Timestamp(1523310785, 11), "t" : NumberLong(20) }
|
The primary (node 0) was killed, restarted and goes into rollback. At the same time, the test finishes and calls CheckReplOplog. During awaitReplication, ReplSetTest calls ismaster on node 0 but gets a network error while doing so. Now awaitReplication doesn't wait for node 0 to be replicated because it isn't part of the liveNodes list, so it returns before node 0 is synced. CheckReplOplog then fails because node 0 and the new primary have non-matching oplogs.
|