-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.4.3
-
Component/s: Replication
-
None
-
Replication
-
ALL
-
Initial sync from a MMAPv1 replicaset to a new WT secondary, both running 3.4.3:
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] db version v3.4.3 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] git version: f07437fb5a6cca07c10bafa78365456eb1d6d5e1 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] allocator: tcmalloc 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] modules: none 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] build environment: 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] distmod: ubuntu1404 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] distarch: x86_64 2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] target_arch: x86_64
The initial copy of the data was successful, but the oplog catchup stage failed due to an update that breached the 16MB limit:
2017-04-06T16:04:04.054+0000 F REPL [repl writer worker 11] warning: log line attempted (147kB) over max size (10kB), printing beginning and end ... writer worker caught exception: :: caused by :: 17 419 Resulting document after update is larger than 16777216 on: { ts: Timestamp 1491429857000|404, h: 8450293174279352223, v: 2, op: "u", ns: "[SNIP - large update]", o2: { _id: ObjectId('58961877c14 1580404e7f604') }, o: { $set: { [SNIP] } } 2017-04-06T16:04:04.055+0000 I REPL [repl writer worker 8] missing object not found on source. presumably deleted later in oplog 2017-04-06T16:04:04.055+0000 I REPL [repl writer worker 8] o2: { _id: ObjectId('58e4e2e9bc44b6c99ed9bb2b') } 2017-04-06T16:04:04.055+0000 I REPL [repl writer worker 8] o firstfield: $set 2017-04-06T16:04:04.067+0000 E REPL [replication-59] Failed to apply batch due to 'Location17419: error applying batch: Resulting document after update is larger than 16777216' 2017-04-06T16:04:04.075+0000 I REPL [replication-61] Finished fetching oplog during initial sync: CallbackCanceled: fetcher was shut down after previous batch was processed. Last fetched optime and ha sh: { ts: Timestamp 1491494643000|612, t: -1 }[1653548606848978015] 2017-04-06T16:04:04.075+0000 I REPL [replication-61] Initial sync attempt finishing up. 2017-04-06T16:04:04.082+0000 I REPL [replication-61] warning: log line attempted (491kB) over max size (10kB), printing beginning and end ... Initial Sync Attempt Statistics: { failedInitialSyncAttemp ts: 0, maxFailedInitialSyncAttempts: 10, initialSyncStart: new Date(1491397073200), initialSyncAttempts: [], fetchedMissingDocs: 110321, appliedOps: 27140000, initialSyncOplogStart: Timestamp 149139707400 0|51, initialSyncOplogEnd: Timestamp 1491494643000|612, databases: { databasesCloned: 44, [SNIP]: { collections: 1, clonedCollections: 1, start: new Date(1491397074124), end: new Date(1491397074208), elapsedMillis: 84, [SNIP]: { documentsToCopy: 14, documentsCopied: 14, indexes: 1, fetchedBatches: 1, start: new Date(1491397074124), end: new Date(1491397074208), elapsedMillis: 84 } }, [SNIP]: { collections: 51, clonedCollections: 51, start: new Date(1491397074208), end: new Date(1491397729132), elapsedMillis: 654924, [SNIP]: { documentsToCopy: 1, documentsCopied: 1, indexes: 1, fetchedBatches: 1, start: new Date(1491397074209), end: new Date(1491397074287), elapsedMillis: 78 }, [SNIP]: { documentsToCopy: 310824, documentsCopied: 310834, indexes: 2, fetchedBatches: 129, start: new Date(1491397074287), end: new Date(1491397082539), elapsedMillis: 8252 }, [SNIP]: { documentsToCopy: 2236029, documentsCopied: 2236405, indexes: 2, fetchedBatches: 109, start: new Date(1491397082539), end: new Date(1491397133928), elapsedMillis: 51389 }, [SNIP]: { documentsToCopy: 30770, documentsCopied: 30770, indexes: 2, fetchedBatches: 1, start: new Date(1491397133928), end: new Date(1491397134604), elapsedMillis: 676 }, [SNIP]: { documentsToCopy: 1188, documentsCopied: 1188, indexes: 1, fetchedBatches: 1, start: new Date(1491397134604), end: new Date(1491397134679), elapsedMillis: 75 }, [SNIP]: { documentsToCopy: 111810, documentsCopied: 111811, indexes: 2, fetchedBatches: 5, start: new Date(1491397134679), end: new Date(1491397137511), elapsedMillis: 2832 }, [SNIP]: { documentsToCopy: 16767, documentsCopied: 16767, indexes: 1, fetchedBatches: 1, start: new Date(1491397137511), end: new Date(1491397137922), elapsedMillis: 411 }, [SNIP]: { documentsToCopy: 85965, documentsCopied: 85969, indexes: 2, fetchedBatches: 11, start: new Date(1491397137921), end: new Date(1491397146202), elapsedMillis: 8281 }, [SNIP]: { documentsToCopy: 67820, documentsCopied: 67826, indexes: 2, fetchedBatches: 15, start: new Date(1491397146202), end: new Date(1491397156706), elapsedMillis: 10504 }, [SNIP]: { documentsToCopy: 5, documentsCopied: 5, indexes: 2, fetchedBatches: 1, start: new Date(1491397156706), end: new Date(1491397156765), elapsedMillis: 59 }, [SNIP]: { documentsToCopy: 18827, documentsCopied: 18827, indexes: 2, fetchedBatches: 1, start: new Date(1491397156765), end: new Date(1491397157548), elapsedMillis: 783 }, [SNIP]: { documentsToCopy: 9603, documentsCopied: 9603, indexes: 1, fetchedBatches: 1, start: new Date(1491397157547), end: new Date(1491397157821), elapsedMillis: 274 }, [SNIP]: { documentsToCopy: 86, documentsCopied: 86, indexes: 1, fetchedBatches: 1, start: new Date(1491397157821), end: new Date(1491397157842), e .......... alerts.krwfihep_client_alerts: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492468641), end: new Date(1491492468683), elapsedMillis: 42 }, [SNIP]: { documentsToCopy: 513, documentsCopied: 513, indexes: 7, fetchedBatches: 1, start: new Date(1491492468683), end: new Date(1491492468943), elapsedMillis: 260 }, [SNIP]: { documentsToCopy: 7, documentsCopied: 7, indexes: 7, fetchedBatches: 1, start: new Date(1491492468943), end: new Date(1491492469036), elapsedMillis: 93 }, [SNIP]: { documentsToCopy: 3245, documentsCopied: 3245, indexes: 7, fetchedBatches: 1, start: new Date(1491492469036), end: new Date(1491492470635), elapsedMillis: 1599 }, [SNIP]: { documentsToCopy: 921, documentsCopied: 921, indexes: 7, fetchedBatches: 1, start: new Date(1491492470635), end: new Date(1491492470715), elapsedMillis: 80 }, [SNIP]: { documentsToCopy: 1747, documentsCopied: 1747, indexes: 7, fetchedBatches: 1, start: new Date(1491492470715), end: new Date(1491492470993), elapsedMillis: 278 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492470993), end: new Date(1491492471052), elapsedMillis: 59 }, [SNIP]: { documentsToCopy: 21, documentsCopied: 21, indexes: 7, fetchedBatches: 1, start: new Date(1491492471052), end: new Date(1491492471102), elapsedMillis: 50 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492471102), end: new Date(1491492471168), elapsedMillis: 66 }, [SNIP]: { documentsToCopy: 4, documentsCopied: 4, indexes: 7, fetchedBatches: 1, start: new Date(1491492471168), end: new Date(1491492471241), elapsedMillis: 73 }, [SNIP]: { documentsToCopy: 6, documentsCopied: 6, indexes: 7, fetchedBatches: 1, start: new Date(1491492471240), end: new Date(1491492471317), elapsedMillis: 77 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492471317), end: new Date(1491492471389), elapsedMillis: 72 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492471389), end: new Date(1491492471468), elapsedMillis: 79 }, [SNIP]: { documentsToCopy: 1385, documentsCopied: 1385, indexes: 7, fetchedBatches: 1, start: new Date(1491492471468), end: new Date(1491492471827), elapsedMillis: 359 }, [SNIP]: { documentsToCopy: 598, documentsCopied: 598, indexes: 7, fetchedBatches: 1, start: new Date(1491492471827), end: new Date(1491492472019), elapsedMillis: 192 }, [SNIP]: { documentsToCopy: 100, documentsCopied: 100, indexes: 7, fetchedBatches: 1, start: new Date(1491492472019), end: new Date(1491492472094), elapsedMillis: 75 }, [SNIP]: { documentsToCopy: 79333, documentsCopied: 79333, indexes: 7, fetchedBatches: 14, start: new Date(1491492472094), end: new Date(1491492492457), elapsedMillis: 20363 } } } } 2017-04-06T16:04:04.082+0000 E REPL [replication-61] Initial sync attempt failed -- attempts left: 9 cause: Location17419: error applying batch: Resulting document after update is larger than 16777216 2017-04-06T16:04:05.083+0000 I REPL [replication-59] Starting initial sync (attempt 2 of 10) 2017-04-06T16:04:05.133+0000 I REPL [replication-59] sync source candidate: mongodbsvr1-p2:27028 2017-04-06T16:04:06.220+0000 I STORAGE [replication-59] dropAllDatabasesExceptLocal 44 2017-04-06T16:04:24.446+0000 I REPL [replication-59] ****** 2017-04-06T16:04:24.446+0000 I REPL [replication-59] creating replication oplog of size: 51200MB... 2017-04-06T16:04:24.451+0000 I STORAGE [replication-59] WiredTigerRecordStoreThread local.oplog.rs already started 2017-04-06T16:04:24.451+0000 I STORAGE [replication-59] The size storer reports that the oplog contains 0 records totaling to 0 bytes 2017-04-06T16:04:24.451+0000 I STORAGE [replication-59] Scanning the oplog to determine where to place markers for truncation 2017-04-06T16:04:24.515+0000 I REPL [replication-59] ******
This then caused the whole initialsync process to fail, dropping all the data and starting again from the beginning. If there is an entry in the oplog that the MMAP engine coped with (to replicate to the secondary), then the initial sync should cope in the same way (rejecting the update and carrying on?).
- duplicates
-
SERVER-7515 idempotence violation when intermediate document exceeds size threshold
- Open