Details
-
Bug
-
Status: Closed
-
Major - P3
-
Resolution: Duplicate
-
3.4.3
-
None
-
None
-
Replication
-
ALL
-
Description
Initial sync from a MMAPv1 replicaset to a new WT secondary, both running 3.4.3:
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] db version v3.4.3 |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] git version: f07437fb5a6cca07c10bafa78365456eb1d6d5e1 |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014 |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] allocator: tcmalloc |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] modules: none |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] build environment: |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] distmod: ubuntu1404 |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] distarch: x86_64 |
2017-04-10T06:00:01.604+0000 I CONTROL [signalProcessingThread] target_arch: x86_64 |
The initial copy of the data was successful, but the oplog catchup stage failed due to an update that breached the 16MB limit:
2017-04-06T16:04:04.054+0000 F REPL [repl writer worker 11] warning: log line attempted (147kB) over max size (10kB), printing beginning and end ... writer worker caught exception: :: caused by :: 17
|
419 Resulting document after update is larger than 16777216 on: { ts: Timestamp 1491429857000|404, h: 8450293174279352223, v: 2, op: "u", ns: "[SNIP - large update]", o2: { _id: ObjectId('58961877c14
|
1580404e7f604') }, o: { $set: { [SNIP] } }
|
2017-04-06T16:04:04.055+0000 I REPL [repl writer worker 8] missing object not found on source. presumably deleted later in oplog
|
2017-04-06T16:04:04.055+0000 I REPL [repl writer worker 8] o2: { _id: ObjectId('58e4e2e9bc44b6c99ed9bb2b') }
|
2017-04-06T16:04:04.055+0000 I REPL [repl writer worker 8] o firstfield: $set
|
2017-04-06T16:04:04.067+0000 E REPL [replication-59] Failed to apply batch due to 'Location17419: error applying batch: Resulting document after update is larger than 16777216'
|
2017-04-06T16:04:04.075+0000 I REPL [replication-61] Finished fetching oplog during initial sync: CallbackCanceled: fetcher was shut down after previous batch was processed. Last fetched optime and ha
|
sh: { ts: Timestamp 1491494643000|612, t: -1 }[1653548606848978015]
|
2017-04-06T16:04:04.075+0000 I REPL [replication-61] Initial sync attempt finishing up.
|
2017-04-06T16:04:04.082+0000 I REPL [replication-61] warning: log line attempted (491kB) over max size (10kB), printing beginning and end ... Initial Sync Attempt Statistics: { failedInitialSyncAttemp
|
ts: 0, maxFailedInitialSyncAttempts: 10, initialSyncStart: new Date(1491397073200), initialSyncAttempts: [], fetchedMissingDocs: 110321, appliedOps: 27140000, initialSyncOplogStart: Timestamp 149139707400
|
0|51, initialSyncOplogEnd: Timestamp 1491494643000|612, databases: { databasesCloned: 44, [SNIP]: { collections: 1, clonedCollections: 1, start: new Date(1491397074124), end: new Date(1491397074208), elapsedMillis: 84, [SNIP]: { documentsToCopy: 14, documentsCopied: 14, indexes: 1, fetchedBatches: 1, start: new Date(1491397074124), end: new Date(1491397074208), elapsedMillis: 84 } }, [SNIP]: { collections: 51, clonedCollections: 51, start: new Date(1491397074208), end: new Date(1491397729132), elapsedMillis: 654924, [SNIP]: { documentsToCopy: 1, documentsCopied: 1, indexes: 1, fetchedBatches: 1, start: new Date(1491397074209), end: new Date(1491397074287), elapsedMillis: 78 }, [SNIP]: { documentsToCopy: 310824, documentsCopied: 310834, indexes: 2, fetchedBatches: 129, start: new Date(1491397074287), end: new Date(1491397082539), elapsedMillis: 8252 }, [SNIP]: { documentsToCopy: 2236029, documentsCopied: 2236405, indexes: 2, fetchedBatches: 109, start: new Date(1491397082539), end: new Date(1491397133928), elapsedMillis: 51389 }, [SNIP]: { documentsToCopy: 30770, documentsCopied: 30770, indexes: 2, fetchedBatches: 1, start: new Date(1491397133928), end: new Date(1491397134604), elapsedMillis: 676 }, [SNIP]: { documentsToCopy: 1188, documentsCopied: 1188, indexes: 1, fetchedBatches: 1, start: new Date(1491397134604), end: new Date(1491397134679), elapsedMillis: 75 }, [SNIP]: { documentsToCopy: 111810, documentsCopied: 111811, indexes: 2, fetchedBatches: 5, start: new Date(1491397134679), end: new Date(1491397137511), elapsedMillis: 2832 }, [SNIP]: { documentsToCopy: 16767, documentsCopied: 16767, indexes: 1, fetchedBatches: 1, start: new Date(1491397137511), end: new Date(1491397137922), elapsedMillis: 411 }, [SNIP]: { documentsToCopy: 85965, documentsCopied: 85969, indexes: 2, fetchedBatches: 11, start: new Date(1491397137921), end: new Date(1491397146202), elapsedMillis: 8281 }, [SNIP]: { documentsToCopy: 67820, documentsCopied: 67826, indexes: 2, fetchedBatches: 15, start: new Date(1491397146202), end: new Date(1491397156706), elapsedMillis: 10504 }, [SNIP]: { documentsToCopy: 5, documentsCopied: 5, indexes: 2, fetchedBatches: 1, start: new Date(1491397156706), end: new Date(1491397156765), elapsedMillis: 59 }, [SNIP]: { documentsToCopy: 18827, documentsCopied: 18827, indexes: 2, fetchedBatches: 1, start: new Date(1491397156765), end: new Date(1491397157548), elapsedMillis: 783 }, [SNIP]: { documentsToCopy: 9603, documentsCopied: 9603, indexes: 1, fetchedBatches: 1, start: new Date(1491397157547), end: new Date(1491397157821), elapsedMillis: 274 }, [SNIP]: { documentsToCopy: 86, documentsCopied: 86, indexes: 1, fetchedBatches: 1, start: new Date(1491397157821), end: new Date(1491397157842), e .......... alerts.krwfihep_client_alerts: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492468641), end: new Date(1491492468683), elapsedMillis: 42 }, [SNIP]: { documentsToCopy: 513, documentsCopied: 513, indexes: 7, fetchedBatches: 1, start: new Date(1491492468683), end: new Date(1491492468943), elapsedMillis: 260 }, [SNIP]: { documentsToCopy: 7, documentsCopied: 7, indexes: 7, fetchedBatches: 1, start: new Date(1491492468943), end: new Date(1491492469036), elapsedMillis: 93 }, [SNIP]: { documentsToCopy: 3245, documentsCopied: 3245, indexes: 7, fetchedBatches: 1, start: new Date(1491492469036), end: new Date(1491492470635), elapsedMillis: 1599 }, [SNIP]: { documentsToCopy: 921, documentsCopied: 921, indexes: 7, fetchedBatches: 1, start: new Date(1491492470635), end: new Date(1491492470715), elapsedMillis: 80 }, [SNIP]: { documentsToCopy: 1747, documentsCopied: 1747, indexes: 7, fetchedBatches: 1, start: new Date(1491492470715), end: new Date(1491492470993), elapsedMillis: 278 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492470993), end: new Date(1491492471052), elapsedMillis: 59 }, [SNIP]: { documentsToCopy: 21, documentsCopied: 21, indexes: 7, fetchedBatches: 1, start: new Date(1491492471052), end: new Date(1491492471102), elapsedMillis: 50 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492471102), end: new Date(1491492471168), elapsedMillis: 66 }, [SNIP]: { documentsToCopy: 4, documentsCopied: 4, indexes: 7, fetchedBatches: 1, start: new Date(1491492471168), end: new Date(1491492471241), elapsedMillis: 73 }, [SNIP]: { documentsToCopy: 6, documentsCopied: 6, indexes: 7, fetchedBatches: 1, start: new Date(1491492471240), end: new Date(1491492471317), elapsedMillis: 77 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492471317), end: new Date(1491492471389), elapsedMillis: 72 }, [SNIP]: { documentsToCopy: 0, documentsCopied: 0, indexes: 7, fetchedBatches: 0, start: new Date(1491492471389), end: new Date(1491492471468), elapsedMillis: 79 }, [SNIP]: { documentsToCopy: 1385, documentsCopied: 1385, indexes: 7, fetchedBatches: 1, start: new Date(1491492471468), end: new Date(1491492471827), elapsedMillis: 359 }, [SNIP]: { documentsToCopy: 598, documentsCopied: 598, indexes: 7, fetchedBatches: 1, start: new Date(1491492471827), end: new Date(1491492472019), elapsedMillis: 192 }, [SNIP]: { documentsToCopy: 100, documentsCopied: 100, indexes: 7, fetchedBatches: 1, start: new Date(1491492472019), end: new Date(1491492472094), elapsedMillis: 75 }, [SNIP]: { documentsToCopy: 79333, documentsCopied: 79333, indexes: 7, fetchedBatches: 14, start: new Date(1491492472094), end: new Date(1491492492457), elapsedMillis: 20363 } } } }
|
2017-04-06T16:04:04.082+0000 E REPL [replication-61] Initial sync attempt failed -- attempts left: 9 cause: Location17419: error applying batch: Resulting document after update is larger than 16777216
|
2017-04-06T16:04:05.083+0000 I REPL [replication-59] Starting initial sync (attempt 2 of 10)
|
2017-04-06T16:04:05.133+0000 I REPL [replication-59] sync source candidate: mongodbsvr1-p2:27028
|
2017-04-06T16:04:06.220+0000 I STORAGE [replication-59] dropAllDatabasesExceptLocal 44
|
2017-04-06T16:04:24.446+0000 I REPL [replication-59] ******
|
2017-04-06T16:04:24.446+0000 I REPL [replication-59] creating replication oplog of size: 51200MB...
|
2017-04-06T16:04:24.451+0000 I STORAGE [replication-59] WiredTigerRecordStoreThread local.oplog.rs already started
|
2017-04-06T16:04:24.451+0000 I STORAGE [replication-59] The size storer reports that the oplog contains 0 records totaling to 0 bytes
|
2017-04-06T16:04:24.451+0000 I STORAGE [replication-59] Scanning the oplog to determine where to place markers for truncation
|
2017-04-06T16:04:24.515+0000 I REPL [replication-59] ******
|
This then caused the whole initialsync process to fail, dropping all the data and starting again from the beginning. If there is an entry in the oplog that the MMAP engine coped with (to replicate to the secondary), then the initial sync should cope in the same way (rejecting the update and carrying on?).
Attachments
Issue Links
- duplicates
-
SERVER-7515 idempotence violation when intermediate document exceeds size threshold
-
- Closed
-