Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 2.4.8
Component/s: Index Maintenance, Replication
Operating System: ALL
As a result of SERVER-12406 (or by some other means), a secondary can have a document that the primary does not have. An insert that is legal on the primary may then violate a unique index constraint on the secondary.
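What follows is a minimal reproduction sketch in Python/pymongo, not the original reporter's script. The primary address asyasmacbook.local:40001, the index cs.csku.$v_1_p_1, and the ObjectId values come from the logs below; the secondary port 40002, the _id values, pymongo 4.x, and the "restart the secondary without --replSet to seed the divergent document" step are assumptions.

    from pymongo import MongoClient, ASCENDING
    from bson import ObjectId

    P = ObjectId("5152834b77232a55b1c298b6")
    V = ObjectId("5152836577232a55b1c298b9")

    # 1. On the primary, create the unique compound index seen in the log
    #    (cs.csku.$v_1_p_1); it replicates to the secondary.
    primary = MongoClient("asyasmacbook.local", 40001, directConnection=True)
    primary.cs.csku.create_index([("v", ASCENDING), ("p", ASCENDING)], unique=True)

    # 2. Simulate the SERVER-12406 divergence: restart the secondary without
    #    --replSet so it accepts direct writes, insert a document the primary
    #    does not have, then restart it as a replica-set member again.
    standalone = MongoClient("asyasmacbook.local", 40002, directConnection=True)
    standalone.cs.csku.insert_one({"_id": "rogue", "p": P, "v": V})

    # 3. Back on the primary, insert a document with the same (v, p) pair.
    #    The insert is legal there, but its oplog entry violates the unique
    #    index when the secondary applies it.
    primary.cs.csku.insert_one(
        {"_id": {"r": ObjectId("5121d0ad23957311532da2f1"), "m": "Double"},
         "p": P, "v": V})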
The observed behavior is that this insert causes the secondary to crash:
Fri Jan 17 19:32:00.760 [repl prefetch worker] index prefetch for op i
Fri Jan 17 19:32:00.760 [rsSync] replication batch size is 1
Fri Jan 17 19:32:00.760 [repl writer worker 1] applying op: { ts: Timestamp 1390015920000|1, h: 3574278741816169728, v: 2, op: "i", ns: "cs.csku", o: { _id: { r: ObjectId('5121d0ad23957311532da2f1'), m: "Double" }, p: ObjectId('5152834b77232a55b1c298b6'), v: ObjectId('5152836577232a55b1c298b9') } }
Fri Jan 17 19:32:00.760 [repl writer worker 1] User Assertion: 11000:E11000 duplicate key error index: cs.csku.$v_1_p_1 dup key: { : ObjectId('5152836577232a55b1c298b9'), : ObjectId('5152834b77232a55b1c298b6') }
Fri Jan 17 19:32:00.760 [repl writer worker 1] ERROR: writer worker caught exception: E11000 duplicate key error index: cs.csku.$v_1_p_1 dup key: { : ObjectId('5152836577232a55b1c298b9'), : ObjectId('5152834b77232a55b1c298b6') } on: { ts: Timestamp 1390015920000|1, h: 3574278741816169728, v: 2, op: "i", ns: "cs.csku", o: { _id: { r: ObjectId('5121d0ad23957311532da2f1'), m: "Double" }, p: ObjectId('5152834b77232a55b1c298b6'), v: ObjectId('5152836577232a55b1c298b9') } }
Fri Jan 17 19:32:00.761 [repl writer worker 1] Fatal Assertion 16360
0x10044c60b 0x100425837 0x10033c97f 0x10042cc48 0x10047f1a5 0x7fff8e8bb899 0x7fff8e8bb72a 0x7fff8e8bffc9
 0 mongod                    0x000000010044c60b _ZN5mongo15printStackTraceERSo + 43
 1 mongod                    0x0000000100425837 _ZN5mongo13fassertFailedEi + 151
 2 mongod                    0x000000010033c97f _ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE + 271
 3 mongod                    0x000000010042cc48 _ZN5mongo10threadpool6Worker4loopEv + 138
 4 mongod                    0x000000010047f1a5 thread_proxy + 229
 5 libsystem_pthread.dylib   0x00007fff8e8bb899 _pthread_body + 138
 6 libsystem_pthread.dylib   0x00007fff8e8bb72a _pthread_struct_init + 0
 7 libsystem_pthread.dylib   0x00007fff8e8bffc9 thread_start + 13
Fri Jan 17 19:32:00.763 [repl writer worker 1] ***aborting after fassert() failure
Fri Jan 17 19:32:00.763 Got signal: 6 (Abort trap: 6).
Fri Jan 17 19:32:00.765 Backtrace:
0x10044c60b 0x100001121 0x7fff936dc5aa 0x5 0x7fff8e978bba 0x100425875 0x10033c97f 0x10042cc48 0x10047f1a5 0x7fff8e8bb899 0x7fff8e8bb72a 0x7fff8e8bffc9
 0 mongod                    0x000000010044c60b _ZN5mongo15printStackTraceERSo + 43
 1 mongod                    0x0000000100001121 _ZN5mongo10abruptQuitEi + 225
 2 libsystem_platform.dylib  0x00007fff936dc5aa _sigtramp + 26
 3 ???                       0x0000000000000005 0x0 + 5
 4 libsystem_c.dylib         0x00007fff8e978bba abort + 125
 5 mongod                    0x0000000100425875 _ZN5mongo13fassertFailedEi + 213
 6 mongod                    0x000000010033c97f _ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE + 271
 7 mongod                    0x000000010042cc48 _ZN5mongo10threadpool6Worker4loopEv + 138
 8 mongod                    0x000000010047f1a5 thread_proxy + 229
 9 libsystem_pthread.dylib   0x00007fff8e8bb899 _pthread_body + 138
10 libsystem_pthread.dylib   0x00007fff8e8bb72a _pthread_struct_init + 0
11 libsystem_pthread.dylib   0x00007fff8e8bffc9 thread_start + 13
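For reference, the unnamed fields in the "dup key" message correspond to v and p of the index cs.csku.$v_1_p_1. The sketch below (Python/pymongo, against a hypothetical throwaway collection test.uniqdemo) shows the same E11000 as an ordinary client would see it: the client gets the duplicate-key error back as a reply, whereas the repl writer worker escalates it to Fatal Assertion 16360 and aborts the whole process.

    from pymongo import MongoClient
    from pymongo.errors import DuplicateKeyError
    from bson import ObjectId

    client = MongoClient("asyasmacbook.local", 40001)  # primary from the log
    coll = client.test.uniqdemo  # hypothetical throwaway collection
    coll.create_index([("v", 1), ("p", 1)], unique=True)

    doc = {"p": ObjectId("5152834b77232a55b1c298b6"),
           "v": ObjectId("5152836577232a55b1c298b9")}
    coll.insert_one(dict(doc))
    try:
        coll.insert_one(dict(doc))  # fresh _id, but collides on (v, p)
    except DuplicateKeyError as exc:
        # Same E11000 the writer worker hit above; a client merely gets
        # an error reply instead of taking the node down.
        print(exc.details["errmsg"])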
On subsequent restart, the secondary applies the same insert without complaint:
Fri Jan 17 19:33:51.832 [rsStart] replSet STARTUP2
Fri Jan 17 19:33:51.832 BackgroundJob starting: rsMgr
Fri Jan 17 19:33:51.833 [rsBackgroundSync] replset bgsync fetch queue set to: 52d9ef1f:1 -540961654815979307
Fri Jan 17 19:33:51.833 BackgroundJob starting: rsGhostSync
Fri Jan 17 19:33:51.833 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:51.833 BackgroundJob starting: ConnectBG
Fri Jan 17 19:33:51.834 [rsHealthPoll] replset info asyasmacbook.local:40001 thinks that we are down
Fri Jan 17 19:33:51.834 [rsHealthPoll] replSet member asyasmacbook.local:40001 is up
Fri Jan 17 19:33:51.834 [rsHealthPoll] replSet member asyasmacbook.local:40001 is now in state PRIMARY
Fri Jan 17 19:33:52.833 [rsSync] replSet still syncing, not yet to minValid optime 52d9f5b0:1
Fri Jan 17 19:33:52.833 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:53.429 [initandlisten] connection accepted from 127.0.0.1:58786 #1 (1 connection now open)
Fri Jan 17 19:33:53.429 [initandlisten] connection accepted from 127.0.0.1:58787 #2 (2 connections now open)
Fri Jan 17 19:33:53.834 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:54.835 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:55.835 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:56.836 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:57.837 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:58.837 [rsBackgroundSync] replSet syncing to: asyasmacbook.local:40001
Fri Jan 17 19:33:58.838 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:58.838 [rsSync] replSet still syncing, not yet to minValid optime 52d9f5b0:1
Fri Jan 17 19:33:58.838 BackgroundJob starting: ConnectBG
Fri Jan 17 19:33:58.841 [rsBackgroundSync] replSet remoteOldestOp: Jan 8 12:57:48 52cdbbcc:19
Fri Jan 17 19:33:58.841 [rsBackgroundSync] repl: local.oplog.rs.find({ ts: { $gte: Timestamp 1390014239000|1 } })
Fri Jan 17 19:33:58.841 [repl prefetch worker] opening db: cs
Fri Jan 17 19:33:58.841 [repl prefetch worker] mmf open /data/rs2/cs.ns
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf finishOpening 0x1336e7000 /data/rs2/cs.ns len:16777216
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf open /data/rs2/cs.0
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf finishOpening 0x1356e7000 /data/rs2/cs.0 len:67108864
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf open /data/rs2/cs.1
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf finishOpening 0x13d6e7000 /data/rs2/cs.1 len:134217728
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf close /data/rs2/cs.1
Fri Jan 17 19:33:58.842 [repl prefetch worker] _groupCommit
Fri Jan 17 19:33:58.844 [repl prefetch worker] groupCommit end
Fri Jan 17 19:33:58.845 [repl prefetch worker] index prefetch for op i
Fri Jan 17 19:33:58.845 [rsSync] replication batch size is 1
Fri Jan 17 19:33:58.845 [repl writer worker 1] applying op: { ts: Timestamp 1390015920000|1, h: 3574278741816169728, v: 2, op: "i", ns: "cs.csku", o: { _id: { r: ObjectId('5121d0ad23957311532da2f1'), m: "Double" }, p: ObjectId('5152834b77232a55b1c298b6'), v: ObjectId('5152836577232a55b1c298b9') } }
Fri Jan 17 19:33:59.839 [rsSyncNotifier] replset markOplog: 0:0 52d9f5b0:1
Fri Jan 17 19:33:59.839 [rsSyncNotifier] replset setting oplog notifier to asyasmacbook.local:40001
Fri Jan 17 19:33:59.840 BackgroundJob starting: ConnectBG
Fri Jan 17 19:33:59.840 [rsSyncNotifier] replHandshake res not: 1 res: { ok: 1.0 }
Fri Jan 17 19:33:59.841 [rsSyncNotifier] repl: local.oplog.rs.find({ ts: { $gte: Timestamp 1390015920000|1 } })
Fri Jan 17 19:33:59.846 [rsSync] replSet SECONDARY
The secondary now has two documents with the same key, in violation of the unique index.
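The duplicate can be confirmed by reading the collection directly from the secondary. A sketch, again assuming the secondary listens on port 40002 and pymongo 4.x:

    from pymongo import MongoClient, ReadPreference
    from bson import ObjectId

    secondary = MongoClient("asyasmacbook.local", 40002, directConnection=True)
    coll = secondary.cs.csku.with_options(
        read_preference=ReadPreference.SECONDARY_PREFERRED)

    p = ObjectId("5152834b77232a55b1c298b6")
    v = ObjectId("5152836577232a55b1c298b9")

    # Two documents share the same (v, p) key even though
    # cs.csku.$v_1_p_1 is a unique index.
    print(coll.count_documents({"v": v, "p": p}))  # prints 2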