[SERVER-12407] RECOVERING nodes may insert documents that violate unique indexes Created: 18/Jan/14  Updated: 11/Feb/15  Resolved: 21/Jan/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Replication
Affects Version/s: 2.4.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Asya Kamsky Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Participants:

 Description   

As a result of SERVER-12406 (or through some other mechanism), a secondary can hold a document that the primary does not have. An insert that is legal on the primary may then violate a unique index on the secondary (a minimal sketch of this state follows the first log excerpt below).

Observed behavior is that this insert causes the secondary to crash:

Fri Jan 17 19:32:00.760 [repl prefetch worker] index prefetch for op i
Fri Jan 17 19:32:00.760 [rsSync] replication batch size is 1
Fri Jan 17 19:32:00.760 [repl writer worker 1] applying op: { ts: Timestamp 1390015920000|1, h: 3574278741816169728, v: 2, op: "i", ns: "cs.csku", o: { _id: { r: ObjectId('5121d0ad23957311532da2f1'), m: "Double" }, p: ObjectId('5152834b77232a55b1c298b6'), v: ObjectId('5152836577232a55b1c298b9') } }
Fri Jan 17 19:32:00.760 [repl writer worker 1] User Assertion: 11000:E11000 duplicate key error index: cs.csku.$v_1_p_1  dup key: { : ObjectId('5152836577232a55b1c298b9'), : ObjectId('5152834b77232a55b1c298b6') }
Fri Jan 17 19:32:00.760 [repl writer worker 1] ERROR: writer worker caught exception: E11000 duplicate key error index: cs.csku.$v_1_p_1  dup key: { : ObjectId('5152836577232a55b1c298b9'), : ObjectId('5152834b77232a55b1c298b6') } on: { ts: Timestamp 1390015920000|1, h: 3574278741816169728, v: 2, op: "i", ns: "cs.csku", o: { _id: { r: ObjectId('5121d0ad23957311532da2f1'), m: "Double" }, p: ObjectId('5152834b77232a55b1c298b6'), v: ObjectId('5152836577232a55b1c298b9') } }
Fri Jan 17 19:32:00.761 [repl writer worker 1]   Fatal Assertion 16360
0x10044c60b 0x100425837 0x10033c97f 0x10042cc48 0x10047f1a5 0x7fff8e8bb899 0x7fff8e8bb72a 0x7fff8e8bffc9
 0   mongod                              0x000000010044c60b _ZN5mongo15printStackTraceERSo + 43
 1   mongod                              0x0000000100425837 _ZN5mongo13fassertFailedEi + 151
 2   mongod                              0x000000010033c97f _ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE + 271
 3   mongod                              0x000000010042cc48 _ZN5mongo10threadpool6Worker4loopEv + 138
 4   mongod                              0x000000010047f1a5 thread_proxy + 229
 5   libsystem_pthread.dylib             0x00007fff8e8bb899 _pthread_body + 138
 6   libsystem_pthread.dylib             0x00007fff8e8bb72a _pthread_struct_init + 0
 7   libsystem_pthread.dylib             0x00007fff8e8bffc9 thread_start + 13
Fri Jan 17 19:32:00.763 [repl writer worker 1]
 
***aborting after fassert() failure
 
 
Fri Jan 17 19:32:00.763 Got signal: 6 (Abort trap: 6).
 
Fri Jan 17 19:32:00.765 Backtrace:
0x10044c60b 0x100001121 0x7fff936dc5aa 0x5 0x7fff8e978bba 0x100425875 0x10033c97f 0x10042cc48 0x10047f1a5 0x7fff8e8bb899 0x7fff8e8bb72a 0x7fff8e8bffc9
 0   mongod                              0x000000010044c60b _ZN5mongo15printStackTraceERSo + 43
 1   mongod                              0x0000000100001121 _ZN5mongo10abruptQuitEi + 225
 2   libsystem_platform.dylib            0x00007fff936dc5aa _sigtramp + 26
 3   ???                                 0x0000000000000005 0x0 + 5
 4   libsystem_c.dylib                   0x00007fff8e978bba abort + 125
 5   mongod                              0x0000000100425875 _ZN5mongo13fassertFailedEi + 213
 6   mongod                              0x000000010033c97f _ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE + 271
 7   mongod                              0x000000010042cc48 _ZN5mongo10threadpool6Worker4loopEv + 138
 8   mongod                              0x000000010047f1a5 thread_proxy + 229
 9   libsystem_pthread.dylib             0x00007fff8e8bb899 _pthread_body + 138
 10  libsystem_pthread.dylib             0x00007fff8e8bb72a _pthread_struct_init + 0
 11  libsystem_pthread.dylib             0x00007fff8e8bffc9 thread_start + 13
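
For concreteness, here is a minimal sketch (pymongo, with placeholder hosts, ports and values) of the state that produces the crash above: cs.csku has a unique index on { v: 1, p: 1 }, the secondary already holds a document with a given (v, p) pair that the primary lacks, and the primary then accepts an insert with the same (v, p) but a different _id.

# Sketch of the conflicting state; hosts, ports and document values are placeholders.
from bson import ObjectId
from pymongo import MongoClient

p_val, v_val = ObjectId(), ObjectId()

# On the primary: the unique index exists and no conflicting document is
# present, so this insert succeeds and is written to the oplog.
primary = MongoClient("localhost", 40001)
primary.cs.csku.create_index([("v", 1), ("p", 1)], unique=True)
primary.cs.csku.insert_one({"_id": {"r": ObjectId(), "m": "Double"},
                            "p": p_val, "v": v_val})

# The secondary, however, already holds a document with the same (v, p)
# pair but a different _id (e.g. left behind via SERVER-12406), so applying
# the oplog entry above raises E11000 and trips fassert 16360.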

On subsequent restart, it applies the same insert without complaint:

Fri Jan 17 19:33:51.832 [rsStart] replSet STARTUP2
Fri Jan 17 19:33:51.832 BackgroundJob starting: rsMgr
Fri Jan 17 19:33:51.833 [rsBackgroundSync] replset bgsync fetch queue set to: 52d9ef1f:1 -540961654815979307
Fri Jan 17 19:33:51.833 BackgroundJob starting: rsGhostSync
Fri Jan 17 19:33:51.833 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:51.833 BackgroundJob starting: ConnectBG
Fri Jan 17 19:33:51.834 [rsHealthPoll] replset info asyasmacbook.local:40001 thinks that we are down
Fri Jan 17 19:33:51.834 [rsHealthPoll] replSet member asyasmacbook.local:40001 is up
Fri Jan 17 19:33:51.834 [rsHealthPoll] replSet member asyasmacbook.local:40001 is now in state PRIMARY
Fri Jan 17 19:33:52.833 [rsSync] replSet still syncing, not yet to minValid optime 52d9f5b0:1
Fri Jan 17 19:33:52.833 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:53.429 [initandlisten] connection accepted from 127.0.0.1:58786 #1 (1 connection now open)
Fri Jan 17 19:33:53.429 [initandlisten] connection accepted from 127.0.0.1:58787 #2 (2 connections now open)
Fri Jan 17 19:33:53.834 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:54.835 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:55.835 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:56.836 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:57.837 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:58.837 [rsBackgroundSync] replSet syncing to: asyasmacbook.local:40001
Fri Jan 17 19:33:58.838 [rsSyncNotifier] replset markOplog: 0:0 52d9ef1f:1
Fri Jan 17 19:33:58.838 [rsSync] replSet still syncing, not yet to minValid optime 52d9f5b0:1
Fri Jan 17 19:33:58.838 BackgroundJob starting: ConnectBG
Fri Jan 17 19:33:58.841 [rsBackgroundSync] replSet remoteOldestOp: Jan  8 12:57:48 52cdbbcc:19
Fri Jan 17 19:33:58.841 [rsBackgroundSync] repl: local.oplog.rs.find({ ts: { $gte: Timestamp 1390014239000|1 } })
Fri Jan 17 19:33:58.841 [repl prefetch worker] opening db:  cs
Fri Jan 17 19:33:58.841 [repl prefetch worker] mmf open /data/rs2/cs.ns
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf finishOpening 0x1336e7000 /data/rs2/cs.ns len:16777216
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf open /data/rs2/cs.0
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf finishOpening 0x1356e7000 /data/rs2/cs.0 len:67108864
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf open /data/rs2/cs.1
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf finishOpening 0x13d6e7000 /data/rs2/cs.1 len:134217728
Fri Jan 17 19:33:58.842 [repl prefetch worker] mmf close /data/rs2/cs.1
Fri Jan 17 19:33:58.842 [repl prefetch worker] _groupCommit
Fri Jan 17 19:33:58.844 [repl prefetch worker] groupCommit end
Fri Jan 17 19:33:58.845 [repl prefetch worker] index prefetch for op i
Fri Jan 17 19:33:58.845 [rsSync] replication batch size is 1
Fri Jan 17 19:33:58.845 [repl writer worker 1] applying op: { ts: Timestamp 1390015920000|1, h: 3574278741816169728, v: 2, op: "i", ns: "cs.csku", o: { _id: { r: ObjectId('5121d0ad23957311532da2f1'), m: "Double" }, p: ObjectId('5152834b77232a55b1c298b6'), v: ObjectId('5152836577232a55b1c298b9') } }
Fri Jan 17 19:33:59.839 [rsSyncNotifier] replset markOplog: 0:0 52d9f5b0:1
Fri Jan 17 19:33:59.839 [rsSyncNotifier] replset setting oplog notifier to asyasmacbook.local:40001
Fri Jan 17 19:33:59.840 BackgroundJob starting: ConnectBG
Fri Jan 17 19:33:59.840 [rsSyncNotifier] replHandshake res not: 1 res: { ok: 1.0 }
Fri Jan 17 19:33:59.841 [rsSyncNotifier] repl: local.oplog.rs.find({ ts: { $gte: Timestamp 1390015920000|1 } })
Fri Jan 17 19:33:59.846 [rsSync] replSet SECONDARY

The secondary now holds two documents with the same key in the v_1_p_1 unique index, in violation of the unique constraint.
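
One way to confirm the resulting state is to run a duplicate-key check against the secondary once it is readable; a minimal pymongo sketch, grouping on the fields of the v_1_p_1 unique index (the secondary's address is a placeholder):

# Find (v, p) pairs that appear in more than one document, i.e. entries
# that violate the unique index v_1_p_1 on the secondary.
from pymongo import MongoClient, ReadPreference

client = MongoClient("localhost", 40002)  # placeholder secondary address
csku = client.cs.get_collection("csku",
                                read_preference=ReadPreference.SECONDARY)

pipeline = [
    {"$group": {"_id": {"v": "$v", "p": "$p"},
                "ids": {"$push": "$_id"},
                "n": {"$sum": 1}}},
    {"$match": {"n": {"$gt": 1}}},
]
for dup in csku.aggregate(pipeline):
    print("duplicate unique-index key:", dup["_id"], "docs:", dup["ids"])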



 Comments   
Comment by Eric Milkie [ 21/Jan/14 ]

The logic for idempotency in replication depends on the primary's logic being correct. It also depends on the storage subsystem identifying documents by _id. If either of these constraints is broken, the behavior will not be correct, and definitively detecting all such violations is not easy.
When a replica set node is in the recovery phase, documents must temporarily be allowed to violate unique indexes, due to idempotency issues. By the time the node transitions to SECONDARY state, the unique index will no longer be violated.
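
A simplified illustration of that policy, consistent with the two log excerpts above (silent apply while still recovering toward minValid, fassert 16360 during steady-state apply); this is an illustrative sketch, not the server's actual C++ code:

# Illustrative sketch only of the policy described above; not server code.
from pymongo.errors import DuplicateKeyError

def apply_oplog_insert(coll, op, recovering):
    try:
        coll.insert_one(op["o"])
    except DuplicateKeyError:
        if recovering:
            # Replaying ops up to minValid: unique-index violations are
            # tolerated, on the assumption that they resolve by the time
            # the node transitions to SECONDARY.
            return
        # Steady-state apply: the violation indicates divergence from the
        # primary, so the writer worker aborts (fassert 16360 in the log).
        raise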
