[SERVER-19618] Allow applyOps to ignore unique index constraints Created: 28/Jul/15  Updated: 26/Apr/19  Resolved: 29/Jul/15

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Kevin Pulo
Assignee: Unassigned
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-8967 Allow applyOps to ignore unique index... Closed
Related
related to SERVER-12508 Add Replica Set Restore Mode (for poi... Closed
related to TOOLS-176 Dump/Restore with --oplog not point-i... Closed
related to SERVER-10773 Provide mechanism for cloud to reliab... Closed
Backwards Compatibility: Major Change
Participants:

 Description   

Unique index constraints are already relaxed in other places where batches of operations are applied (atomically), e.g. initial sync, recovering, and rollback. They should also be relaxed when applying a batch of operations given to applyOps.

This means that the following should succeed:

// Initial state: two documents with distinct values in the uniquely indexed field "a".
var t = db.test;
t.drop();
t.ensureIndex( { a: 1}, { unique: true } );
t.insert( { _id : 1, a : 2 } );
t.insert( { _id : 2, a : 1 } );
// Oplog-format batch: op1 gives _id 1 the value a: 1 (colliding with _id 2), op2 then sets it back to a: 2.
var op1 = { "ts" : Timestamp( 1438051519, 1 ), "h" : NumberLong(-4881641317324516364), "v" : 2, "op" : "i", "ns" : "test.test", "o" : { "_id" : 1, "a" : 1 } };
var op2 = { "ts" : Timestamp( 1438051521, 1 ), "h" : NumberLong(6481857659867173805), "v" : 2, "op" : "u", "ns" : "test.test", "o2" : { "_id" : 1 }, "o" : { "$set" : { "a" : 2 } } };
var op3 = { "ts" : Timestamp( 1438051524, 1 ), "h" : NumberLong(4061210955299695112), "v" : 2, "op" : "i", "ns" : "test.test", "o" : { "_id" : 2, "a" : 1 } };
// The duplicate exists only between op1 and op2; by the end of the batch the data is consistent.
t.getDB().runCommand( { applyOps: [ op1, op2, op3 ] } );

Current behaviour is that it fails on the first operation:

> t.getDB().runCommand( { applyOps: [ op1, op2, op3 ] } );
{
        "errmsg" : "exception: E11000 duplicate key error index: test.test.$a_1 dup key: { : 1.0 }",
        "code" : 11000,
        "ok" : 0
}

If there are still duplicates in the db at the end of the batch (for example, by passing just op1 in the command above), then ideally this should be flagged somehow (since the db has entered an erroneous state).

The best case would be for the effects of the entire batch to be undone, and the applyOps command to fail. Unfortunately, by the time the end of the batch has been reached, the operations have already been applied and can't easily be undone. This means that actually rolling back is likely to be impossible.

This may not be a problem if subsequent applyOps batches are going to correct the problem (and there will be no other intervening writes). Since this is not certain to occur, it would still probably be best to issue a warning or (different) error message to alert the user of the situation (i.e. that there are now duplicates in the db).
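
The requested behaviour (apply the whole batch without enforcing uniqueness, then flag any duplicates that remain at the end) can be illustrated with a small in-memory model. This is only a sketch, not MongoDB code: the function name applyOpsRelaxed, its parameters, and the shape of its return value are hypothetical, and it assumes that an "i" op on an existing _id replaces the document and that "u" ops only use $set.

```javascript
// Hypothetical in-memory model of the proposal; it does not talk to a server.
// docs: array of documents with an _id; ops: oplog-style entries ("i" or "u");
// uniqueField: name of the field that is meant to be unique.
function applyOpsRelaxed(docs, ops, uniqueField) {
  // Copy the starting documents into a map keyed by _id.
  const state = new Map();
  for (const d of docs) {
    state.set(d._id, Object.assign({}, d));
  }

  // Apply every op without checking uniqueness (assumption: inserts act as upserts).
  for (const op of ops) {
    if (op.op === "i") {
      state.set(op.o._id, Object.assign({}, op.o));
    } else if (op.op === "u") {
      const target = state.get(op.o2._id);
      if (target && op.o.$set) {
        Object.assign(target, op.o.$set);
      }
    }
  }

  // Only now, at the end of the batch, group documents by the unique field.
  const idsByValue = new Map();
  for (const d of state.values()) {
    const value = d[uniqueField] === undefined ? null : d[uniqueField];
    const key = JSON.stringify(value);
    if (!idsByValue.has(key)) {
      idsByValue.set(key, []);
    }
    idsByValue.get(key).push(d._id);
  }

  // Any value held by more than one document is a leftover duplicate to report.
  const duplicates = [];
  for (const [key, ids] of idsByValue) {
    if (ids.length > 1) {
      duplicates.push({ value: JSON.parse(key), ids: ids });
    }
  }
  return { docs: Array.from(state.values()), duplicates: duplicates };
}

// Usage with the documents and ops from this ticket (ts, h, v and ns omitted).
const initial = [{ _id: 1, a: 2 }, { _id: 2, a: 1 }];
const op1 = { op: "i", o: { _id: 1, a: 1 } };
const op2 = { op: "u", o2: { _id: 1 }, o: { $set: { a: 2 } } };
const op3 = { op: "i", o: { _id: 2, a: 1 } };

console.log(applyOpsRelaxed(initial, [op1, op2, op3], "a").duplicates);
console.log(applyOpsRelaxed(initial, [op1], "a").duplicates);
```

In this model the full batch ends with no duplicates, while passing only op1 leaves the value 1 shared by _id 1 and _id 2, which is the case that would need the warning or error.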



 Comments   
Comment by Ian Whalen (Inactive) [ 26/Apr/19 ]

Switching "Drivers Changes Needed" from "Maybe" to "Not Needed" since this was closed as something other than Fixed.

Comment by Scott Hernandez (Inactive) [ 28/Jul/15 ]

The reason it is safe to ignore unique index constraints in replication is that replication ensures that a consistent state is reached; until then the member cannot be used, and those states are not visible. Basically, it allows internal, invalid intermediate states which are not valid to stop at or to expose externally.

If this allowed corrupting data (and I'm not sure how else to interpret allowing non-unique data in a unique index), there would be no way to stop users from seeing it, or to ensure that it gets fixed. In fact, parts of the server cannot deal with this illegal state and will shut the system down (by fassert) when they encounter it.

Generated at Thu Feb 08 03:51:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.