[SERVER-25522] Hashing of _id during application of oplog entries must respect the collation
Created: 09/Aug/16  Updated: 13/Aug/16  Resolved: 11/Aug/16

Status: Closed
Project: Core Server
Component/s: Querying, Replication
Affects Version/s: None
Fix Version/s: 3.3.11

Type: Task
Priority: Major - P3
Reporter: David Storch
Assignee: David Storch
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Query 2016-08-29
Participants:

 Description   

Suppose that a user issues the following operations:

db.createCollection("c", {collation: {locale: "en_US", strength: 2}});
db.c.insert({_id: "foo"});
db.c.remove({_id: "FOO"});

Here the collection's default collation is case-insensitive, so the remove matches the inserted document and the collection is expected to be empty after these commands. The replication system, however, may distribute oplog application work on the secondary across applier workers based on the hash of the _id. This hash is currently not collation-aware, so "foo" and "FOO" can hash to different values. As a consequence, the two operations may be assigned to different applier workers and applied in the wrong order. The result would be data corruption: the primary would have no documents, but the secondary would still have the document {_id: "foo"}.
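
As a rough illustration of the hazard (not the server's actual hashing code), the sketch below uses a byte-wise FNV-1a hash as a stand-in for the _id hash. The function names and the writer count are hypothetical, and lowercasing is only a crude approximation of a case-insensitive collation.

```javascript
// Stand-in hash: 32-bit FNV-1a over the low byte of each character.
// Assumption: the real server hash differs; only the byte-wise nature matters here.
function fnv1a32(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i) & 0xff;
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Hypothetical: choose an applier worker from the raw bytes of the _id.
function writerForRawId(id, writerCount) {
  return fnv1a32(id) % writerCount;
}

// Hypothetical: choose a worker after folding case, approximating a
// collation-aware hash for a case-insensitive collation.
function writerForCaseFoldedId(id, writerCount) {
  return fnv1a32(id.toLowerCase()) % writerCount;
}

const exampleWriterCount = 16;
console.log(writerForRawId("foo", exampleWriterCount),
            writerForRawId("FOO", exampleWriterCount));
console.log(writerForCaseFoldedId("foo", exampleWriterCount),
            writerForCaseFoldedId("FOO", exampleWriterCount));
```

Nothing forces the first pair of worker indexes to match, which is the window for misordered application. The second pair always matches, because both _id values are reduced to the same key before hashing.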

The simplest fix is to change fillWriterVectors() to assign every oplog entry whose collection has a non-simple collation to the same vector, so that those entries are applied serially in oplog order.
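
A minimal sketch of that idea follows. It is not the server's C++ implementation: fillWriterVectorsSketch, simpleStringHash, the oplog entry shape, and the hasNonSimpleCollation callback are all hypothetical. It also assumes that the shared vector for a collated collection is chosen from the namespace hash alone, which is one possible reading of the proposal.

```javascript
// Hypothetical deterministic string hash; stands in for the real hash.
function simpleStringHash(s) {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (Math.imul(h, 31) + s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Distribute oplog entries across writer vectors.
// ops: array of {op, ns, id}; hasNonSimpleCollation: (ns) => boolean.
function fillWriterVectorsSketch(ops, writerCount, hasNonSimpleCollation) {
  const writerVectors = Array.from({ length: writerCount }, () => []);
  for (const entry of ops) {
    let hash = simpleStringHash(entry.ns);
    if (!hasNonSimpleCollation(entry.ns)) {
      // Simple collation: _id equality is byte-wise, so mixing in the
      // _id hash spreads work without reordering ops on one document.
      hash = (hash ^ simpleStringHash(String(entry.id))) >>> 0;
    }
    // Non-simple collation: the hash depends on the namespace only, so
    // every entry for that collection lands in one vector, in oplog order.
    writerVectors[hash % writerCount].push(entry);
  }
  return writerVectors;
}

const oplogBatch = [
  { op: "i", ns: "test.c", id: "foo" },
  { op: "d", ns: "test.c", id: "FOO" },
];
const filledVectors = fillWriterVectorsSketch(oplogBatch, 16, (ns) => ns === "test.c");
const nonEmptyVectors = filledVectors.filter((v) => v.length > 0);
console.log(nonEmptyVectors.length, nonEmptyVectors[0].map((e) => e.op));
```

With test.c marked as having a non-simple collation, the insert and the delete end up in a single vector with the insert first, so a single worker applies them in the original order. The cost is reduced parallelism for collated collections.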



 Comments   
Comment by Githook User [ 11/Aug/16 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-25522 ensure oplog entries applied serially to collection with non-simple collation

This restriction can be relaxed pending a mechanism for
collation-aware hashing of BSONElement.
Branch: master
https://github.com/mongodb/mongo/commit/afd98273f7a2d09a4b6ce29950d73f56dfaf3fc4
