[SERVER-31060] Two phase drops with too long MMAPv1 index names must generate dropIndexes operations before drop operation
Created: 12/Sep/17  Updated: 29/Jan/18  Resolved: 02/Oct/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: William Schultz (Inactive)
Assignee: Benety Goh
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-31351 rolling back a collection drop with l... Closed
related to SERVER-29747 Two phase drops: drop indexes before ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2017-10-02, Repl 2017-10-23
Participants:

 Description   

Collection drops are two phase, so a collection is renamed to a temporary namespace before it is physically dropped. On MMAP, there is a hard limit on namespace lengths, so we drop any indexes whose names would be too long following the drop collection rename (SERVER-29747). Currently, we physically drop the offending indexes and then rename the dropped collection, but we generate the oplog entries in the reverse order. That is, we log the 'drop' collection operation first, and then any 'dropIndexes' operations.

In theory, this should break two phase drop behavior on MMAP when there are offending indexes (see jstests/replsets/drop_collections_two_phase_long_index_names.js): when secondaries apply the operations, they apply the drop collection operation first and attempt a rename that would produce index names that are too long, resulting in a fatal assertion. It seems that we inadvertently avoided this case by also not properly checking writesAreReplicated in the dropCollectionEvenIfSystem function. On the primary, we should perform the index drops and the collection rename, but the secondaries should simply apply the drop indexes oplog operations as normal operations and then apply the drop collection op.
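To make the ordering problem concrete, here is a toy model in Python. It is only a sketch and not server code: the 127-byte limit on "<db>.<coll>.$<index>", the "system.drop.<optime>.<coll>" drop-pending name, the ToyNode class, and the 70-character index name are all assumptions chosen for illustration.

```python
# Toy model only. The limit and the drop-pending name format below are
# assumptions for illustration, not taken from the server source.
MAX_INDEX_NS_LEN = 127  # assumed MMAPv1 limit on "<db>.<coll>.$<index>"


class NamespaceTooLong(Exception):
    """Stands in for the fatal assertion a secondary would hit."""


def index_ns(db, coll, index):
    return f"{db}.{coll}.${index}"


class ToyNode:
    def __init__(self, db):
        self.db = db
        self.collections = {}  # collection name -> set of index names

    def create(self, coll, indexes):
        self.collections[coll] = set(indexes)

    def apply(self, op):
        kind, coll, arg = op
        if kind == "dropIndexes":
            self.collections[coll].discard(arg)
        elif kind == "drop":
            # Phase one of the drop: rename to a drop-pending namespace.
            pending = f"system.drop.{arg}.{coll}"
            for idx in self.collections[coll]:
                if len(index_ns(self.db, pending, idx)) > MAX_INDEX_NS_LEN:
                    raise NamespaceTooLong(idx)
            self.collections[pending] = self.collections.pop(coll)


def replay(ops):
    """Apply ops in the given order to a fresh node and return it."""
    node = ToyNode("drop_collection_two_phase_long_index_names")
    node.create("collToDrop", {"_id_", "short_name", "a" * 70})
    for op in ops:
        node.apply(op)
    return node


drop = ("drop", "collToDrop", "1505236565i3t1")
drop_index = ("dropIndexes", "collToDrop", "a" * 70)
```

In this model, replay([drop_index, drop]) completes, while replay([drop, drop_index]) raises NamespaceTooLong on the rename, which mirrors the failure described above for a secondary that applies the entries in logged order.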

In general, it seems incorrect to log nonsensical oplog entries, i.e. a dropIndexes on a collection that no longer exists. This may manifest as a bug when trying to roll back a drop collection operation that drops indexes in this manner, since you would be rolling back a dropIndexes on a collection that doesn't exist. However, it might also be avoided there, since rollback via refetch takes some freedom in re-ordering the operations that it is reverting.

Below is the primary's oplog dumped from a run of drop_collections_two_phase_long_index_names.js (newest entry first). The 'dropIndexes' operation appears after the 'drop' operation on collection UUID '932ed914-d0d5-44ce-bba7-5c4a2b05adf0':

Dumping the latest 20 documents that match { } from the oplog oplog.rs of williams-ubuntu:20010
{  "ts" : Timestamp(1505236565, 4),  "t" : NumberLong(1),  "h" : NumberLong("-5285471193402663181"),  "v" : 2,  "op" : "c",  "ns" : "drop_collection_two_phase_long_index_names.$cmd",  "ui" : UUID("932ed914-d0d5-44ce-bba7-5c4a2b05adf0"),  "o2" : {  "v" : 2,  "key" : {  "a" : 1 },  "name" : "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",  "ns" : "drop_collection_two_phase_long_index_names.collToDrop" },  "wall" : ISODate("2017-09-12T17:16:05.374Z"),  "o" : {  "dropIndexes" : "collToDrop",  "index" : "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" } }
{  "ts" : Timestamp(1505236565, 3),  "t" : NumberLong(1),  "h" : NumberLong("2127366147023087314"),  "v" : 2,  "op" : "c",  "ns" : "drop_collection_two_phase_long_index_names.$cmd",  "ui" : UUID("932ed914-d0d5-44ce-bba7-5c4a2b05adf0"),  "wall" : ISODate("2017-09-12T17:16:05.373Z"),  "o" : {  "drop" : "collToDrop" } }
{  "ts" : Timestamp(1505236565, 2),  "t" : NumberLong(1),  "h" : NumberLong("2818049393288337882"),  "v" : 2,  "op" : "c",  "ns" : "drop_collection_two_phase_long_index_names.$cmd",  "ui" : UUID("932ed914-d0d5-44ce-bba7-5c4a2b05adf0"),  "wall" : ISODate("2017-09-12T17:16:05.040Z"),  "o" : {  "createIndexes" : "collToDrop",  "v" : 2,  "key" : {  "b" : 1 },  "name" : "short_name" } }
{  "ts" : Timestamp(1505236565, 1),  "t" : NumberLong(1),  "h" : NumberLong("-6587097313745957566"),  "v" : 2,  "op" : "c",  "ns" : "drop_collection_two_phase_long_index_names.$cmd",  "ui" : UUID("932ed914-d0d5-44ce-bba7-5c4a2b05adf0"),  "wall" : ISODate("2017-09-12T17:16:05.033Z"),  "o" : {  "createIndexes" : "collToDrop",  "v" : 2,  "key" : {  "a" : 1 },  "name" : "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" } }
{  "ts" : Timestamp(1505236559, 1),  "t" : NumberLong(1),  "h" : NumberLong("4104972344915778556"),  "v" : 2,  "op" : "c",  "ns" : "drop_collection_two_phase_long_index_names.$cmd",  "ui" : UUID("932ed914-d0d5-44ce-bba7-5c4a2b05adf0"),  "wall" : ISODate("2017-09-12T17:15:59.080Z"),  "o" : {  "create" : "collToDrop",  "idIndex" : {  "v" : 2,  "key" : {  "_id" : 1 },  "name" : "_id_",  "ns" : "drop_collection_two_phase_long_index_names.collToDrop" } } }
{  "ts" : Timestamp(1505236557, 2),  "t" : NumberLong(1),  "h" : NumberLong("8357852249975499374"),  "v" : 2,  "op" : "n",  "ns" : "",  "wall" : ISODate("2017-09-12T17:15:57.268Z"),  "o" : {  "msg" : "Reconfig set",  "version" : 2 } }
{  "ts" : Timestamp(1505236557, 1),  "t" : NumberLong(1),  "h" : NumberLong("-647256883118635236"),  "v" : 2,  "op" : "c",  "ns" : "config.$cmd",  "ui" : UUID("91cd72bd-d995-456f-a1b4-6a97d0d440e7"),  "wall" : ISODate("2017-09-12T17:15:57.187Z"),  "o" : {  "create" : "transactions",  "idIndex" : {  "v" : 2,  "key" : {  "_id" : 1 },  "name" : "_id_",  "ns" : "config.transactions" } } }
{  "ts" : Timestamp(1505236556, 1),  "t" : NumberLong(1),  "h" : NumberLong("5993574424827143666"),  "v" : 2,  "op" : "n",  "ns" : "",  "wall" : ISODate("2017-09-12T17:15:56.913Z"),  "o" : {  "msg" : "new primary" } }
{  "ts" : Timestamp(1505236554, 1),  "h" : NumberLong("1918102913157968804"),  "v" : 2,  "op" : "n",  "ns" : "",  "o" : {  "msg" : "initiating set" } }
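
The misordering in the dump can also be checked mechanically. The following is a minimal Python sketch, assuming the oplog entries have already been loaded as plain dicts with 'ts' as a (seconds, increment) tuple and 'ui' as a string; the function name misordered_drop_indexes and this simplified entry shape are hypothetical, and only the two relevant entries from the dump are reproduced.

```python
def misordered_drop_indexes(entries):
    """Return dropIndexes entries logged after a drop of the same collection UUID.

    entries: simplified oplog documents (dicts) with 'ts' as a
    (seconds, increment) tuple, 'op', 'ui' and 'o'. This shape is an
    assumption for illustration, not the driver's BSON representation.
    """
    dropped = set()  # UUIDs for which a 'drop' has already been seen
    bad = []
    for entry in sorted(entries, key=lambda e: e["ts"]):
        if entry.get("op") != "c":
            continue  # only command entries matter here
        cmd = entry["o"]
        if "drop" in cmd:
            dropped.add(entry.get("ui"))
        elif "dropIndexes" in cmd and entry.get("ui") in dropped:
            bad.append(entry)
    return bad


# The two relevant entries from the dump, reduced to the fields used above.
UUID_STR = "932ed914-d0d5-44ce-bba7-5c4a2b05adf0"
sample = [
    {"ts": (1505236565, 4), "op": "c", "ui": UUID_STR,
     "o": {"dropIndexes": "collToDrop", "index": "a" * 72}},
    {"ts": (1505236565, 3), "op": "c", "ui": UUID_STR,
     "o": {"drop": "collToDrop"}},
]

for entry in misordered_drop_indexes(sample):
    print("dropIndexes logged after drop:", entry["ts"])
```

The check sorts by timestamp first, so it works whether the entries are supplied newest first, as in the dump, or oldest first.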



 Comments   
Comment by Benety Goh [ 02/Oct/17 ]

Closing as 'Won't Fix'. See follow up work in SERVER-31351.

Comment by Benety Goh [ 02/Oct/17 ]

Upon further discussion, there's still an issue with collection drops in mixed clusters. This is described in SERVER-31351.

Comment by Benety Goh [ 21/Sep/17 ]

This should not cause any issues for applications that apply the oplog entries in the order they are generated (e.g. mongorestore --oplogReplay). The out-of-order dropIndexes entries would be redundant.

The only issue I see is if we somehow neglect to replicate the dropIndexes to a secondary node and then get into a rollback situation where the drop collection is reverted. In this scenario, we could fail to restore the indexes (under MMAPv1 only) with the long index names. I'm not sure how it's possible to get into this situation since the entries for the collection and index drops are all written in the same WriteUnitOfWork.

Fixing the ordering of these drop and dropIndexes entries would be nice.

Comment by Spencer Brody (Inactive) [ 12/Sep/17 ]

Assigning to Benety to investigate the scope of this issue and whether it can cause an actual bug.
