[SERVER-35804] Disallow dropping collections under config/admin via mongos Created: 26/Jun/18  Updated: 29/Oct/23  Resolved: 15/Apr/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Bug Priority: Major - P3
Reporter: Xiangyu Yao (Inactive) Assignee: Randolph Tan
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-35828 Check the readSource in dropCollection Closed
Documented
is documented by DOCS-13595 Investigate changes in SERVER-35804: ... Closed
Problem/Incident
causes SERVER-54977 Clarify that admin database collectio... Closed
Related
related to SERVER-47896 blacklist basic_drop_coll.js and drop... Closed
is related to SERVER-47550 Complete TODO listed in SERVER-35504 Backlog
Backwards Compatibility: Minor Change
Operating System: ALL
Steps To Reproduce:

See the linked BF.

Sprint: Sharding 2020-04-20
Participants:
Linked BF Score: 45

 Description   

With this change, attempting to drop a collection under the admin or config database through mongos throws an error, and calling dropDatabase on admin or config similarly throws an error. It is still possible to drop these collections when connecting directly to the config server.
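A minimal client-side sketch of the behavior described above, using the mongocxx driver purely for illustration. The driver choice, the connection string, and the collection name are assumptions; the ticket only specifies that the server now rejects these drops when they go through mongos, and does not pin down the exact error code.

    #include <iostream>
    #include <mongocxx/client.hpp>
    #include <mongocxx/exception/operation_exception.hpp>
    #include <mongocxx/instance.hpp>
    #include <mongocxx/uri.hpp>

    int main() {
        mongocxx::instance inst{};

        // Hypothetical mongos endpoint.
        mongocxx::client router{mongocxx::uri{"mongodb://localhost:27017"}};

        try {
            // After this change, a drop routed through mongos for a collection
            // under 'config' (or 'admin') is rejected by the server.
            router["config"]["my_collection"].drop();
        } catch (const mongocxx::operation_exception& e) {
            std::cout << "drop via mongos rejected: " << e.what() << '\n';
        }

        // Connecting directly to the config server (instead of mongos) still
        // permits the drop, per the description above.
        return 0;
    }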

Original Description:

_configsvrDropCollection calls lockWithSessionID(), which sets the readSource to kMajorityCommitted, and all subsequent operations issued by _configsvrDropCollection erroneously keep using kMajorityCommitted as their readSource.

Here is what happened in BF-9590:
1. The config server primary received the _configsvrDropCollection command.
2. The distributed lock manager ran lockWithSessionID, which triggered a majority read, so the readSource of the WiredTigerRecoveryUnit was set to 'majority'.
3. Another thread had just finished adding an index to the collection "admin.mod1".
4. _configsvrDropCollection eventually called dropCollection, which checks whether the number of indexes on disk equals the number in memory. Since the readSource was still 'majority', the on-disk catalog was read at the majority timestamp, before the index catalog change had been committed, while the in-memory index catalog already contained the new index.

Update:
_configsvrDropCollection() ended up calling dropCollection() locally because the collection is in the admin database, whose primary shard is the config server itself.
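To make the sequence above concrete, here is a small standalone toy model of the read-source state; it is not server code, and FakeRecoveryUnit, acquireDistLock, and dropCollectionLocally are stand-ins invented for illustration:

    #include <iostream>

    // Mirrors the ReadSource values named in this ticket.
    enum class ReadSource { kUnset, kMajorityCommitted };

    struct FakeRecoveryUnit {
        ReadSource source = ReadSource::kUnset;
        void setTimestampReadSource(ReadSource s) { source = s; }
    };

    // Stand-in for DistLockManager::lockWithSessionID(): the majority read it
    // performs leaves the recovery unit pinned to the majority snapshot.
    void acquireDistLock(FakeRecoveryUnit& ru) {
        ru.setTimestampReadSource(ReadSource::kMajorityCommitted);
    }

    // Stand-in for the local dropCollection(): it compares the on-disk index
    // catalog (read at the recovery unit's snapshot) with the in-memory catalog,
    // which already contains the index another thread just finished building.
    void dropCollectionLocally(const FakeRecoveryUnit& ru) {
        if (ru.source == ReadSource::kMajorityCommitted) {
            std::cout << "on-disk count (stale majority snapshot) != in-memory count"
                      << " -> invariant failure in index_catalog_impl.cpp\n";
        } else {
            std::cout << "catalogs agree, drop proceeds\n";
        }
    }

    int main() {
        FakeRecoveryUnit ru;
        acquireDistLock(ru);          // step 2
        dropCollectionLocally(ru);    // step 4: hits the invariant
        return 0;
    }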



 Comments   
Comment by Githook User [ 14/Apr/20 ]

Author:

{'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}

Message: SERVER-35804 Disallow dropping config and admin from mongos
Branch: master
https://github.com/mongodb/mongo/commit/eb939801412d100b3d0a09ab9dacfc1c64694395

Comment by Dianna Hohensee (Inactive) [ 26/Jun/18 ]

So this is happening because the fuzzer is targeting admin.mod1, a collection it previously created on the config server, in the _configsvrDropCollection command.

_configsvrDropCollection starts running before the createCollection, but apparently it takes a long time to acquire the distlocks:

[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:12.886+0000 D TRACKING [conn247] Cmd: _configsvrDropCollection, TrackingId: 5b25483476322dfa8d75a72b|5b25483476322dfa8d75a72c

Then createCollection creates admin.mod1:

[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.388+0000 I STORAGE  [conn277] createCollection: admin.mod1 with generated UUID: f07cea58-baac-4610-9b08-574eea48e773

It then acquires the distlocks for admin and admin.mod1 and hits the invariant:

[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.392+0000 D SHARDING [conn247] trying to acquire new distributed lock for admin ( lock timeout : 900000 ms, ping interval : 30000 ms, process : ConfigServer ) with lockSessionID: 5b2548344bf4c0b03f3097db, why: dropCollection
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.396+0000 I COMMAND  [conn247] command config.locks appName: "MongoDB Shell" command: findAndModify { findAndModify: "locks", query: { _id: "admin", state: 0 }, update: { $set: { ts: ObjectId('5b2548344bf4c0b03f3097db'), state: 2, who: "ConfigServer:conn247", process: "ConfigServer", when: new Date(1529169973392), why: "dropCollection" } }, upsert: true, new: true, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } planSummary: IXSCAN { _id: 1 } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keysInserted:2 keysDeleted:2 numYields:0 reslen:467 locks:{ Global: { acquireCount: { r: 7, w: 3 } }, Database: { acquireCount: { r: 2, w: 3 } }, Collection: { acquireCount: { r: 2, w: 2 } }, oplog: { acquireCount: { w: 1 } } } protocol:op_msg 3ms
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.396+0000 I SHARDING [conn247] distributed lock 'admin' acquired for 'dropCollection', ts : 5b2548344bf4c0b03f3097db
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.396+0000 D SHARDING [conn247] trying to acquire new distributed lock for admin.mod1 ( lock timeout : 900000 ms, ping interval : 30000 ms, process : ConfigServer ) with lockSessionID: 5b2548354bf4c0b03f30986c, why: dropCollection
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.398+0000 I COMMAND  [conn247] command config.locks appName: "MongoDB Shell" command: findAndModify { findAndModify: "locks", query: { _id: "admin.mod1", state: 0 }, update: { $set: { ts: ObjectId('5b2548354bf4c0b03f30986c'), state: 2, who: "ConfigServer:conn247", process: "ConfigServer", when: new Date(1529169973396), why: "dropCollection" } }, upsert: true, new: true, writeConcern: { w: "majority", wtimeout: 15000 }, $db: "config" } planSummary: IXSCAN { _id: 1 } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keysInserted:2 keysDeleted:2 numYields:0 reslen:472 locks:{ Global: { acquireCount: { r: 9, w: 5 } }, Database: { acquireCount: { r: 2, w: 5 } }, Collection: { acquireCount: { r: 2, w: 3 } }, oplog: { acquireCount: { w: 2 } } } protocol:op_msg 1ms
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.398+0000 I SHARDING [conn247] distributed lock 'admin.mod1' acquired for 'dropCollection', ts : 5b2548354bf4c0b03f30986c
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.398+0000 I COMMAND  [conn247] command config.collections appName: "MongoDB Shell" command: find { find: "collections", filter: { _id: "admin.mod1" }, ntoreturn: 1, singleBatch: true, $readPreference: { mode: "nearest", tags: [] }, $db: "config" } planSummary: IDHACK keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:0 nreturned:0 reslen:341 locks:{ Global: { acquireCount: { r: 10, w: 5 } }, Database: { acquireCount: { r: 3, w: 5 } }, Collection: { acquireCount: { r: 3, w: 3 } }, oplog: { acquireCount: { w: 2 } } } protocol:op_msg 0ms
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.398+0000 I COMMAND  [conn247] CMD: drop admin.mod1
[ShardedClusterFixture:job0:configsvr:primary] 2018-06-16T17:26:13.399+0000 F -        [conn247] Invariant failure _collection->getCatalogEntry()->getTotalIndexCount(opCtx) == count src/mongo/db/catalog/index_catalog_impl.cpp 1116

So the fuzzer threw all the basic sanity assumptions out the window and this is really hilarious.

RecoveryUnit::setTimestampReadSource can be used to reset the majority readConcern effect on the RecoveryUnit after acquiring the distlock. It should use kUnset, which should let the drop command run without the unexpected majority readConcern setting. Storage is probably going to make dropCollection error out on unexpected readConcern settings for future ease, but _configsvrDropCollection still needs fixing somehow.
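A sketch of the workaround being discussed here (not the change that ultimately resolved the ticket, which disallowed these drops through mongos entirely), as it might look inside _configsvrDropCollection. It assumes the 4.0-era interfaces this thread names (opCtx->recoveryUnit(), RecoveryUnit::ReadSource::kUnset); the distlock call is paraphrased and exact signatures may differ:

    // After the distlock acquisition's majority read...
    auto dbDistLock = uassertStatusOK(catalogClient->getDistLockManager()->lock(
        opCtx, nss.db(), "dropCollection", DistLockManager::kDefaultLockTimeout));

    // ...undo the read source that lockWithSessionID() left behind, so the local
    // drop no longer reads the on-disk catalog at a stale majority timestamp.
    opCtx->recoveryUnit()->setTimestampReadSource(RecoveryUnit::ReadSource::kUnset);

    // The drop path (including the local dropCollection it ends up calling for
    // admin/config namespaces) now runs against an unpinned snapshot.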

Esha pointed out that passing the "admin" db to the high-level sharding drop collection function hardcodes the config server as the target (see here and here, courtesy of Esha).

Comment by Xiangyu Yao (Inactive) [ 26/Jun/18 ]

I just put some logs in the BF comments about the last few operations before the crash in that thread. They might be helpful.
1. It's the "admin.mod1" collection on the config primary. I am now confused why that could happen if adding an index only happens on the shard(s).
2. It's the normal dropCollection, and the check is here.

Comment by Esha Maharishi (Inactive) [ 26/Jun/18 ]

xiangyu.yao, I'm confused by the new description:

3. There was another thread which just finished adding an index to the collection.
4. _configsvrDropCollection finally called dropCollection which did a check whether the numbers of indexes on disk and in memory were equal. Since the readSource was still "majority", we read the on-disk catalog with "majority" timestamp when the change to index catalog was not made yet. But the in-memory index catalog has the new index.

What is "the collection" in "adding an index to the collection"? Is it one of the config collections (e.g. config.collections or config.chunks), or the collection being dropped? If it's the collection being dropped, adding that index should happen on only on the shard(s) where the collection exists, not on the config server.

What is the "dropCollection" in "_configsvrDropCollection finally called dropCollection"? Is it the ShardingCatalogManager's dropCollection? Where is the check for "check whether the numbers of indexes on disk and in memory were equal" coming from?

Comment by Esha Maharishi (Inactive) [ 26/Jun/18 ]

We can update the ticket to say that, but I think it will get put into the "remove distlock" project, then. It basically requires using a separate opCtx whenever we need to do a majority read from a metadata command (dianna.hohensee, does that allow you to use a fresh RecoveryUnit for the read?)
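A rough sketch of the separate-opCtx idea, as a fragment rather than a worked design. It uses client/opCtx helpers that exist in the server codebase (ServiceContext::makeClient, AlternativeClientRegion, Client::makeOperationContext), but the helper client name and the surrounding structure are assumptions:

    // Do the majority read on a throwaway operation context so its RecoveryUnit,
    // not the metadata command's, is the one pinned to the majority snapshot.
    auto newClient = opCtx->getServiceContext()->makeClient("majority-read-helper");
    {
        AlternativeClientRegion acr(newClient);
        auto majorityReadOpCtx = cc().makeOperationContext();
        // ... perform the majority read with majorityReadOpCtx.get() ...
    }
    // The original opCtx's RecoveryUnit was never switched to kMajorityCommitted,
    // so the subsequent local dropCollection sees a fresh catalog snapshot.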

Comment by Xiangyu Yao (Inactive) [ 26/Jun/18 ]

Got it. So the ticket should be to change the _configsvrDropCollection command so that it doesn't trigger any majority reads.

Comment by Dianna Hohensee (Inactive) [ 26/Jun/18 ]

I suggest modifying the drop command to handle the issue, like we've done for other commands, by not setting majority on the RecoveryUnit until the end. I didn't communicate this clearly to Xiangyu. The ticket should be updated.

We don't currently have a way of unsetting the readConcern on the RecoveryUnit.

Comment by Xiangyu Yao (Inactive) [ 26/Jun/18 ]

We just need to set the readSource of the recoveryUnit from kMajorityCommitted back to kUnset. The fact that multiple operations use the same recoveryUnit is certainly not ideal, but we can do this as a workaround.
dianna.hohensee mentioned that we had similar issues before.

Comment by Esha Maharishi (Inactive) [ 26/Jun/18 ]

xiangyu.yao, do you know if it is actually possible to set the read snapshot back to 'local' after setting it to 'majority'? My understanding was that this was not currently possible.
