[SERVER-69389] Command checkAuthorization may throw ErrorCodes::NamespaceNotFound for existing collection while trying to resolve UUID to namespace when the node is shutting down. Created: 01/Sep/22  Updated: 29/Oct/23  Resolved: 15/Sep/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.4.18, 5.0.14, 6.0.3, 6.1.0-rc4, 6.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
is related to SERVER-33632 Make UUID catalog reload atomic Closed
is related to SERVER-69627 Clean up catalog close and storage en... Blocked
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.1, v6.0, v5.0, v4.4
Sprint: Execution Team 2022-09-19
Participants:
Linked BF Score: 11

 Description   

Nodes can throw ErrorCodes::NamespaceNotFound for an existing collection while resolving a UUID to a namespace if the node is shutting down and has already deregistered all collections, clearing the CollectionCatalog::_catalog map in the in-memory collection catalog. A few commands, such as count, resolve a UUID to a namespace as part of the authorization check without holding any locks. ErrorCodes::NamespaceNotFound causes the multi-tenant migration protocol to skip cloning the collection from donor to recipient, because the recipient assumes that the collection was dropped on the donor. The tenant migration may then go ahead and commit without having copied all of the tenant's collections from donor to recipient, leading to data corruption.

Note: This is a problem with logical initial sync as well.
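
The failure mode described above can be illustrated with a small single-threaded model. This is only a sketch: `Catalog`, `Outcome`, and `resolveForAuthCheck` are hypothetical stand-ins and not the server's real classes, and the UUID and namespace are reduced to plain strings. It shows why a lookup that runs after shutdown has cleared the map cannot tell "collection was dropped" apart from "catalog was emptied for shutdown".

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration; not the server's real types.
using UUID = std::string;
using Namespace = std::string;

enum class Outcome { Resolved, NamespaceNotFound };

struct Catalog {
    std::map<UUID, Namespace> byUuid;

    void registerCollection(const UUID& uuid, const Namespace& nss) {
        assert(!uuid.empty() && !nss.empty());
        byUuid[uuid] = nss;
    }

    // Models shutdown deregistering every collection and clearing the map.
    void deregisterAll() {
        byUuid.clear();
    }

    std::optional<Namespace> lookupNSSByUUID(const UUID& uuid) const {
        auto it = byUuid.find(uuid);
        if (it == byUuid.end()) {
            return std::nullopt;
        }
        return it->second;
    }
};

// Models an authorization check that resolves a UUID with no lock held.
// An empty lookup result is reported as NamespaceNotFound, whatever the cause.
Outcome resolveForAuthCheck(const Catalog& catalog, const UUID& uuid) {
    return catalog.lookupNSSByUUID(uuid) ? Outcome::Resolved
                                         : Outcome::NamespaceNotFound;
}
```

In this model, calling `resolveForAuthCheck` after `deregisterAll()` yields `Outcome::NamespaceNotFound` for a collection that was never dropped, which is the signal a cloner would misread as a drop on the donor.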



 Comments   
Comment by Githook User [ 17/Oct/22 ]

Author: Gregory Noma <gregory.noma@gmail.com> (gregorynoma)

Message: SERVER-69389 Populate shadow collection catalog on clean shutdown

(cherry picked from commit 5457b4527960627071d26310111b29510105d42f)
Branch: v4.4
https://github.com/mongodb/mongo/commit/1ff0f181d863dc3c27c8c4ae1bc9a6b174f09a15

Comment by Githook User [ 04/Oct/22 ]

Author: Gregory Noma <gregory.noma@gmail.com> (gregorynoma)

Message: SERVER-69389 Populate shadow collection catalog on clean shutdown

(cherry picked from commit 5457b4527960627071d26310111b29510105d42f)
Branch: v6.0
https://github.com/mongodb/mongo/commit/8839b9fe60369c98f70349796b46b89e9b56580c

Comment by Githook User [ 04/Oct/22 ]

Author: Gregory Noma <gregory.noma@gmail.com> (gregorynoma)

Message: SERVER-69389 Populate shadow collection catalog on clean shutdown

(cherry picked from commit 5457b4527960627071d26310111b29510105d42f)
Branch: v5.0
https://github.com/mongodb/mongo/commit/264aff7f5aa9f5c4d73b8c7cda66d4fb93be3711

Comment by Githook User [ 03/Oct/22 ]

Author: Gregory Noma <gregory.noma@gmail.com> (gregorynoma)

Message: SERVER-69389 Populate shadow collection catalog on clean shutdown

(cherry picked from commit 5457b4527960627071d26310111b29510105d42f)
Branch: v6.1
https://github.com/mongodb/mongo/commit/b6e0525d6de06526fb5f1e99fb842ad144fff02a

Comment by Githook User [ 15/Sep/22 ]

Author: Gregory Noma <gregory.noma@gmail.com> (gregorynoma)

Message: SERVER-69389 Populate shadow collection catalog on clean shutdown
Branch: master
https://github.com/mongodb/mongo/commit/5457b4527960627071d26310111b29510105d42f
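
The commit message says the shadow collection catalog is populated on clean shutdown. A minimal sketch of that idea follows, under stated assumptions: `ShadowedCatalog` and `deregisterAllForShutdown` are hypothetical names, strings stand in for UUIDs and namespaces, and the real change may differ in detail. The point is only that UUID-to-namespace pairs are saved before the main map is cleared, so later lookups can still resolve them.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration; not the server's real types.
using UUID = std::string;
using Namespace = std::string;

class ShadowedCatalog {
public:
    void registerCollection(const UUID& uuid, const Namespace& nss) {
        assert(!uuid.empty() && !nss.empty());
        _catalog[uuid] = nss;
    }

    // Models a clean shutdown: remember every UUID -> namespace pair in the
    // shadow map first, then clear the main map.
    void deregisterAllForShutdown() {
        _shadow = _catalog;
        _catalog.clear();
    }

    // Consults the main map first and falls back to the shadow map, so a
    // UUID that existed before shutdown still resolves to its namespace.
    std::optional<Namespace> lookupNSSByUUID(const UUID& uuid) const {
        auto it = _catalog.find(uuid);
        if (it != _catalog.end()) {
            return it->second;
        }
        if (_shadow) {
            auto shadowIt = _shadow->find(uuid);
            if (shadowIt != _shadow->end()) {
                return shadowIt->second;
            }
        }
        return std::nullopt;
    }

private:
    std::map<UUID, Namespace> _catalog;
    std::optional<std::map<UUID, Namespace>> _shadow;  // set only at shutdown
};
```

With this shape, a UUID that was registered before shutdown keeps resolving afterwards, while a UUID that was never registered still returns an empty result.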

Comment by Suganthi Mani [ 05/Sep/22 ]

We had a similar problem before, where a user thread trying to resolve a UUID to a namespace raced with the rollback thread trying to close the catalog. We fixed that problem by introducing a shadow catalog to allow such name resolution (see SERVER-33632). I am proposing a similar fix for this problem: introduce a new boolean variable, CollectionCatalog::_isShuttingDown, which is set to true when collections are deregistered due to shutdown. CollectionCatalog::lookupNSSByUUID() should then throw ErrorCodes::ShutdownInProgress if CollectionCatalog::_isShuttingDown is true. Since catalog updates are copy-on-write, clearing the _catalog map and setting _isShuttingDown to true are done atomically.
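
The proposal in this comment can be sketched as follows. This is an illustration of the proposal only, not of the committed change (the commit messages on this ticket describe populating the shadow catalog instead). `CatalogSnapshot`, `CatalogHolder`, and the `ShutdownInProgress` exception type are hypothetical names, and a mutex-protected `std::shared_ptr` swap stands in for the server's copy-on-write machinery.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not the server's real types.
using UUID = std::string;
using Namespace = std::string;

struct ShutdownInProgress : std::runtime_error {
    ShutdownInProgress() : std::runtime_error("shutdown in progress") {}
};

// One immutable version of the catalog. Readers only see whole snapshots.
struct CatalogSnapshot {
    std::map<UUID, Namespace> catalog;
    bool isShuttingDown = false;

    std::optional<Namespace> lookupNSSByUUID(const UUID& uuid) const {
        if (isShuttingDown) {
            throw ShutdownInProgress();  // instead of "not found"
        }
        auto it = catalog.find(uuid);
        if (it == catalog.end()) {
            return std::nullopt;
        }
        return it->second;
    }
};

class CatalogHolder {
public:
    std::shared_ptr<const CatalogSnapshot> get() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _current;
    }

    void registerCollection(const UUID& uuid, const Namespace& nss) {
        assert(!uuid.empty() && !nss.empty());
        write([&](CatalogSnapshot& next) { next.catalog[uuid] = nss; });
    }

    // Clearing the map and setting the flag happen on the same private copy,
    // so no reader can observe one change without the other.
    void deregisterAllForShutdown() {
        write([](CatalogSnapshot& next) {
            next.catalog.clear();
            next.isShuttingDown = true;
        });
    }

private:
    // Copy-on-write: copy the current snapshot, mutate the copy, publish it.
    template <typename Fn>
    void write(Fn&& mutate) {
        std::lock_guard<std::mutex> lk(_mutex);
        auto next = std::make_shared<CatalogSnapshot>(*_current);
        mutate(*next);
        _current = std::move(next);
    }

    mutable std::mutex _mutex;
    std::shared_ptr<const CatalogSnapshot> _current =
        std::make_shared<CatalogSnapshot>();
};
```

A reader holding a snapshot taken before shutdown keeps resolving UUIDs against that older version, while any reader that fetches the snapshot afterwards gets `ShutdownInProgress` rather than a misleading empty result.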

CC esha.maharishi@mongodb.com for Serverless visibility & matthew.russotto@mongodb.com for replication visibility as I believe logical initial sync has this same problem.

Generated at Thu Feb 08 06:13:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.