[SERVER-62101] Aggregation can run lock-free and expects a ViewCatalog access separate from the AutoGet*MaybeLockFree to always return a valid ViewCatalog -- not guaranteed Created: 16/Dec/21  Updated: 29/Oct/23  Resolved: 05/Jan/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.3.0

Type: Bug Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed Votes: 0
Labels: read-only-views
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-63684 Rollback SERVER-62101's work now that... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2021-12-27, Execution Team 2022-01-10
Participants:
Linked BF Score: 145

 Description   

The aggregation code calls DatabaseHolder::getViewCatalog()->resolveView without checking whether getViewCatalog returned a nullptr. It assumes the ViewCatalog exists because the call is gated by an autoGet->getView check.

Two possible solutions:

1) Aggregation can return an error when it finds that the ViewCatalog no longer exists (I rather like the simplicity). This case means the collection didn't exist when the command started, and then the view went away somehow during the command. Aggregate is about to drop its locks anyway, so anything can happen after view resolution.

2) Lock-free operations must support ViewCatalog::resolveView with the same ViewCatalog used to fetch the view – likely save the ViewCatalog shared_ptr on the AutoGet*LockFree, if we find a view.
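
For illustration, the two options above can be sketched roughly as follows. This is a minimal, self-contained sketch and not the server code: ViewCatalog, DatabaseHolder, resolveOrThrow and AutoGetLockFreeSketch are simplified hypothetical stand-ins, and the assumption that the catalog becomes null when the database is dropped is modelled by a plain shared_ptr reset.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these are the real MongoDB classes.
struct ViewCatalog {
    // view namespace -> namespace the view is defined on
    std::map<std::string, std::string> views;

    std::optional<std::string> resolveView(const std::string& ns) const {
        auto it = views.find(ns);
        if (it == views.end()) return std::nullopt;
        return it->second;
    }
};

struct DatabaseHolder {
    // Assumed to become null once the database is dropped.
    std::shared_ptr<const ViewCatalog> catalog;

    std::shared_ptr<const ViewCatalog> getViewCatalog() const { return catalog; }
    void dropDatabase() { catalog.reset(); }
};

// Option 1: look the catalog up again, but fail cleanly if it is gone.
std::string resolveOrThrow(const DatabaseHolder& holder, const std::string& ns) {
    auto catalog = holder.getViewCatalog();
    if (!catalog) throw std::runtime_error("database dropped during view resolution");
    auto resolved = catalog->resolveView(ns);
    if (!resolved) throw std::runtime_error("view no longer exists");
    return *resolved;
}

// Option 2: the lock-free helper keeps the same catalog it used to find the view,
// so a later resolveView cannot observe a missing catalog.
class AutoGetLockFreeSketch {
public:
    AutoGetLockFreeSketch(const DatabaseHolder& holder, std::string ns)
        : _ns(std::move(ns)), _catalog(holder.getViewCatalog()) {}

    bool hasView() const {
        return _catalog && _catalog->resolveView(_ns).has_value();
    }

    std::string resolveView() const {
        if (!hasView()) throw std::runtime_error("no view for this namespace");
        return *_catalog->resolveView(_ns);
    }

private:
    std::string _ns;
    std::shared_ptr<const ViewCatalog> _catalog;  // saved snapshot
};
```

In this sketch, option 1 turns the concurrent drop into an ordinary error, while option 2 keeps answering from the saved snapshot even after dropDatabase() has run.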

 



 Comments   
Comment by Githook User [ 04/Jan/22 ]

Author: Dianna Hohensee <dianna.hohensee@mongodb.com> (DiannaHohensee)

Message: SERVER-62101 Check whether the database was dropped before accessing the ViewCatalog in the aggregate command; and use a lock-free compatible collection lookup in ViewCatalog::resolveView()
Branch: master
https://github.com/mongodb/mongo/commit/fdca8a8d628d5480e3f552f86ce89aa0d234741f

Comment by Dianna Hohensee (Inactive) [ 04/Jan/22 ]

The failures started with SERVER-60672. The issue isn't actually the ViewCatalog, though I think that change still needs to be made. The issue is the use of lookupCollectionByNamespace under an AutoGet*MaybeLockFree helper. We need to use lookupCollectionByNamespaceForRead instead, which returns a shared_ptr<Collection> rather than the raw Collection* that lookupCollectionByNamespace wraps in the CollectionPtr it returns.
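
To illustrate why the return type matters without locks, here is a minimal sketch of the difference between a raw-pointer lookup and a shared_ptr lookup. CollectionCatalogSketch, lookupRaw and lookupForRead are hypothetical simplifications and do not reproduce the real CollectionCatalog signatures. The assumption is that nothing but the caller's own reference keeps a collection instance alive during a lock-free read.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a collection instance.
struct Collection {
    std::string ns;
};

class CollectionCatalogSketch {
public:
    void create(const std::string& ns) {
        _collections[ns] = std::make_shared<Collection>(Collection{ns});
    }

    void drop(const std::string& ns) { _collections.erase(ns); }

    // Raw pointer: only safe while something else (for example a lock)
    // prevents the entry from being destroyed.
    const Collection* lookupRaw(const std::string& ns) const {
        auto it = _collections.find(ns);
        return it == _collections.end() ? nullptr : it->second.get();
    }

    // Shared ownership: the caller's reference keeps the instance alive
    // even if the catalog entry is dropped concurrently.
    std::shared_ptr<const Collection> lookupForRead(const std::string& ns) const {
        auto it = _collections.find(ns);
        if (it == _collections.end()) return nullptr;
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Collection>> _collections;
};
```

A pointer obtained from lookupRaw before a drop would dangle afterwards, whereas a reference obtained from lookupForRead stays valid until the reader releases it.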

Comment by Pavithra Vetriselvan [ 17/Dec/21 ]

We should consider going with the first solution. We should also try to understand why this started happening recently.
