Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.2.0-rc3, 4.3.1
Affects Version/s: None
Component/s: Querying
Labels: query-44-grooming
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.2
Sprint: Query 2019-06-17, Query 2019-07-01
Linked BF Score: 59
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Queries periodically yield their intent locks and later recover them using Locker::LockSnapshot. After the lock snapshot is restored, PlanExecutor::restoreState() is called to perform several validity checks. One of these checks detects collection renames: it throws an exception if the Collection object's namespace string changed during the yield.

This check may be made without holding the correct lock on the Collection object. The resource IDs that we lock are derived from collection names, not collection UUIDs. Therefore, if a collection is renamed, the identity of the lock that must be held to protect the Collection object also changes. Consider a collection that is renamed from A to B during a yield. When the lock snapshot is restored, the query execution subsystem holds the lock on A again, not a lock on B. This means that the call to Collection::ns() may not be correctly synchronized with other threads executing drops or renames on that collection.
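The mismatch described above can be illustrated with a small toy model. This is only a sketch and not the server's locking code: `ToyResourceId`, `ToyLockSnapshot`, `resourceIdForNamespace`, `saveLocks`, and `snapshotCovers` are hypothetical names invented here. The one property it takes from the description is that a lock resource ID is derived from the collection name rather than its UUID.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a lock resource ID. In this toy model it is
// derived from the collection *name*, never from the collection UUID.
using ToyResourceId = std::string;

ToyResourceId resourceIdForNamespace(const std::string& ns) {
    assert(!ns.empty());  // toy-model precondition: a namespace is non-empty
    return "collection:" + ns;
}

// Toy lock snapshot: the set of resource IDs held when the query yielded.
struct ToyLockSnapshot {
    std::set<ToyResourceId> held;
};

// Record the lock that was held on the collection's name at yield time.
ToyLockSnapshot saveLocks(const std::string& nsAtYield) {
    ToyLockSnapshot snapshot;
    snapshot.held.insert(resourceIdForNamespace(nsAtYield));
    return snapshot;
}

// Returns true if restoring `snapshot` re-acquires the lock that protects a
// collection currently named `currentNs`.
bool snapshotCovers(const ToyLockSnapshot& snapshot, const std::string& currentNs) {
    return snapshot.held.count(resourceIdForNamespace(currentNs)) > 0;
}
```

In this model, a snapshot taken while the collection was named `test.A` covers `test.A` but not `test.B`, so after a rename from A to B the restored locks no longer protect the object being inspected.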

Credit to geert.bosch for helping me discover this issue and think through a fix. The easiest fix would be to use CollectionCatalog::lookupNSSByUUID() instead of CollectionCatalog::lookupCollectionByUUID() to safely determine whether a rename has occurred. If the namespace string has not changed, then we know that the correct locks were restored from the lock snapshot, and query execution can proceed. Otherwise, the query will be killed.
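A minimal sketch of that check, under stated assumptions, might look like the following. `ToyCatalog` and `namespaceUnchangedAfterYield` are hypothetical, and `lookupNssByUuid` is only loosely modelled on CollectionCatalog::lookupNSSByUUID(); it does not reproduce the real signature. The sketch assumes the catalog guards its UUID-to-namespace map with its own mutex, so the lookup is safe regardless of which collection locks the caller holds.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical catalog: maps a collection UUID (a plain string here) to its
// current namespace. All access goes through the catalog's own mutex.
class ToyCatalog {
public:
    void registerCollection(const std::string& uuid, const std::string& ns) {
        std::lock_guard<std::mutex> lk(_mutex);
        _nssByUuid[uuid] = ns;
    }

    // Renames the collection; returns false if the UUID is unknown.
    bool rename(const std::string& uuid, const std::string& newNs) {
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _nssByUuid.find(uuid);
        if (it == _nssByUuid.end()) return false;
        it->second = newNs;
        return true;
    }

    void drop(const std::string& uuid) {
        std::lock_guard<std::mutex> lk(_mutex);
        _nssByUuid.erase(uuid);
    }

    // Copies the current namespace into *out; returns false if the
    // collection no longer exists.
    bool lookupNssByUuid(const std::string& uuid, std::string* out) const {
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _nssByUuid.find(uuid);
        if (it == _nssByUuid.end()) return false;
        *out = it->second;
        return true;
    }

private:
    mutable std::mutex _mutex;
    std::map<std::string, std::string> _nssByUuid;
};

// Yield-recovery check: true means the namespace is unchanged, so the locks
// restored from the snapshot are the right ones and the query may proceed.
// False means the collection was renamed or dropped; the query is killed.
bool namespaceUnchangedAfterYield(const ToyCatalog& catalog,
                                  const std::string& uuid,
                                  const std::string& nsBeforeYield) {
    assert(!nsBeforeYield.empty());  // toy-model precondition
    std::string current;
    if (!catalog.lookupNssByUuid(uuid, &current)) return false;
    return current == nsBeforeYield;
}
```

The point of the design is that the check never dereferences the Collection object itself. It compares namespace strings obtained from the catalog, so it does not rely on the collection lock that may have gone stale during the yield.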

Additional work will be required if we ever want to allow queries to survive renames. In particular, we may want to change query yielding so that it doesn't use the Locker::LockSnapshot interface, but rather uses the db_raii.h helpers directly on yield recovery.

Assignee: David Storch
Reporter: David Storch
Participants: David Storch, Geert Bosch, Githook User
Votes: 0
Watchers: 4

Created: May 17 2019 05:41:08 PM UTC
Updated: Oct 29 2023 10:20:54 PM UTC
Resolved: Jun 26 2019 04:26:46 PM UTC
Confidence Status Last Update: 25/Jun/19 1:48 PM
