Data race involving capped collection truncation and PlanExecutor kill notifications


    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.5.7
    • Affects Version/s: None
    • Component/s: Querying, Storage
    • Fully Compatible
    • ALL
    • Query 2017-05-08
    • 3

      There are several events which can cause all active query plan executors to be marked as killed:

      • Collection drop.
      • Database drop.
      • Index drop.
      • Capped collection truncation.

      Generally, these events acquire a MODE_X lock on the collection, which means that any active queries must have yielded all of their locks before the event can proceed. After obtaining exclusive access to the collection, we iterate over the list of registered PlanExecutors and mark them as killed:

      https://github.com/mongodb/mongo/blob/r3.5.4/src/mongo/db/catalog/cursor_manager.cpp#L333-L339

      This writes to PlanExecutor::_killReason. Whenever a PlanExecutor is used, it first consults PlanExecutor::_killReason. If the kill reason is set, an error is propagated to the caller. This means that the client will receive the appropriate error if, for example, the query's collection is dropped during its execution.
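
The mark-then-check pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual MongoDB code: `SketchExecutor`, `checkKilled()`, and `killAll()` are hypothetical names, and the use of `std::optional<std::string>` for the kill reason is an assumption. The sketch is only safe if the caller of `killAll()` has exclusive access, as the MODE_X paths do.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a plan executor; not the real mongo::PlanExecutor.
class SketchExecutor {
public:
    // Called by the thread performing the drop. Assumes the caller holds
    // exclusive access, so no other thread is touching this executor.
    void markAsKilled(std::string reason) {
        assert(!reason.empty());
        _killReason = std::move(reason);
    }

    // Consulted before the executor does any work. Returns the kill reason
    // if the executor was killed, or std::nullopt if work may proceed.
    std::optional<std::string> checkKilled() const {
        return _killReason;
    }

private:
    // Unsynchronized on purpose: correctness relies on external locking.
    std::optional<std::string> _killReason;
};

// Hypothetical registry walk: mark every registered executor as killed.
void killAll(const std::vector<SketchExecutor*>& executors, const std::string& reason) {
    for (SketchExecutor* exec : executors) {
        exec->markAsKilled(reason);
    }
}
```

A caller would check `checkKilled()` before each unit of work and propagate the returned reason to the client as an error.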

      The path for capped collection truncation, however, only requires a MODE_IX lock:

      https://github.com/mongodb/mongo/blob/r3.5.4/src/mongo/db/catalog/collection.cpp#L923

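      Because MODE_IX is compatible with the intent locks held by active queries, those queries need not have yielded when the truncation runs. This means that the thread calling Collection::cappedTruncateAfter() can be writing to PlanExecutor::_killReason at the same time that the PlanExecutor is reading it!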
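
One possible way to make the concurrent read and write safe is to guard the kill reason with a mutex. The sketch below is an assumption for illustration only and is not necessarily the fix adopted for this ticket: `GuardedExecutor` and its member names are hypothetical, and the first-kill-wins behavior is a design choice of the sketch.

```cpp
#include <cassert>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical executor whose kill reason may be written by one thread
// while another thread reads it; not the real mongo::PlanExecutor.
class GuardedExecutor {
public:
    // May be called by a thread that holds only an intent lock, so the
    // write is serialized against readers through _mutex.
    void markAsKilled(std::string reason) {
        assert(!reason.empty());
        std::lock_guard<std::mutex> lk(_mutex);
        // First kill wins (sketch's choice): keep the original reason.
        if (!_killReason) {
            _killReason = std::move(reason);
        }
    }

    // Returns a copy taken under the mutex, so the caller never observes
    // a partially written string.
    std::optional<std::string> killReason() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _killReason;
    }

private:
    mutable std::mutex _mutex;
    std::optional<std::string> _killReason;
};
```

With this shape, a truncating thread can call `markAsKilled()` while the query thread calls `killReason()`, and both accesses are ordered by the mutex rather than racing on the same memory.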

            Assignee: David Storch
            Reporter: David Storch