Core Server / SERVER-56763

Validate collection epoch when not holding a DB lock for $merge


    Details

    • Type: Task
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0.3, 5.1.0-rc0
    • Component/s: None
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Backport Requested:
      v5.0
    • Sprint:
      Query Optimization 2021-06-14, Query Optimization 2021-06-28, Query Optimization 2021-07-12, Query Optimization 2021-07-26
    • Linked BF Score:
      43

      Description

      While fixing SERVER-54507 we discussed a possible future optimization in preparing to execute $merge. The idea is to defer the targetCollectionVersion epoch check until the DB lock is no longer held, or to call

      ShardServerProcessInterface::checkRoutingInfoEpochOrThrow()

      outside of a DB lock, right before the query executes on the leaf nodes of the merge topology.

      Why would this be better? 

      The short answer:
      Because this shard is serving as a router, not as a shard, so in theory the epoch check at this point doesn't matter anyway.
      The long answer:
      Because the current check only ensures that this MongoD (acting as a router) knows at least as much as the router that sent the merge command. In the grand scheme of things, however, both can be wrong and agree on the same wrong thing. It is more correct to have the leaf nodes of the merge topology perform the epoch check, or to call

      ShardServerProcessInterface::checkRoutingInfoEpochOrThrow()

      which avoids the case where the leaf nodes that perform the data reads know about a dropped collection but the mongos does not at the time it sends the targetCollectionVersion to the mongod acting as the router.
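      The shape of such a check can be sketched as below. This is a simplified stand-in, not the server's actual implementation: the epoch type and the error are hypothetical placeholders for MongoDB's OID-based collection epoch and its StaleEpoch error, which would normally trigger a routing-table refresh and retry.

      ```cpp
      #include <stdexcept>
      #include <string>

      // Hypothetical stand-in for the OID-based collection epoch.
      using CollectionEpoch = std::string;

      // Compares the epoch this node has cached against the epoch the router
      // attached to the $merge command, and throws if they diverge. The point
      // of the proposed change is to run this *without* holding the DB lock,
      // immediately before the query executes on the leaf nodes.
      void checkEpochOrThrow(const CollectionEpoch& cachedEpoch,
                             const CollectionEpoch& routerEpoch) {
          if (cachedEpoch != routerEpoch) {
              // In the server this would be a StaleEpoch error, prompting the
              // caller to refresh its routing info and retry.
              throw std::runtime_error(
                  "StaleEpoch: collection was dropped or recreated");
          }
      }
      ```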

      Note that there could still be a pathological case where the merge topology has two leaf nodes and one is reached much earlier than the other: the first could process petabytes of data even though the collection has already been dropped on the second leaf. The only theoretical way around this is probably to open cursors on all shards that will participate in the merge plan, but that may well be infeasible.


              People

              Assignee:
              nicholas.zolnierz Nicholas Zolnierz
              Reporter:
              eric.cox Eric Cox
              Votes:
              0
              Watchers:
              6
