Core Server / SERVER-40094

Do not prematurely reject resume attempt in DSShardCheckResumability

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.7, 4.1.10
    • Component/s: Aggregation Framework
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.0
    • Sprint: Query 2019-03-25
    • Linked BF Score: 50

      Description

      In SERVER-38413, we changed DSShardCheckResumability so that it consumes any events that share the clusterTime of the specified resume token but sort before it. To do so, it calls compareAgainstClientResumeToken, which was previously called only by the DSEnsureResumeTokenPresent stage.

      However, the applyOpsIndex logic in compareAgainstClientResumeToken still assumes that it is only called from DSEnsureResumeTokenPresent. As a result, when a resume token with clusterTime T is sent to a shard whose oplog entry at time T is a multi-document transaction, the shard may incorrectly uassert in the applyOpsIndex check when it observes that the local event's applyOpsIndex is greater than the resume token's. We may also return an inaccurate result from the UUID comparison if we are on a merging shard and we observe an event with the same clusterTime as the resume token but a different UUID.
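
      The intended shard-side behaviour can be illustrated with a minimal sketch. This is not the server's C++ code and not necessarily how the fix was implemented: the names Token, Verdict and shard_check are hypothetical, and the sort order used here (clusterTime, then applyOpsIndex, then UUID) is a simplifying assumption. The point it shows is that, on a shard, an event at the same clusterTime with a larger applyOpsIndex or a different UUID simply sorts after the token, and is not by itself a reason to reject the resume attempt.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    # Hypothetical outcomes of a shard-side resumability comparison.
    CONSUME = "event sorts before the token; swallow it and keep scanning"
    FOUND = "event is exactly the token"
    PASSED = "event sorts after the token; stop scanning without raising"


@dataclass(frozen=True)
class Token:
    # Simplified stand-in for a resume token (assumed fields only).
    cluster_time: int
    apply_ops_index: int
    uuid: str

    def sort_key(self):
        # Assumed ordering: clusterTime, then applyOpsIndex, then UUID.
        return (self.cluster_time, self.apply_ops_index, self.uuid)


def shard_check(event: Token, client: Token) -> Verdict:
    """Three-way comparison that never raises on a same-clusterTime mismatch."""
    if event.sort_key() < client.sort_key():
        return Verdict.CONSUME
    if event.sort_key() == client.sort_key():
        return Verdict.FOUND
    return Verdict.PASSED


# The client's token points at index 0 of an entry at time 10; the local
# shard holds a transaction entry at the same time and sees index 2.
client = Token(cluster_time=10, apply_ops_index=0, uuid="a")
print(shard_check(Token(10, 2, "b"), client).name)
```

      In this sketch, only a stage that must find the exact token (the role the description assigns to DSEnsureResumeTokenPresent) would treat a PASSED verdict as an error.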
