Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-32349

Resuming a sharded change stream when there are multiple changes with the same timestamp may be impossible

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.6.0
    • Fix Version/s: 3.6.3, 3.7.2
    • Component/s: Aggregation Framework
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v3.6
    • Sprint:
      Query 2018-01-15, Query 2018-01-29

      Description

      When resuming a change stream, we first need to make sure that the oplog has enough history to allow a resume. To do this, we make sure that the first thing in the stream is the resume token we expect. This works correctly in an unsharded environment, because we will start the oplog query with a ts: {$gte: <resume ts>} query, and each change's timestamp will be unique, so the first thing that comes back should be the change we're looking to resume after (if it's still possible to resume).

      Things are a bit different in a sharded scenario. To start, we don't know which shard is going to have the resume token, so we need to perform the check against the first document after merging the results from each shard. Secondly, the timestamp of each change is not guaranteed to be unique in a sharded cluster, two changes can happen 'simultaneously' (with the same timestamp) on two different shards. In situations like this, it's possible and legal that the first thing in the change stream will not be the resume token we're looking for, but rather a change that preceded the change we're resuming after and happened to have the same timestamp.

      To account for this, the logic that checks if we are able to resume (inside DocumentSourceEnsureResumeTokenPresent::getNext()) should allow arbitrarily many changes to occur before the resume token iff they have the same timestamp as the resume token. As soon as we see the resume token itself we know it's possible to resume, or if we see a change with a higher timestamp we know it's not possible to resume.

        Attachments

          Activity

            People

            Assignee:
            martin.neupauer Martin Neupauer
            Reporter:
            charlie.swanson Charlie Swanson
            Participants:
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved: