Core Server / SERVER-39410

Re-enable batching in DSCursor for change stream cursors

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.0.7, 4.1.9
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • None
    • Fully Compatible
    • v4.0
    • Query 2019-02-25
    • 0

      Previously, for needsMerge:false streams on a single replica set, we permitted our results to be batched at the DocumentSourceCursor level; in other words, we would read up to 4MB of data into the cursor before starting to pull those oplog events through the pipeline. We could do this because we did not need to track the oplog timestamp for single-replica-set streams. By contrast, for needsMerge:true streams that were producing data to be merged on mongoS, we had to include the $_internalLatestOplogTimestamp field. We therefore could not allow any batching, because pulling in 4MB of oplog events before starting to process the first of them would cause the latest oplog timestamp to jump 4MB ahead of the event that was actually being processed. Instead, we force the oplog scan to yield after every document, so the batch size is effectively 1 and the latest oplog timestamp stays in sync with the event being processed.
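
      The timestamp drift described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the server's code: OplogEvent, scan, and max_batch_bytes are hypothetical names, timestamps are plain integers, and event sizes are arbitrary byte counts standing in for the 4MB limit.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class OplogEvent:
    ts: int          # stand-in for the oplog timestamp (hypothetical)
    size_bytes: int  # stand-in for the size of the event


def scan(events: List[OplogEvent], max_batch_bytes: int) -> Iterator[Tuple[int, int]]:
    """Yield (event_ts, latest_scanned_ts) for each event leaving the cursor.

    The simulated cursor buffers events up to max_batch_bytes before handing
    any of them to the pipeline, and tracks only the timestamp of the last
    event it has scanned.
    """
    latest_scanned_ts = 0
    i = 0
    while i < len(events):
        batch: List[OplogEvent] = []
        batch_bytes = 0
        # Always take at least one event, then keep filling while it fits.
        while i < len(events) and (
            not batch or batch_bytes + events[i].size_bytes <= max_batch_bytes
        ):
            batch.append(events[i])
            batch_bytes += events[i].size_bytes
            latest_scanned_ts = events[i].ts
            i += 1
        # Only now are the buffered events pulled through the pipeline.
        for event in batch:
            yield event.ts, latest_scanned_ts


def max_drift(pairs: List[Tuple[int, int]]) -> int:
    """Largest gap between the scan position and the event being processed."""
    return max(latest - ts for ts, latest in pairs)


events = [OplogEvent(ts=t, size_bytes=1) for t in range(1, 7)]
batched = list(scan(events, max_batch_bytes=3))
unbatched = list(scan(events, max_batch_bytes=1))
```

      With a batch limit of 3, the first event is processed while the scan position already points at the third event; with a limit of 1, the two always agree, which is the "effectively 1" behaviour described above.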

      For the change stream high water mark project, however, streams on a single replica set must also track the latest oplog timestamp; we need it in order to generate a high water mark token. And so this change within the SERVER-38408 commit extended the "no batching" rule to single-replica-set streams. The upshot is that the stream is slower to return results (because we yield after every document) and its output is more variable (because we are no longer guaranteed to return all available results up to 4MB on each getMore). This is also the underlying cause of SERVER-38942.

      We should allow both sharded and unsharded change stream cursors to batch their results, in order to improve the latency and consistency with which change stream events are provided to the client.
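
      One conceivable way to reconcile batching with timestamp tracking is to advance the reported timestamp as each buffered event is dequeued, rather than as it is scanned. The sketch below is purely illustrative and makes no claim about how the server implements this ticket: BatchingCursor, latest_reported_ts, and refills are hypothetical names, timestamps are plain integers, and the batch limit is a count of events rather than 4MB of data.

```python
from collections import deque
from typing import Deque, Iterable, Optional


class BatchingCursor:
    """Hypothetical cursor that buffers events but reports a per-event timestamp."""

    def __init__(self, event_timestamps: Iterable[int], batch_size: int) -> None:
        self._source = iter(event_timestamps)
        self._batch_size = batch_size
        self._buffer: Deque[int] = deque()
        self.latest_reported_ts = 0  # what the stream would expose downstream
        self.refills = 0             # number of scans that loaded at least one event

    def _refill(self) -> None:
        # Scan ahead and buffer up to batch_size events in one go.
        for ts in self._source:
            self._buffer.append(ts)
            if len(self._buffer) >= self._batch_size:
                break
        if self._buffer:
            self.refills += 1

    def next(self) -> Optional[int]:
        """Return the next event timestamp, or None when the source is exhausted."""
        if not self._buffer:
            self._refill()
        if not self._buffer:
            return None
        ts = self._buffer.popleft()
        # Advance the reported timestamp only when the event is handed out,
        # so it never runs ahead of the event being processed.
        self.latest_reported_ts = ts
        return ts


cursor = BatchingCursor(range(1, 7), batch_size=3)
seen = []
while True:
    ts = cursor.next()
    if ts is None:
        break
    seen.append((ts, cursor.latest_reported_ts))
```

      In this toy model six events are served from two scans instead of six, while the reported timestamp still matches each event as it is processed.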

            Assignee:
            bernard.gorman@mongodb.com Bernard Gorman
            Reporter:
            bernard.gorman@mongodb.com Bernard Gorman
