Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.0.7, 4.1.9
Affects Version/s: None
Component/s: Aggregation Framework
Labels: None

Backwards Compatibility: Fully Compatible
Backport Requested: v4.0
Sprint: Query 2019-02-25
Linked BF Score: 0
Confidence Status: None
Work Order: 3
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Previously, for needsMerge:false streams on a single replica set, we permitted results to be batched at the DocumentSourceCursor level; in other words, we would pull up to 4MB of data into the cursor before starting to pass those oplog events through the pipeline. We could do this because we did not need to track the oplog timestamp for single-replica-set streams. By contrast, for needsMerge:true streams that were producing data to be merged on mongoS, we had to include the $_internalLatestOplogTimestamp field. We therefore could not allow any batching, because pulling in 4MB of oplog events before starting to process the first of them would cause the latest oplog timestamp to jump up to 4MB ahead of the event that was actually being processed. Instead, we force the oplog scan to yield after every document, so the batch size is effectively 1 and the latest oplog timestamp stays in sync with the event being processed.
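The drift described above can be illustrated with a toy model. This is only a sketch, not server code: `OplogEvent` and `scan` are hypothetical names, timestamps are plain integers, and the only assumption is that the scan buffers events up to a byte budget before any of them is processed.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class OplogEvent:
    ts: int    # stand-in for an oplog timestamp
    size: int  # document size in bytes


def scan(events: List[OplogEvent], batch_bytes: int) -> Iterator[Tuple[int, int]]:
    """Yield (event_ts, latest_scanned_ts) for each event leaving the cursor.

    Events are buffered until the next one would exceed batch_bytes; a batch
    always holds at least one event. latest_scanned_ts is how far the scan
    had advanced when the event was handed to the pipeline.
    """
    i = 0
    while i < len(events):
        batch = [events[i]]
        used = events[i].size
        i += 1
        while i < len(events) and used + events[i].size <= batch_bytes:
            batch.append(events[i])
            used += events[i].size
            i += 1
        latest_scanned_ts = batch[-1].ts
        for ev in batch:
            yield ev.ts, latest_scanned_ts


MB = 1024 * 1024
events = [OplogEvent(ts=t, size=MB) for t in range(1, 9)]  # eight 1MB events

batched = list(scan(events, batch_bytes=4 * MB))  # up to 4MB buffered at once
unbatched = list(scan(events, batch_bytes=1))     # effectively batch size 1

print(batched[0])    # first event processed while the scan is already ahead
print(unbatched[0])  # scan position matches the event being processed
```

In the batched run the first event is processed while the scan position is already three events ahead; with a batch size of 1 the two values always agree, which is the property the needsMerge:true path relies on.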

But for the change stream high water mark project, streams on a single replica set must also track the latest oplog timestamp; we need it in order to generate a high water mark token. And so this change within the SERVER-38408 commit extends the "no batching" rule to single-replica-set streams. The upshot of this is that the stream is slower to return results (because we're yielding after every document) and more variable in how many results it returns (because we are no longer guaranteed to return all available results up to 4MB on each getMore). This is also the underlying cause of SERVER-38942.

We should allow both sharded and unsharded change stream cursors to batch their results, in order to improve the latency and consistency with which change stream events are provided to the client.
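One way to picture batching that still keeps the reported timestamp in sync is for the cursor to advance its "latest" timestamp only as each buffered event is handed out, rather than when the batch is filled. The sketch below is purely illustrative and is not necessarily how the fix was implemented: `BatchingCursor` and its members are hypothetical, and the batch limit is an event count rather than a byte size for simplicity.

```python
from collections import deque
from typing import Deque, List, Optional


class BatchingCursor:
    """Hypothetical cursor stage: batches reads, reports a per-event timestamp."""

    def __init__(self, oplog_ts: List[int], batch_size: int) -> None:
        self._oplog = oplog_ts          # toy oplog: just a list of timestamps
        self._pos = 0                   # how far the underlying scan has read
        self._batch_size = batch_size
        self._buffer: Deque[int] = deque()
        self.latest_ts: Optional[int] = None

    def _fill(self) -> None:
        # Read up to batch_size events in one go (the batching step).
        end = min(self._pos + self._batch_size, len(self._oplog))
        self._buffer.extend(self._oplog[self._pos:end])
        self._pos = end

    def next(self) -> Optional[int]:
        if not self._buffer:
            self._fill()
        if not self._buffer:
            return None  # oplog exhausted
        ts = self._buffer.popleft()
        # Advance the reported timestamp only as far as the event handed out,
        # not as far as the scan has buffered.
        self.latest_ts = ts
        return ts


cursor = BatchingCursor([10, 20, 30, 40, 50], batch_size=4)
first = cursor.next()
print(first, cursor.latest_ts)  # the two values agree even though 4 events were read
```

The point of the design is that the buffered read and the reported timestamp are decoupled: the scan can run ahead for throughput while the value exposed downstream tracks the event actually being processed.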

Is duplicated by: SERVER-38942 Improve robustness of postBatchResumeToken integration tests (Closed)

Assignee: Bernard Gorman
Reporter: Bernard Gorman
Participants: Bernard Gorman, Githook User
Votes: 0
Watchers: 2

Created: Feb 07 2019 02:26:16 PM UTC
Updated: Oct 29 2023 10:24:23 PM UTC
Resolved: Feb 21 2019 02:26:22 AM UTC
