Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Aggregation Framework
Labels: change-streams-improvements
Assigned Teams: Query Execution
Case:
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

If a change stream is resumed in a sharded cluster, all shards that have no oplog entries for the relevant namespace (either because they never had any data for the collection, or because they have not taken any writes recently) will scan through their oplogs until reaching EOF before returning their first getMore batch to mongoS. If the oplogs are large and the requested resume point is early, then this may take so long that the cursors on shards which do have oplog entries time out, effectively making the stream unresumable. This problem could manifest even after SERVER-30784 is complete, although the likelihood of such an occurrence will be significantly reduced.

Addressing this may require changing the semantics of tailable-awaitData getMores on mongoD. Currently, if there are no matching entries, getMore will scan to the end of the oplog regardless of how long it takes. We could instead have it proactively break and return after the specified awaitData period expires, much like a non-throwing maxTimeMS deadline. This would also be consistent with mongoS' current behaviour.
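
The difference between the two getMore behaviours can be sketched roughly as follows. This is an illustrative model only, not server code: `scan_oplog`, `ScanResult`, `ticking_clock`, and the in-memory list standing in for the oplog are all hypothetical names invented for the example, and returning on the first match is a simplification of how a real batch is built. Passing no deadline models the current scan-to-EOF behaviour; passing `await_data_secs` models the proposed soft deadline, which returns an empty batch rather than raising an error.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class ScanResult:
    batch: List[dict]   # matching entries found (possibly empty)
    scanned: int        # number of entries examined
    reached_eof: bool   # True only if the scan ran off the end of the oplog


def scan_oplog(
    entries: Iterable[dict],
    matches: Callable[[dict], bool],
    await_data_secs: Optional[float] = None,
    clock: Callable[[], float] = time.monotonic,
) -> ScanResult:
    """Hypothetical model of one getMore on a shard.

    await_data_secs=None: keep scanning until a match or EOF (current behaviour).
    await_data_secs=N:    stop after N seconds and return whatever was found,
                          without raising (the proposed non-throwing deadline).
    """
    deadline = None if await_data_secs is None else clock() + await_data_secs
    scanned = 0
    for entry in entries:
        if deadline is not None and clock() >= deadline:
            # Soft deadline hit: return an empty batch instead of an error,
            # so the caller can issue another getMore and resume scanning.
            return ScanResult([], scanned, reached_eof=False)
        scanned += 1
        if matches(entry):
            # Simplification: return as soon as one matching entry is found.
            return ScanResult([entry], scanned, reached_eof=False)
    return ScanResult([], scanned, reached_eof=True)


def ticking_clock() -> Callable[[], float]:
    """Deterministic fake clock for the demo: advances 1 second per call."""
    now = 0.0

    def tick() -> float:
        nonlocal now
        now += 1.0
        return now

    return tick


if __name__ == "__main__":
    # A shard whose oplog holds no entries for the watched namespace.
    oplog = [{"ns": "db.other"} for _ in range(100)]
    is_match = lambda e: e["ns"] == "db.coll"

    print(scan_oplog(oplog, is_match))  # no deadline: scans all the way to EOF
    print(scan_oplog(oplog, is_match, await_data_secs=10, clock=ticking_clock()))
```

In this model, the second call gives control back to the caller long before EOF, which is the property that would let mongoS keep the cursors on the other shards alive while the idle shards continue scanning across successive getMores.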

Assignee: [DO NOT USE] Backlog - Query Execution
Reporter: Bernard Gorman
Participants: [DO NOT USE] Backlog - Query Execution, Bernard Gorman
Votes: 0
Watchers: 14

Created: Jun 01 2020 08:00:50 PM UTC
Updated: Dec 06 2022 02:25:15 AM UTC
