[SERVER-47810] Resume token returned by mongoS can be earlier than user-specified resume point Created: 27/Apr/20  Updated: 29/Oct/23  Resolved: 22/May/20

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 4.4.0-rc8, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Bernard Gorman Assignee: Bernard Gorman
Resolution: Fixed Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Query 2020-05-04, Query 2020-05-18, Query 2020-06-01
Participants:

 Description   

In cases where a resume token or starting time is specified when opening a change stream, the postBatchResumeToken (PBRT) returned with the first batch must always be at least equal to the specified resume point, even if the batch itself is empty. However, if the user opens a change stream on mongoS with a startAtOperationTime in the future (which is perfectly legal), the stream returns high-water-mark PBRTs that reflect the current clusterTime rather than waiting until the clusterTime exceeds startAtOperationTime.
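A minimal sketch of the invariant described above, written as a plain Python model rather than server code. `ClusterTime` and `pbrt_respects_resume_point` are hypothetical names introduced for illustration only; cluster timestamps are modeled as (seconds, increment) pairs that compare lexicographically.

```python
from typing import NamedTuple


class ClusterTime(NamedTuple):
    """Hypothetical stand-in for a cluster timestamp: (seconds, increment)."""
    seconds: int
    increment: int


def pbrt_respects_resume_point(pbrt: ClusterTime, resume_point: ClusterTime) -> bool:
    """Return True if the PBRT is at least equal to the requested resume point."""
    # NamedTuple instances compare like tuples: seconds first, then increment.
    return pbrt >= resume_point


# The stream is opened with a startAtOperationTime that lies in the future.
start_at = ClusterTime(2000, 1)
current_cluster_time = ClusterTime(1000, 5)

# Reported behavior: the first PBRT reflects the current clusterTime.
print(pbrt_respects_resume_point(current_cluster_time, start_at))  # False

# Expected behavior: the first PBRT is no earlier than startAtOperationTime.
print(pbrt_respects_resume_point(start_at, start_at))  # True
```

The check only restates the requirement from the description; it says nothing about how the server produces the token.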



 Comments   
Comment by Githook User [ 29/May/20 ]

Author: Bernard Gorman <bernard.gorman@gmail.com> (gormanb)

Message: SERVER-47810 Make $changeStream shard-monitor cursor ineligible to contribute high water marks

(cherry picked from commit 35756d5b0fe1bc810de1d740950b2fa41e449bdd)
Branch: v4.4
https://github.com/mongodb/mongo/commit/3da789cfbfd211ecc26ed780db14695ee54f14a9

Comment by Githook User [ 22/May/20 ]

Author: Bernard Gorman <bernard.gorman@gmail.com> (gormanb)

Message: SERVER-47810 Make $changeStream shard-monitor cursor ineligible to contribute high water marks
Branch: master
https://github.com/mongodb/mongo/commit/35756d5b0fe1bc810de1d740950b2fa41e449bdd

Comment by Bernard Gorman [ 13/May/20 ]

charlie.swanson: yep, this looks accurate to me.

Comment by Charlie Swanson [ 13/May/20 ]

As part of working on this, we discovered another special case for the change stream's cursor on the config server. That cursor may occasionally return "addShard" events, which the change stream uses to keep the stream open on all shards. The event is swallowed internally and not returned to the user. Such an event should be prevented from becoming the high water mark. We decided this because:

1) The resume token for this event is a "real" token, meaning it's not a manufactured high water mark token and we can expect to find it in an oplog. Our logic for resuming the stream will expect to see the event again to make sure we can resume. 

2) Given the current order of checking for resume and checking for addShard, the resume token check would never see the event and would fail.

3) If we instead flipped that order, then to resume successfully you would need to be sure to read that event from the config servers. This would cause problems because:

   (a) The window of history on the config server may be small and ideally shouldn't be a factor in whether you can successfully resume a stream.

   (b) The cursor we open on the config servers is usually opened at a "recent" clusterTime, ignoring the resume token. The stream is only there to detect new shards, so it otherwise doesn't need to go back and read old history. Determining the correct time to open the cursor on the config servers is already difficult to get right; we don't want to complicate it further.
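To illustrate the reasoning above, here is a rough Python model of why the config-server cursor should not contribute high water marks. It is not the server implementation: `RemoteCursor`, `is_shard_monitor` and `merged_high_water_mark` are hypothetical names, and the model assumes the merged high water mark is the minimum across the contributing cursors.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical timestamp model: (seconds, increment), compared lexicographically.
Timestamp = Tuple[int, int]


@dataclass
class RemoteCursor:
    name: str
    high_water_mark: Timestamp
    # True for the config-server cursor that only watches for new shards.
    is_shard_monitor: bool = False


def merged_high_water_mark(
    cursors: List[RemoteCursor], exclude_shard_monitor: bool = True
) -> Optional[Timestamp]:
    """Return the smallest high water mark among the eligible cursors."""
    eligible = [
        c.high_water_mark
        for c in cursors
        if not (exclude_shard_monitor and c.is_shard_monitor)
    ]
    # Assumption: the merged value can only advance as far as the slowest cursor.
    return min(eligible) if eligible else None


cursors = [
    RemoteCursor("shard0", (2000, 1)),
    RemoteCursor("shard1", (2000, 1)),
    # Opened at a "recent" clusterTime, ignoring the user's resume point.
    RemoteCursor("config", (1000, 5), is_shard_monitor=True),
]

print(merged_high_water_mark(cursors, exclude_shard_monitor=False))  # (1000, 5)
print(merged_high_water_mark(cursors))  # (2000, 1)
```

With the shard-monitor cursor included, the merged value falls back to the config server's recent clusterTime; with it excluded, only the shard cursors determine the value.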

 

bernard.gorman, does the above accurately reflect our conversation?
