[SERVER-42232] Adding a new shard renders all preceding resume tokens invalid
Created: 15/Jul/19  Updated: 29/Oct/23  Resolved: 18/Jul/19

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 4.0.11, 4.2.0-rc4, 4.3.1

Type: Bug
Priority: Major - P3
Reporter: Bernard Gorman
Assignee: Bernard Gorman
Resolution: Fixed
Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
is related to SERVER-78321 MongoDB 6.0: Adding a new shard rende... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2, v4.0
Sprint: Query 2019-07-29

 Description   

In DocumentSourceShardCheckResumability, we verify that the first entry in each shard's oplog precedes the resume token in order to guarantee that the resumed stream does not skip any events. If we are resuming from a point in time before one of the shards in the cluster was added, then the first entry in that shard's oplog will always be later than the resume token, and will always fail this check. This renders the stream unresumable from any point before the shard was added.
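To illustrate the failure mode, here is a minimal Python sketch of the check as described above. It is not the server's C++ implementation: the names Timestamp, OplogEntry, and is_resumable_naive are hypothetical, and each shard's oplog is assumed to be a non-empty list ordered oldest-first.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True, order=True)
class Timestamp:
    """Simplified stand-in for an oplog timestamp: (seconds, increment)."""
    seconds: int
    increment: int


@dataclass(frozen=True)
class OplogEntry:
    """Hypothetical, pared-down oplog entry; only the timestamp matters here."""
    ts: Timestamp


def is_resumable_naive(oplog: List[OplogEntry], resume_ts: Timestamp) -> bool:
    """Return True only if the oldest oplog entry precedes the resume token.

    Assumes `oplog` is non-empty and sorted oldest-first.
    """
    return oplog[0].ts < resume_ts


# Shard A has existed since t=100; shard B was added to the cluster at t=500.
shard_a = [OplogEntry(Timestamp(100, 1)), OplogEntry(Timestamp(600, 1))]
shard_b = [OplogEntry(Timestamp(500, 1)), OplogEntry(Timestamp(600, 2))]

# The client tries to resume from t=300, before shard B was added.
resume_ts = Timestamp(300, 1)
results = {
    "shard_a": is_resumable_naive(shard_a, resume_ts),
    "shard_b": is_resumable_naive(shard_b, resume_ts),
}
```

In this sketch shard A passes and shard B fails, so the stream as a whole cannot resume from t=300, even though shard B has no events older than its first oplog entry.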



 Comments   
Comment by Githook User [ 19/Jul/19 ]

Author:

{'name': 'Bernard Gorman', 'username': 'gormanb', 'email': 'bernard.gorman@gmail.com'}

Message: SERVER-42232 Adding a new shard renders all preceding resume tokens invalid

(cherry picked from commit ffdb59938db0dfc8ec48e8b74df7a54d07b3a128)
Branch: v4.0
https://github.com/mongodb/mongo/commit/417d1a712e9f040d54beca8e4943edce218e9a8c

Comment by Githook User [ 19/Jul/19 ]

Author:

{'name': 'Bernard Gorman', 'username': 'gormanb', 'email': 'bernard.gorman@gmail.com'}

Message: SERVER-42232 Adding a new shard renders all preceding resume tokens invalid

(cherry picked from commit ffdb59938db0dfc8ec48e8b74df7a54d07b3a128)
Branch: v4.2
https://github.com/mongodb/mongo/commit/c32930ff6eccb53c5f999d292e33591d4b319f39

Comment by Githook User [ 18/Jul/19 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-42232 Adding a new shard renders all preceding resume tokens invalid
Branch: master
https://github.com/mongodb/mongo/commit/ffdb59938db0dfc8ec48e8b74df7a54d07b3a128

Comment by Bernard Gorman [ 16/Jul/19 ]

schwerin: yes, setting an initial timestamp to the dawn of time would also work - but as you say, that logic would need to be backported to at least release N-1 in case we initiate a set and then upgrade it. The "initiating set" entry will function exactly the same for this purpose, and it already exists in the same form in every release as far back as 2.0.0.

Comment by Andy Schwerin [ 16/Jul/19 ]

bernard.gorman, could this be fixed if replica sets were always initiated with a timestamp at the beginning of time, say Timestamp(1, 1)? That way, any change stream request that hit a recently added shard would see that the oplog on the new shard went back to the dawn of time.

Edit: The alternative solution, which I believe Bernard has in mind, is to treat the "replica set initiate" oplog entry as a sentinel whose semantic meaning is "there are no older writes than this one." That's probably a more flexible solution, since you could adopt it by upgrading binaries even if the shards involved had been created with older binaries.
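The sentinel idea can be sketched roughly as follows, in Python rather than the server's C++. This is an illustration under stated assumptions, not the committed fix: the types and function names are hypothetical, the initiate entry is modeled as a no-op (op "n") whose message is "initiating set", and each oplog is assumed to be non-empty and ordered oldest-first.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True, order=True)
class Timestamp:
    """Simplified stand-in for an oplog timestamp: (seconds, increment)."""
    seconds: int
    increment: int


@dataclass(frozen=True)
class OplogEntry:
    """Hypothetical, pared-down oplog entry."""
    ts: Timestamp
    op: str        # "n" for a no-op, "i" for an insert, and so on
    msg: str = ""  # message carried by a no-op entry (assumed layout)


def is_initiate_sentinel(entry: OplogEntry) -> bool:
    """Assumed shape of the entry written when a replica set is initiated."""
    return entry.op == "n" and entry.msg == "initiating set"


def is_resumable(oplog: List[OplogEntry], resume_ts: Timestamp) -> bool:
    """Resumable if the oldest entry precedes the token, or if the oldest
    entry is the initiate sentinel, meaning no older writes ever existed."""
    first = oplog[0]
    if first.ts < resume_ts:
        return True
    return is_initiate_sentinel(first)


resume_ts = Timestamp(300, 1)

# A shard added at t=500 whose oplog still begins with the initiate entry.
new_shard = [
    OplogEntry(Timestamp(500, 1), "n", "initiating set"),
    OplogEntry(Timestamp(600, 1), "i"),
]

# A shard whose oplog has rolled over: its oldest surviving entry is an
# ordinary write at t=500, so events before that may have been lost.
rolled_over_shard = [
    OplogEntry(Timestamp(500, 1), "i"),
    OplogEntry(Timestamp(600, 1), "i"),
]

new_shard_ok = is_resumable(new_shard, resume_ts)
rolled_over_ok = is_resumable(rolled_over_shard, resume_ts)
```

Here the newly added shard is accepted because its oldest entry is the sentinel, while the rolled-over shard is still rejected, which preserves the original guarantee that a resumed stream does not skip events.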

Comment by Charlie Swanson [ 16/Jul/19 ]

bernard.gorman assigning this to you as discussed.

Generated at Thu Feb 08 04:59:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.