[SERVER-31685] Sharded change streams can miss notifications Created: 23/Oct/17  Updated: 30/Oct/23  Resolved: 14/Nov/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework, Replication, Sharding
Affects Version/s: None
Fix Version/s: 3.6.0-rc4

Type: Bug Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Matthew Russotto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2017-11-13
Participants:

 Description   

When establishing a new change stream on a sharded collections, notifications can be missed until the stream has processed at least 1 notification from every shard. This is because when a shard receives a changeStream request, it marks the start point of its local stream as the current replica set majority commit point.
So if the cursor on shard A gets established at time 5, shard B takes a write to the collection at time 6, then the change stream cursor gets established on shard B at time 7, now the sharded stream will return notifications 5 and 7, but will miss notification 6.



 Comments   
Comment by Githook User [ 13/Nov/17 ]

Author:

{'name': 'Matthew Russotto', 'username': 'mtrussotto', 'email': 'matthew.russotto@10gen.com'}

Message: SERVER-31685 Sharded change streams can miss notifications
Branch: master
https://github.com/mongodb/mongo/commit/634857247b51b1d8ce68530636579480c67f4d19

Comment by Charlie Swanson [ 24/Oct/17 ]

spencer, I realized the first won't quite work out of the box, since you can open a change stream before the collection exists, but I think we could probably add an internal-only parameter to start at a ts, without a UUID or anything like that? We really just need all the oplog scans to agree on where to start.

Comment by Spencer Brody (Inactive) [ 23/Oct/17 ]

The alternative is to change the ARM to throw out all results before the highest first ResumeToken received from any shard, but that seems likely to be more complex.

Comment by Spencer Brody (Inactive) [ 23/Oct/17 ]

To fix this we can make all change stream cursors sent to shards on a sharded collection always use resumeAfter. The resume token can be artificially constructed based on the mongos' current cluster time and the collection UUID known from the sharding metadata.

This may require opting-out of any validation checks that ensure that there is an exact match on the ResumeToken

Generated at Thu Feb 08 04:27:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.