[SERVER-31685] Sharded change streams can miss notifications Created: 23/Oct/17 Updated: 30/Oct/23 Resolved: 14/Nov/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Replication, Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.0-rc4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Matthew Russotto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Repl 2017-11-13 |
| Participants: |
| Description |
|
When establishing a new change stream on a sharded collections, notifications can be missed until the stream has processed at least 1 notification from every shard. This is because when a shard receives a changeStream request, it marks the start point of its local stream as the current replica set majority commit point. |
| Comments |
| Comment by Githook User [ 13/Nov/17 ] |
|
Author: {'name': 'Matthew Russotto', 'username': 'mtrussotto', 'email': 'matthew.russotto@10gen.com'}Message: |
| Comment by Charlie Swanson [ 24/Oct/17 ] |
|
spencer, I realized the first won't quite work out of the box, since you can open a change stream before the collection exists, but I think we could probably add an internal-only parameter to start at a ts, without a UUID or anything like that? We really just need all the oplog scans to agree on where to start. |
| Comment by Spencer Brody (Inactive) [ 23/Oct/17 ] |
|
The alternative is to change the ARM to throw out all results before the highest first ResumeToken received from any shard, but that seems likely to be more complex. |
| Comment by Spencer Brody (Inactive) [ 23/Oct/17 ] |
|
To fix this we can make all change stream cursors sent to shards on a sharded collection always use resumeAfter. The resume token can be artificially constructed based on the mongos' current cluster time and the collection UUID known from the sharding metadata. This may require opting-out of any validation checks that ensure that there is an exact match on the ResumeToken |