[SERVER-34554] Database drops do not have a total ordering in a change stream Created: 18/Apr/18 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Charlie Swanson | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | change-streams-improvements, query-44-grooming | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Query Execution |
| Operating System: | ALL |
| Participants: | |
| Description |
|
In a sharded cluster it may be possible for two database drops to happen on two different shards at the same time. Since a database does not have a UUID or a documentKey, the resume token for the drop will simply contain a timestamp. This means that it is possible for two distinct changes to have the same resume token. A database drop will invalidate the stream, so this likely isn't an issue, but if we allow an actual notification of this event without an invalidation, then a client who tries to resume with one of the two identical resume tokens might see a drop twice or not at all. |
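The ambiguity described above can be sketched with a toy model. This is not MongoDB's actual resume token format or resume logic: `Event`, `resume_token`, and `resume` are hypothetical names, the cluster time is reduced to a plain integer, and the database names are made up. It only illustrates why a token made of a timestamp alone cannot distinguish two simultaneous drops.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    cluster_time: int  # stand-in for the cluster timestamp of the operation
    shard: str         # shard that produced the event (NOT part of the token)
    op: str            # description of the change


def resume_token(event: Event) -> Tuple[int]:
    # Assumption: a database drop has no UUID or documentKey,
    # so the timestamp is the only component available for the token.
    return (event.cluster_time,)


def resume(stream: List[Event], token: Tuple[int], strict: bool) -> List[Event]:
    # Two plausible (hypothetical) resume policies:
    #   strict=True  -> return only events whose token sorts after `token`
    #   strict=False -> return events whose token is equal to or after `token`
    if strict:
        return [e for e in stream if resume_token(e) > token]
    return [e for e in stream if resume_token(e) >= token]


# Two drops land on different shards at the same cluster time.
events = [
    Event(100, "shardA", "dropDatabase db1"),
    Event(100, "shardB", "dropDatabase db2"),
    Event(101, "shardA", "insert"),
]

# The client has processed only the first drop and saved its token.
token = resume_token(events[0])

# Strict policy: the second drop is skipped, so it is never seen.
missed = resume(events, token, strict=True)

# Inclusive policy: the first drop is delivered again, so it is seen twice.
replayed = resume(events, token, strict=False)
```

Under either policy the client cannot resume exactly after the first drop, because nothing in the token identifies which of the two tied events it has already processed.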
| Comments |
| Comment by Nicholas Zolnierz [ 28/Jun/18 ] |
|
Bumping out of the Metadata Notifications epic, but planning to circle back as part of |
| Comment by Spencer Brody (Inactive) [ 25/Apr/18 ] |
|
Also is it weird that now a drop of a sharded collection can give a notification for each shard? Should we be de-duping them somehow? |
| Comment by Charlie Swanson [ 25/Apr/18 ] |
|
Yes, they each represent the same event, so the impact is probably very low if it exists at all. But it's still technically a gap in the global ordering. |
| Comment by Asya Kamsky [ 25/Apr/18 ] |
|
But those represent the exact same event, right? (collection drop globally which just happened to have been executed on every shard, is that right?) |
| Comment by Charlie Swanson [ 25/Apr/18 ] |
|
spencer also points out that this is a problem for drops of a sharded collection, for which there will be multiple entries all with the same UUID, and potentially the same timestamp. |
| Comment by Asya Kamsky [ 23/Apr/18 ] |
|
Right, it also seems to be a non-problem, since before dropping a database all collection drops will show up in the stream. So it reduces the importance of guaranteeing that a database drop is seen exactly once. |
| Comment by Charlie Swanson [ 18/Apr/18 ] |
|
cc schwerin and spencer: per our in-person discussion, we believe that this might not be a problem in practice because the config servers are responsible for driving a database drop, so there may be an imposed ordering that guarantees no two drops can have the same cluster time. This needs further investigation. |