[SERVER-31978] Add an invariant that DocumentSourceCloseCursor does not execute on a mongod for a sharded $changeStream Created: 15/Nov/17 Updated: 30/Oct/23 Resolved: 20/Nov/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Querying |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.1, 3.7.1 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | David Storch | Assignee: | David Storch |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v3.6 |
| Sprint: | Query 2017-12-04 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
DocumentSourceCloseCursor is part of the internal $changeStream machinery. It closes change stream cursors that have been invalidated by an event such as a collection drop or database drop. When a $changeStream runs in a sharded configuration, DocumentSourceCloseCursor should always run on mongos, because the mongos cursor manager is not prepared to correctly handle its child cursor being closed out from under it. Instead, the cursor should be closed by the DocumentSourceCloseCursor running on mongos. Cleanup of the mongos cursor should in turn cause the underlying cursors on the shards to be cleaned up. As of commit d4a526fdcf under |
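The rule described above can be illustrated with a toy model. This is only a sketch in Python, not the server's C++ code: the names `NodeKind`, `ExecutionContext`, `CloseCursorStage`, `InvariantFailure`, and the `from_mongos` flag are all hypothetical, and the assumption that the stage passes the invalidating event through before closing is made purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NodeKind(Enum):
    MONGOS = "mongos"
    MONGOD = "mongod"


class InvariantFailure(AssertionError):
    """Hypothetical stand-in for a failed internal invariant."""


@dataclass(frozen=True)
class ExecutionContext:
    node: NodeKind
    from_mongos: bool  # assumed flag: True when a mongos sent this pipeline to a shard


class CloseCursorStage:
    """Toy model of a stage that closes the stream after an invalidating event."""

    def __init__(self, ctx: ExecutionContext) -> None:
        self.ctx = ctx
        self.closed = False

    def get_next(self, event: dict) -> Optional[dict]:
        # The invariant: in a sharded deployment the stage belongs on mongos only.
        if self.ctx.node is NodeKind.MONGOD and self.ctx.from_mongos:
            raise InvariantFailure("close-cursor stage executed on a shard mongod")
        if self.closed:
            return None  # nothing is returned once the stream is closed
        if event.get("operationType") == "invalidate":
            self.closed = True  # pass the invalidate event through, then stop
        return event
```

In this model the stage works on mongos and on an unsharded mongod, and fails loudly only when a shard's mongod executes it on behalf of mongos, which is the case the ticket wants to rule out.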
| Comments |
| Comment by Githook User [ 06/Dec/17 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com)
Message: (cherry picked from commit 11c3a16c20532b77e6e8a2b45ddb18c45913699d) |
| Comment by Githook User [ 20/Nov/17 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com)
Message: |
| Comment by David Storch [ 15/Nov/17 ] |
|
The problem with running DocumentSourceCloseCursor on mongod in a sharded cluster is subtle, and was exposed by our continuous integration testing. The problem is a race between the check for remotesExhausted() and the awaitData timeout in RouterStageMerge. I can reliably reproduce the failure at githash f19da233fa after applying the following patch:
This patch instruments the code with a sleep to induce the following sequence of events:
|