[SERVER-53449] Fix barrier in waitForShardCursor function in change_streams_shards_start_in_sync.js Created: 18/Dec/20 Updated: 29/Oct/23 Resolved: 28/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.0-rc1, 5.1.0-rc0 |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Lamont Nelson | Assignee: | Bernard Gorman |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | post-rc0, sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v5.0, v4.4, v4.2, v4.0 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 24 | ||||||||
| Description |
|
The waitForCursor function acts as a barrier in the mentioned test by querying currentOp to determine if the cursor has been established. It uses the following expression to determine this:
In failing runs, this function exits early, meaning that some other code has established an idle cursor that satisfies the criteria above. We should strengthen the condition in the test so that it matches only the cursor established by the test itself. |
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] |
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! |
| Comment by Githook User [ 29/May/21 ] |
|
Author: {'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}Message: (cherry picked from commit d5d07016c84f19b4fb95a355d5a4a5cc9f8e1442) |
| Comment by Githook User [ 28/May/21 ] |
|
Author: {'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}Message: |
| Comment by Lamont Nelson [ 18/Dec/20 ] |
|
I did not have a chance to verify the offending code that establishes the cursor (hence the general language), but wrote this ticket since we have enough evidence to draw that conclusion. It seems better than letting it sit around for another week (or two, due to the holidays). Since we are already modifying this test, logging the currentOp response that causes the barrier to be released may help any future BF investigators. |
| Comment by Max Hirschhorn [ 18/Dec/20 ] |
|
lamont.nelson, did you end up confirming that the $indexStats aggregation pipeline run by the PeriodicShardedIndexConsistencyChecker is the offending unexpected idle cursor? Maybe the originatingCommand field can be used to distinguish the change stream cursor. |
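
As a rough illustration of the originatingCommand idea, the sketch below filters currentOp-style entries down to idle cursors whose originating command carries a comment chosen by the test. It is not the committed fix. The helper name, the comment value, and the sample entries are hypothetical, and the field names (type, ns, cursor.originatingCommand) are assumed to follow the shape of idle-cursor entries in currentOp output.

```javascript
// Hypothetical comment that the test would attach to its own aggregate command.
const TEST_COMMENT = "change_streams_shards_start_in_sync";

// Hypothetical helper: returns only the entries that are idle cursors on the
// given namespace AND whose originating command carries the test's comment.
function findTestChangeStreamCursors(currentOpEntries, {ns, comment}) {
    return currentOpEntries.filter((op) => {
        const cmd = op.cursor && op.cursor.originatingCommand;
        if (op.type !== "idleCursor" || op.ns !== ns || !cmd) {
            return false;
        }
        // The comment is what separates the test's cursor from any other
        // idle cursor that happens to exist on the same namespace.
        return cmd.comment === comment;
    });
}

// Made-up sample entries: an unrelated idle cursor, the test's cursor,
// and an active (non-idle) operation.
const sampleOps = [
    {
        type: "idleCursor",
        ns: "test.coll",
        cursor: {originatingCommand: {aggregate: "coll", pipeline: [{$indexStats: {}}]}}
    },
    {
        type: "idleCursor",
        ns: "test.coll",
        cursor: {
            originatingCommand:
                {aggregate: "coll", pipeline: [{$changeStream: {}}], comment: TEST_COMMENT}
        }
    },
    {type: "op", ns: "test.coll", command: {find: "coll"}}
];

const matches = findTestChangeStreamCursors(sampleOps, {ns: "test.coll", comment: TEST_COMMENT});
console.log(JSON.stringify(matches, null, 2));
```

Because the helper returns the matching entries rather than a boolean, the barrier could log them before releasing, which would cover the logging suggestion in the comment above.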