[SERVER-53449] Fix barrier in waitForShardCursor function in change_streams_shards_start_in_sync.js Created: 18/Dec/20  Updated: 29/Oct/23  Resolved: 28/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.0-rc1, 5.1.0-rc0

Type: Bug Priority: Minor - P4
Reporter: Lamont Nelson Assignee: Bernard Gorman
Resolution: Fixed Votes: 0
Labels: post-rc0, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0, v4.4, v4.2, v4.0
Participants:
Linked BF Score: 24

 Description   

The waitForShardCursor function acts as a barrier in change_streams_shards_start_in_sync.js by querying currentOp to determine whether the cursor has been established. It uses the following expression to determine this:

function waitForShardCursor(rs) {
    assert.soon(() => rs.getPrimary()
                          .getDB('admin')
                          .aggregate([
                              {"$currentOp": {"idleCursors": true}},
                              {"$match": {ns: mongosColl.getFullName(), type: "idleCursor"}}
                          ])
                          .itcount() === 1);
}

In failing runs, this function exits early, meaning that some other code has established an idle cursor that satisfies the criteria above. We should change the test so that the condition is strong enough to match only the cursor established by the test.
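
One possible way to strengthen the condition, sketched under assumptions: count only idle cursors whose originating command is an aggregate containing a $changeStream stage. The helper names isChangeStreamIdleCursor and countChangeStreamIdleCursors are hypothetical, and the sketch assumes each idle-cursor entry returned by $currentOp exposes the originating command at cursor.originatingCommand. It is not necessarily the condition used in the eventual fix.

```javascript
// Hypothetical predicate: true only for an idle cursor on `ns` whose originating
// command is an aggregate with a $changeStream stage somewhere in its pipeline.
// Assumption: the entry exposes the command at op.cursor.originatingCommand.
function isChangeStreamIdleCursor(op, ns) {
    if (op.type !== "idleCursor" || op.ns !== ns) {
        return false;
    }
    const cmd = (op.cursor && op.cursor.originatingCommand) || {};
    const pipeline = Array.isArray(cmd.pipeline) ? cmd.pipeline : [];
    return pipeline.some(
        stage => stage !== null && typeof stage === "object" &&
            Object.prototype.hasOwnProperty.call(stage, "$changeStream"));
}

// Hypothetical helper: counts the matching entries in a $currentOp result array.
function countChangeStreamIdleCursors(ops, ns) {
    return ops.filter(op => isChangeStreamIdleCursor(op, ns)).length;
}
```

In the test, the barrier could then run the [{"$currentOp": {"idleCursors": true}}] pipeline, call toArray() on the result, and pass the array to countChangeStreamIdleCursors together with mongosColl.getFullName(), asserting that the count equals 1. An idle cursor left by an unrelated aggregation on the same namespace would no longer release the barrier.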



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 29/May/21 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-53449 Robustify change_streams_shards_start_in_sync.js and tag as does_not_support_stepdowns

(cherry picked from commit d5d07016c84f19b4fb95a355d5a4a5cc9f8e1442)
Branch: v5.0
https://github.com/mongodb/mongo/commit/8910589334a1494dc56442eb280926f566cdd60d

Comment by Githook User [ 28/May/21 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-53449 Robustify change_streams_shards_start_in_sync.js and tag as does_not_support_stepdowns
Branch: master
https://github.com/mongodb/mongo/commit/d5d07016c84f19b4fb95a355d5a4a5cc9f8e1442

Comment by Lamont Nelson [ 18/Dec/20 ]

I did not have a chance to verify the offending code that establishes the cursor (hence the general language), but wrote this ticket since we have enough evidence to make the conclusion. It seems better than letting it sit around for another week (or two due to the holidays).

Since we are already modifying this test, logging the currentOp response that causes the barrier to be released may help any future BF investigators.
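
A minimal sketch of what that logging might look like, under assumptions: barrierReleased is a hypothetical helper, fetchOps stands for whatever returns the $currentOp result array, isExpectedCursor is the barrier's match condition, and log is the test's logger (for example jsTestLog in the shell).

```javascript
// Hypothetical helper: evaluates the barrier condition once and, when exactly one
// entry satisfies it, logs that entry before reporting that the barrier is released.
function barrierReleased(fetchOps, isExpectedCursor, log) {
    const matches = fetchOps().filter(isExpectedCursor);
    if (matches.length !== 1) {
        return false;
    }
    // JSON.stringify keeps the sketch portable; in the shell, tojson would render
    // BSON types more faithfully.
    log("Barrier released by currentOp entry: " + JSON.stringify(matches[0]));
    return true;
}
```

Passing a closure over barrierReleased to assert.soon would leave the offending entry in the test log whenever the barrier is released by something other than the expected cursor.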

Comment by Max Hirschhorn [ 18/Dec/20 ]

lamont.nelson, did you end up confirming that the $indexStats aggregation pipeline run by the PeriodicShardedIndexConsistencyChecker is the offending unexpected idle cursor? Maybe the originatingCommand field can be used to distinguish the change stream cursor.

Generated at Thu Feb 08 05:30:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.