[SERVER-32566] Add test coverage for killCursors in presence of stepdowns Created: 05/Jan/18  Updated: 06/Dec/22  Resolved: 12/Jan/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Ian Boros Assignee: Backlog - Query Team (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query
Participants:

 Description   

There is no test which runs killCursors with stepdowns.

We should either
1) Modify core/kill_cursors.js to work when there are stepdowns. After SERVER-21710 it will no longer need the requires_getmore tag, which currently prevents it from running under retryable_writes_jscore_stepdown_passthrough. The explanation below describes why, even without getMore, core/kill_cursors.js currently cannot run with stepdowns.
2) Add another test which specifically runs killCursors after a stepdown.

Here is why we cannot run the current core/kill_cursors.js test with stepdowns:

The test contains code like the following:

    // Test killing a noTimeout cursor.
    cmdRes = db.runCommand({find: coll.getName(), batchSize: 2, noCursorTimeout: true});
    assert.commandWorked(cmdRes);
    cursorId = cmdRes.cursor.id;
    assert.neq(cursorId, NumberLong(0));
 
    // <Here>
 
    cmdRes = db.runCommand({killCursors: coll.getName(), cursors: [NumberLong(123), cursorId]});
    assert.commandWorked(cmdRes);
    assert.eq(cmdRes.cursorsKilled, [cursorId]);
    assert.eq(cmdRes.cursorsNotFound, [NumberLong(123)]);
    assert.eq(cmdRes.cursorsAlive, []);
    assert.eq(cmdRes.cursorsUnknown, []);

If a stepdown happens in the location marked <Here> and the killCursors command is run on the new primary, the command will respond with "cursorsNotFound" for both cursor IDs.
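The failure mode can be illustrated with a toy model in plain JavaScript. This is only a sketch: ToyNode and its string cursor IDs are hypothetical stand-ins and not the server's cursor manager. The one thing it assumes is what the paragraph above describes, namely that each node tracks only the cursors opened on it.

```javascript
// Toy model only: each node keeps its own table of open cursors.
// ToyNode is a hypothetical stand-in, not a MongoDB class.
class ToyNode {
    constructor(name) {
        this.name = name;
        this.cursors = new Set();
        this.nextId = 1;
    }

    // Opens a cursor on this node and returns its (toy) ID.
    find() {
        const id = `${this.name}-${this.nextId++}`;
        this.cursors.add(id);
        return id;
    }

    // Kills the cursors this node knows about; reports the rest as not found.
    killCursors(ids) {
        const res = {cursorsKilled: [], cursorsNotFound: []};
        for (const id of ids) {
            if (this.cursors.delete(id)) {
                res.cursorsKilled.push(id);
            } else {
                res.cursorsNotFound.push(id);
            }
        }
        return res;
    }
}

const oldPrimary = new ToyNode("A");
const newPrimary = new ToyNode("B");

const cursorId = oldPrimary.find();  // find runs on the old primary

// <Here>: a stepdown happens, so the next command goes to the new primary.
const res = newPrimary.killCursors(["bogus", cursorId]);
```

In this model both IDs end up in res.cursorsNotFound and res.cursorsKilled is empty, which is the shape of response that makes the test's assertions fail.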

You can easily reproduce this by adding a sleep where I marked <Here> and then running the relevant code in a while (1) loop.

To fix this, we could have the test run killCursors against the same node on which it ran find(), using the _mongo field in the command response.
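A minimal sketch of that idea follows. The helper name killCursorsOnOriginNode is hypothetical and does not exist in the test today. It assumes that the find response carries a _mongo field referring to the connection that served the command, as suggested above, and it falls back to the regular db handle when the field is absent.

```javascript
// Hypothetical helper, not part of kill_cursors.js.
// Assumption: findRes._mongo, when present, is the connection to the node
// that served the find command.
function killCursorsOnOriginNode(db, collName, findRes, cursorIds) {
    const conn = findRes._mongo;

    // Target the same database on the originating node, or fall back to the
    // default handle if the response does not identify a node.
    const targetDB = conn ? conn.getDB(db.getName()) : db;

    return targetDB.runCommand({killCursors: collName, cursors: cursorIds});
}
```

In the snippet above, the killCursors call would then become something like killCursorsOnOriginNode(db, coll.getName(), cmdRes, [NumberLong(123), cursorId]), passing the find response before cmdRes is overwritten. Whether the cursor is still open on the original node after a stepdown is not established in this ticket, so the existing assertions may still need adjusting.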



 Comments   
Comment by Ian Whalen (Inactive) [ 12/Jan/18 ]

Historically we haven't seen much bugginess in this part of the code base so not planning on pursuing this additional test coverage for now.

Comment by Ian Boros [ 08/Jan/18 ]

I've updated the ticket.

Comment by David Storch [ 08/Jan/18 ]

ian.boros, this issue is marked as "Bug", but I don't think that's accurate. If I understand correctly, kill_cursors.js is written to assume that there are no stepdowns, but then is tagged with requires_getmore, which correctly excludes it from all test suites that involve stepdowns.

If I understand correctly, this ticket really just boils down to "Add test coverage for correct killCursors behavior when there is a stepdown". If so, please update the title, description, and issue type to reflect this.
