[SERVER-57067] [SBE] Find command unexpectedly returns cursorId when final batch size aligns with result set size
Created: 19/May/21  Updated: 30/Nov/23  Resolved: 27/May/21

Status: Closed
Project: Core Server
Component/s: Query Execution
Affects Version/s: 5.0.0-rc0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Kyle Suarez
Assignee: Nikita Lapkov (Inactive)
Resolution: Works as Designed
Votes: 0
Labels: post-rc0, sbe-post-rc0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by PYTHON-2730 Test failures due unnecessary getMore... Closed
Duplicate
is duplicated by SERVER-80713 ID on exhausted cursor no longer 0 Closed
is duplicated by SERVER-83077 Check one getNext beyond batchSize fo... Closed
Related
related to SERVER-56094 [SBE][retryable_writes_jscore_stepdow... Closed
related to SERVER-56099 [SBE] Fix all PlanStage getNext() met... Closed
Operating System: ALL
Steps To Reproduce:

Run the server with SBE enabled (the default in 5.0.0-rc0). Then:

(function() {
    "use strict";
    const collname = "cursor_test";
    const coll = db.getCollection(collname);
    coll.drop();

    // Insert exactly two documents so that two batches of size 1 cover the result set.
    assert.commandWorked(coll.insert({_id: 0}));
    assert.commandWorked(coll.insert({_id: 1}));

    // First batch: one document, and the cursor must still be open.
    const find_response = coll.runCommand("find", {batchSize: 1});
    const cursor_id = find_response.cursor.id;
    assert.eq(find_response.cursor.firstBatch.length, 1);
    assert.neq(cursor_id, 0);

    // Second batch: the last document. Under SBE the cursor id is still
    // nonzero at this point, so the final assertion fails.
    const getmore_response = coll.runCommand({getMore: cursor_id, collection: collname, batchSize: 1});
    assert.eq(getmore_response.cursor.nextBatch.length, 1);
    assert.eq(getmore_response.cursor.id, 0, "unexpected nonzero cursor id");
}());

Sprint: Query Execution 2021-05-31
Participants:

 Description   

In the slot-based execution engine (SBE), when a find or getMore command is run with a batchSize that exactly matches the number of documents remaining, the command response contains a non-zero cursor id even though no further documents will be returned:

> db.twodocs.drop()                                                         
true                            
> db.twodocs.insertMany([{_id: 0}, { _id: 1}])
{ "acknowledged" : true, "insertedIds" : [ 0, 1 ] }
> db.runCommand({find: "twodocs", batchSize: 1})
{
        "cursor" : {
                "firstBatch" : [
                        {
                                "_id" : 0
                        }
                ],
                "id" : NumberLong("5005687836528864628"),
                "ns" : "test.twodocs"
        },
        "ok" : 1
}
> db.runCommand({getMore: NumberLong("5005687836528864628"), collection: "twodocs", batchSize: 1})
{
        "cursor" : {
                "nextBatch" : [
                        {
                                "_id" : 1
                        }
                ],
                "id" : NumberLong("5005687836528864628"),
                "ns" : "test.twodocs"
        },
        "ok" : 1
}

Iterating the cursor once more reveals that it is in fact exhausted. This is a regression from the classic engine, which would have reported a cursor id of 0 in the previous response and so obviated the need for a final getMore that returns no results.

> db.runCommand({getMore: NumberLong("5005687836528864628"), collection: "twodocs", batchSize: 1})
{
        "cursor" : {
                "nextBatch" : [ ],
                "id" : NumberLong(0),
                "ns" : "test.twodocs"
        },
        "ok" : 1
}
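
The round trips above can be sketched from the client's side. The following is a minimal illustration, not driver code: `drainCursor`, `makeReplayServer` and `CURSOR_ID` are hypothetical names, and the replay server simply returns the three responses shown in this ticket, with plain numbers standing in for the NumberLong cursor ids. In the shell, the callback could instead be `(cmd) => db.runCommand(cmd)`, in which case the id comparison may need adapting to NumberLong.

```javascript
"use strict";

// Drain a cursor by following the cursor id rather than the batch length.
// `runCommand` is a hypothetical callback standing in for db.runCommand.
function drainCursor(runCommand, collName, batchSize) {
    let res = runCommand({find: collName, batchSize: batchSize});
    const docs = [...res.cursor.firstBatch];
    let roundTrips = 1;
    let id = res.cursor.id;
    // A cursor id of 0 is the only signal that the cursor is closed.
    while (id != 0) {
        res = runCommand({getMore: id, collection: collName, batchSize: batchSize});
        roundTrips++;
        docs.push(...res.cursor.nextBatch);  // this batch may be empty
        id = res.cursor.id;
    }
    return {docs: docs, roundTrips: roundTrips};
}

// Stand-in for the server: replays canned responses in order and records
// every command it receives.
function makeReplayServer(responses) {
    const sent = [];
    const runCommand = function(cmd) {
        sent.push(cmd);
        return responses[sent.length - 1];
    };
    return {runCommand: runCommand, sent: sent};
}

// Arbitrary placeholder for the NumberLong id in the transcript.
const CURSOR_ID = 7;
const server = makeReplayServer([
    {cursor: {firstBatch: [{_id: 0}], id: CURSOR_ID, ns: "test.twodocs"}, ok: 1},
    {cursor: {nextBatch: [{_id: 1}], id: CURSOR_ID, ns: "test.twodocs"}, ok: 1},
    {cursor: {nextBatch: [], id: 0, ns: "test.twodocs"}, ok: 1},
]);
const result = drainCursor(server.runCommand, "twodocs", 1);
```

With these responses the loop collects both documents but needs three commands, the last of which only confirms that the cursor is closed.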

The problem also manifests itself with just the find command when batchSize equals the size of the result set:

> db.runCommand({find: "twodocs", batchSize: 2})
{
        "cursor" : {
                "firstBatch" : [
                        {
                                "_id" : 0
                        },
                        {
                                "_id" : 1
                        }
                ],
                "id" : NumberLong("9219549160796429098"),
                "ns" : "test.twodocs"
        },
        "ok" : 1
}
> db.runCommand({getMore: NumberLong("9219549160796429098"), collection: "twodocs"})
{
        "cursor" : {
                "nextBatch" : [ ],
                "id" : NumberLong(0),
                "ns" : "test.twodocs"
        },
        "ok" : 1
}

Note that if the final getMore does not include a batch size, the problem doesn't manifest itself:

> db.runCommand({find: "twodocs", batchSize: 1})
{
        "cursor" : {
                "firstBatch" : [
                        {
                                "_id" : 0
                        }
                ],
                "id" : NumberLong("1331269379857468674"),
                "ns" : "test.twodocs"
        },
        "ok" : 1
}
> db.runCommand({getMore: NumberLong("1331269379857468674"), collection: "twodocs"})
{
        "cursor" : {
                "nextBatch" : [
                        {
                                "_id" : 1
                        }
                ],
                "id" : NumberLong(0),
                "ns" : "test.twodocs"
        },
        "ok" : 1
}



 Comments   
Comment by David Storch [ 24/May/21 ]

In general there is no guarantee that the final getMore batch for a cursor is non-empty. This is because the act of determining whether a document is the last one in the result set may require an arbitrary amount of query execution work. This work is not done proactively; when a getMore response batch is complete, that batch should be returned to the client immediately without determining if there are any additional results that will be returned in the next batch.

I suggest closing this ticket as "Works as Designed".
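
The reasoning in the comment above can be illustrated with a toy model. This is a sketch under stated assumptions, not the server's implementation: `fillBatch` and `OPEN_CURSOR_ID` are hypothetical names, and a plain JavaScript iterator stands in for the plan executor, with each `next()` call representing a unit of query execution work that may be arbitrarily expensive.

```javascript
"use strict";

// Arbitrary placeholder for a live cursor id.
const OPEN_CURSOR_ID = 1;

// Fill one batch from `exec`, an iterator standing in for the plan executor.
// The loop stops as soon as the batch is full and never probes for a further
// result, so exhaustion is only noticed if end-of-stream is hit while the
// batch still has room.
function fillBatch(exec, batchSize) {
    const batch = [];
    let exhausted = false;
    while (batch.length < batchSize) {
        const step = exec.next();  // one unit of (possibly expensive) work
        if (step.done) {
            exhausted = true;
            break;
        }
        batch.push(step.value);
    }
    return {batch: batch, id: exhausted ? 0 : OPEN_CURSOR_ID};
}

// Two documents, batch size 1: the second batch is full, so the id stays
// nonzero and a third, empty batch is needed to observe end-of-stream.
const exec = [{_id: 0}, {_id: 1}][Symbol.iterator]();
const first = fillBatch(exec, 1);
const second = fillBatch(exec, 1);
const third = fillBatch(exec, 1);

// A batch size larger than the remaining results hits end-of-stream while
// filling, so the id is 0 in the same response.
const oversized = fillBatch([{_id: 0}, {_id: 1}][Symbol.iterator](), 3);
```

In this model, reporting an id of 0 together with the last full batch would require one extra `next()` call per batch, which is exactly the proactive work the comment says is not done.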
