[SERVER-24727] Find command returns incorrect number of documents when singleBatch is true and limit is greater than 101 Created: 22/Jun/16  Updated: 07/Nov/18  Resolved: 27/Jun/16

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: J Rassi Assignee: David Storch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Sprint: Query 17 (07/15/16)
Participants:

 Description   

The find command returns an incorrect number of documents when the "singleBatch" option is set to true and the "limit" option is set to a value greater than 101. In this case, the find command returns 101 documents, whereas it should return as many documents as the specified "limit" value.

This affects all versions of the server since the find command was first introduced (3.2.0). It also affects the use of legacy OP_QUERY reads against mongos (which use the find command internally).

Reproduce as follows:

var st = new ShardingTest({shards: 1});
var collName = "test.foo";
 
var howMany = function(coll) {
    return coll.find().limit(-800).itcount();
};
 
for (var i = 0; i < 1000; ++i) {
    st.s0.getCollection(collName).insert({});
}
 
var mongodOpQuery = new Mongo(st.shard0.host);
mongodOpQuery.forceReadMode("legacy");
assert.eq(800, howMany(mongodOpQuery.getCollection(collName)));  // Passes: expected.
 
var mongosOpQuery = new Mongo(st.s0.host);
mongosOpQuery.forceReadMode("legacy");
assert.eq(800, howMany(mongosOpQuery.getCollection(collName)));  // Fails: unexpectedly returns 101.
 
var mongodFindCommand = new Mongo(st.shard0.host);
mongodFindCommand.forceReadMode("commands");
assert.eq(800, howMany(mongodFindCommand.getCollection(collName)));  // Fails: unexpectedly returns 101.
 
var mongosFindCommand = new Mongo(st.s0.host);
mongosFindCommand.forceReadMode("commands");
assert.eq(800, howMany(mongosFindCommand.getCollection(collName)));  // Fails: unexpectedly returns 101.

Credit goes to jeff.yemin for discovering this issue.



 Comments   
Comment by Rubal Jabbal [ 07/Nov/18 ]

Looks like this was a bug in 3.0 that was fixed in 3.2. Ref: https://jira.mongodb.org/browse/SERVER-24547

Comment by Rubal Jabbal [ 06/Nov/18 ]

@david.storch

Hi! I have been recently doing a deep dive into understanding mongo internals better. 
I came across a query pattern wherein the explain query execution didn't make sense to me. I have been checking multiple sources but nowhere could I find a satisfactory solution.
 
While researching this, I came across a ticket in the MongoDB JIRA, https://jira.mongodb.org/browse/SERVER-24727. I saw your response on it, and it really does answer my question, but I still have more questions about it.
 
Root query being:

db.testcollection.find({"state" : 1, "all" : 1}).explain(true)  

shows nreturned and nexamined 2568 docs (which is correct), while

db.testcollection.find({"state" : 1, "all" : 1}).limit(1000).explain(true) 

shows nreturned and nexamined 101 docs

db.testcollection.find({"state" : 1, "all" : 1}).batchSize(1000).explain(true) 

shows nreturned and nexamined 1000 docs

db.testcollection.find({"state" : 1, "all" : 1}).batchSize(1000).limit(900).explain(true) 

shows nreturned and nexamined 900 docs
 
From the above comment: "Return at most n documents, but only return a single batch, and use the default batchSize." Since the default batchSize for the initial find command response is 101, the number of documents returned should be 101 in the case that n >= 101.
 
This answers why I see that limit is returning 101 docs whilst no limit is returning all docs (which possibly is in batchSize).
Although I have the following questions: 
 
1. Since the default batchSize is 101, shouldn't the no-limit query also return 101 docs with a cursor?
2. I am on mongo shell 3.0.15, but in the comment above you mentioned: "In versions of the server before 3.2.0, there was no differentiation between batchSize and limit; the two were passed to the server using a single field called 'ntoreturn'."
Shouldn't the limit version also return 1000 docs instead of 101, since I am on 3.0?

Comment by David Storch [ 27/Jun/16 ]

I believe this behavior is intentional and will be closing this ticket as Works as Designed. When the server receives a find command with a limit of n and the singleBatch flag set to true, this means "Return at most n documents, but only return a single batch, and use the default batchSize." Since the default batchSize for the initial find command response is 101, the number of documents returned should be 101 in the case that n >= 101.

This is quite different from a find command where batchSize is n and the singleBatch flag is set to true. This instead means "Return a single batch of up to n documents."

In versions of the server before 3.2.0, there was no differentiation between batchSize and limit; the two were passed to the server using a single field called "ntoreturn". In these versions, cursor.batchSize(-n) and cursor.limit(-n) will both return n documents when n > 101. In contrast, versions 3.2.0 and later will never return more than 101 documents in the cursor.limit(-n) case, which I view as a bug fix.
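
The semantics described in this comment can be sketched as a small plain-JavaScript model. This is only an illustration of the stated rules, not server code: the names firstBatchCount and DEFAULT_FIRST_BATCH are hypothetical, and the model assumes singleBatch is true and ignores any byte-size cap on a batch.

```javascript
// Hypothetical model of how many documents a singleBatch find returns
// in 3.2.0 and later, following the rules stated in the comment above.
// DEFAULT_FIRST_BATCH and firstBatchCount are illustrative names only.
const DEFAULT_FIRST_BATCH = 101;

function firstBatchCount({ available, limit, batchSize }) {
  // With no explicit batchSize, the first batch uses the default of 101.
  const batch = batchSize !== undefined ? batchSize : DEFAULT_FIRST_BATCH;
  // A limit caps the total number of documents; undefined means no limit.
  const cap = limit !== undefined ? limit : Infinity;
  // The single batch can never hold more than the matching documents.
  return Math.min(available, batch, cap);
}

// limit 800, singleBatch: capped by the default batchSize.
console.log(firstBatchCount({ available: 1000, limit: 800 }));     // 101
// batchSize 800, singleBatch: a single batch of up to 800 documents.
console.log(firstBatchCount({ available: 1000, batchSize: 800 })); // 800
// A limit below 101 is honored as-is.
console.log(firstBatchCount({ available: 1000, limit: 50 }));      // 50
```

Under this model, the 101 seen in the failing assertions of the repro script corresponds to the first call, where only a limit is supplied.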

Comment by J Rassi [ 22/Jun/16 ]

I've set "Backport Requested" to v3.2, as I believe this to be a backport candidate. However, I suspect that the "singleBatch" option is not widely used.

The following patch appears to fix the issue:

diff --git a/src/mongo/db/query/query_request.cpp b/src/mongo/db/query/query_request.cpp
index f42d97d..1cf22da 100644
--- a/src/mongo/db/query/query_request.cpp
+++ b/src/mongo/db/query/query_request.cpp
@@ -907,7 +907,14 @@ void QueryRequest::addMetaProjection() {
 }
 
 boost::optional<long long> QueryRequest::getEffectiveBatchSize() const {
-    return _batchSize ? _batchSize : _ntoreturn;
+    if (_ntoreturn) {
+        return _ntoreturn;
+    }
+    if (_limit.value_or(std::numeric_limits<long long>::max()) <
+        _batchSize.value_or(std::numeric_limits<long long>::max())) {
+        return _limit;
+    }
+    return _batchSize;
 }
 
 }  // namespace mongo
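
For readers less familiar with boost::optional, the selection logic in the proposed patch above can be restated as a plain-JavaScript sketch. This is an approximation under stated assumptions: undefined stands in for an unset optional, Infinity stands in for std::numeric_limits<long long>::max(), and the function names are hypothetical.

```javascript
// Original logic: prefer batchSize, otherwise fall back to ntoreturn.
// undefined models an unset boost::optional (an assumption of this sketch).
function effectiveBatchSizeBefore({ ntoreturn, batchSize }) {
  return batchSize !== undefined ? batchSize : ntoreturn;
}

// Logic from the proposed patch: ntoreturn wins if set; otherwise the
// smaller of limit and batchSize is used, treating an unset value as
// "infinitely large" (Infinity replaces numeric_limits<long long>::max()).
function effectiveBatchSizeAfter({ ntoreturn, limit, batchSize }) {
  if (ntoreturn !== undefined) {
    return ntoreturn;
  }
  const l = limit !== undefined ? limit : Infinity;
  const b = batchSize !== undefined ? batchSize : Infinity;
  if (l < b) {
    return limit;
  }
  return batchSize; // may be undefined when neither option is set
}

// A request with only limit: 800 set.
console.log(effectiveBatchSizeBefore({ limit: 800 })); // undefined
console.log(effectiveBatchSizeAfter({ limit: 800 }));  // 800
```

With only a limit set, the original logic yields no effective batch size, while the patched logic yields the limit itself.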
