[SERVER-14306] mongos can cause the in-memory sort limit to be hit on shards by requesting more results than needed Created: 19/Jun/14  Updated: 19/Jun/15  Resolved: 18/Dec/14

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.6.0, 2.8.0-rc2
Fix Version/s: 2.6.7, 2.8.0-rc4

Type: Bug Priority: Major - P3
Reporter: Daniel Pasette (Inactive) Assignee: Spencer Brody (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-14299 For sharded limit=N queries with sort... Closed
Tested
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Participants:

 Description   

A query with both sort and limit, and which does not have an index to provide the sort, can fail on mongos even though it succeeds if you run it directly on each of the shards. Use the steps below to reproduce.

// Use mtools to setup the cluster and populate it with data.
$ mlaunch --sharded 2 --single -v
$ mgenerate '{"foo": {"$string": {"length": 10000}}, "bar": "$number", "baz": 0}' -n 20000
 
// Connect to mongos and shard the collection. Disable the balancer so that the data remains imbalanced,
// as there needs to be enough data on one of the shards to exceed the in-memory sort limit.
$ mongo
MongoDB shell version: 2.8.0-rc2
connecting to: test
mongos> db.version()
2.8.0-rc2
mongos> db.adminCommand({enableSharding: "test"})
{ "ok" : 1 }
mongos> db.adminCommand({shardCollection: "test.mgendata", key: {_id: 1}})
{ "collectionsharded" : "test.mgendata", "ok" : 1 }
mongos> sh.disableBalancing("test.mgendata")
 
// Connect to the shard with most of the data. The query succeeds on the shard until the limit
// becomes 3337 or higher. Then it fails due to hitting a memory limit. This is expected behavior.
$ mongo --port 27018
MongoDB shell version: 2.8.0-rc2
connecting to: 127.0.0.1:27018/test
> db.mgendata.count()
17950
> db.mgendata.find().sort({bar: 1}).limit(2000).itcount()
2000
> db.mgendata.find().sort({bar: 1}).limit(3336).itcount()
3336
> db.mgendata.find().sort({bar: 1}).limit(3337).itcount()
2014-12-10T20:39:17.781-0500 I QUERY    Error: error: {
        "$err" : "Executor error: Overflow sort stage buffered data usage of 33563546 bytes exceeds internal limit of 33554432 bytes",
        "code" : 17144
}
    at Error (<anonymous>)
    at DBQuery.next (src/mongo/shell/query.js:259:15)
    at DBQuery.itcount (src/mongo/shell/query.js:352:14)
    at (shell):1:47 at src/mongo/shell/query.js:259
 
// Connect to mongos and do the same thing. This time we hit the memory limit even
// if the limit is small, which is the bug!
mongos> db.mgendata.find().sort({bar: 1}).limit(500).itcount()
2014-12-10T20:41:55.832-0500 I QUERY    Error: error: {
        "$err" : "getMore executor error: Overflow sort stage buffered data usage of 33563546 bytes exceeds internal limit of 33554432 bytes",
        "code" : 17406
}
    at Error (<anonymous>)
    at DBQuery.next (src/mongo/shell/query.js:259:15)
    at DBQuery.itcount (src/mongo/shell/query.js:352:14)
    at (shell):1:46 at src/mongo/shell/query.js:259

Original repro steps

st = new ShardingTest({ shards: 2, chunkSize: 1, other: { separateConfig: true, nopreallocj: 1 }});
var db = st.s.getDB('test');                                                                         
var mongosCol = db.getCollection('skip');                                                                  
db.adminCommand({ enableSharding: 'test' });                                                         
db.adminCommand({ shardCollection: 'test.skip', key: { x: 1 }});                                     
                                                                                                     
var i = 0;                                                                                           
var filler = new Array(1024).toString();                                                             
// create enough data to exceed 32MB limit
while (i < 32*1024 ) {                                                                               
    var bulk = []; 
    for (j = 0; j < 1024; j++) {                                                                     
        bulk.push({x:i+j, y:i+j, z:filler});                                                           
    }                                                                                                
    mongosCol.insert(bulk);                                                                     
    i += j;                                                                                          
}                                                                                                    
jsTest.log(mongosCol.count() + " documents in " + mongosCol);                                              
                                                                                                     
 
// test that direct connect to shard returns a document just below in-mem sort limit
var shardCol = st.shard0.getDB('test').getCollection('skip');
jsTest.log('test query succeeds directly on shard with skip 30000');
assert.eq(1, shardCol.find().sort({y:1}).skip(30000).limit(1).itcount());
 
// test that exceeding the in-memory sort limit errors when connected directly to the shard
jsTest.log("test that there is an error directly on the shard with skip 32000");
assert.throws( function(){ shardCol.find().sort({y:1}).skip(32000).limit(1).itcount(); });
 
// test that exceeding the in-memory sort limit errors on mongos
jsTest.log("test that there is an error on mongos with skip 32000");
assert.throws( function(){ mongosCol.find().sort({y:1}).skip(32000).limit(1).itcount() });
 
// test that below limit should succeed on mongos 
jsTest.log('test query succeeds on mongos with skip 30000');
assert.eq(1, mongosCol.find().sort({y:1}).skip(30000).limit(1).itcount());



 Comments   
Comment by Githook User [ 23/Dec/14 ]

Author:

{u'username': u'stbrody', u'name': u'Spencer T Brody', u'email': u'spencer@mongodb.com'}

Message: SERVER-14306 Increase stability of sharding/in_memory_sort_limit.js test
Branch: master
https://github.com/mongodb/mongo/commit/3f7874101df5c3868f40584a6ac408df1248c4be

Comment by Githook User [ 18/Dec/14 ]

Author:

{u'username': u'stbrody', u'name': u'Spencer T Brody', u'email': u'spencer@mongodb.com'}

Message: SERVER-14306 Make sure mongos never requests more results than needed from the shards
Branch: v2.6
https://github.com/mongodb/mongo/commit/4a1a4010830fdf3b2d737164ce09c669fd9ed738

Comment by Githook User [ 18/Dec/14 ]

Author:

{u'username': u'stbrody', u'name': u'Spencer T Brody', u'email': u'spencer@mongodb.com'}

Message: SERVER-14306 Make sure mongos never requests more results than needed from the shards
Branch: master
https://github.com/mongodb/mongo/commit/26b6aa8dab1d265ad2c20f952ec862858a1fc9fb

Comment by David Storch [ 11/Dec/14 ]

This issue still exists as of 2.8.0-rc2. Updated repro steps using mtools have been added to the issue description above.


Comment by David Storch [ 01/Jul/14 ]

After some further investigation, it looks like this is not a duplicate of SERVER-14299 (although it is closely related). SERVER-14299 covers an issue in which mongos can send a getmore to a shard even when it has already retrieved all necessary results from the shard. Even with that bug fixed, however, a memory consumption error can occur through mongos which would not happen if you were to run the same query directly on the shard.

This issue arises when you have large documents, and thus cannot fit ntoreturn documents into a single batch. Say you have a sharded cluster with one shard. The shard has 40 documents, each about 1 MB in size. Then you deliver a query with .limit(10). The shard should be able to answer this query without hitting the 32 MB memory limit by doing a topK.

Here is what ends up happening:

  1. mongos sends a query with ntoreturn==10 to the shard. The shard does a topK sort to find the top 10, but only returns the first 4 of them, because mongod enforces that batches are 4 MB or less.
  2. mongos has only 4 of its 10 results, so it issues a getmore. However, it passes ntoreturn==10 in the getmore. The shard returns 4 more documents.
  3. Now mongos has 8 of its 10 results. It issues another getmore with ntoreturn==10. The shard tries to produce results until either the batch has 10 results or the batch is 4 MB in size. In the process of doing so, the mongod switches from doing a topK sort to a full in-memory sort, causing a memory error.
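
To make the problem concrete, here is a minimal shell sketch of the scenario described above (the collection name, padding size, and exact document count are illustrative assumptions, not taken from an actual repro). It assumes a sharded cluster like the one in the description, queried through mongos, with no index on the sort field:

// Illustrative sketch only: ~40 documents of roughly 1 MB each, then a
// sort-with-limit query that has no supporting index.
var coll = db.getSiblingDB("test").bigdocs;       // hypothetical collection
var padding = new Array(1024 * 1024).join("x");   // string of ~1 MB
for (var i = 0; i < 40; i++) {
    coll.insert({y: i, z: padding});
}
// Run directly on the shard, a top-10 sort buffers only ~10 MB and succeeds.
// Run through mongos, the repeated getmores with ntoreturn==10 can push the
// shard into a full in-memory sort and past the 32 MB limit.
coll.find().sort({y: 1}).limit(10).itcount();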

One way to fix this would be to have mongos update ntoreturn for each getmore. The original query would have ntoreturn==10. After getting back 4 results, the first getmore would have ntoreturn==6. Similarly, the second getmore would have ntoreturn==2.
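
A sketch of that accounting, written as shell-style JavaScript rather than the actual mongos C++ (the function name and variables are hypothetical, purely to illustrate the arithmetic):

// Hypothetical helper: mongos would track how many results it has already
// received from a shard and ask only for the remainder on each getmore,
// instead of re-sending the original ntoreturn.
function remainingToReturn(originalNToReturn, resultsReceivedSoFar) {
    return Math.max(originalNToReturn - resultsReceivedSoFar, 0);
}

remainingToReturn(10, 4);  // 6 -> first getmore
remainingToReturn(10, 8);  // 2 -> second getmore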

Comment by David Storch [ 30/Jun/14 ]

Confirmed as a duplicate of SERVER-14299.
