[SERVER-41017] Ability to specify the batch size in data size Created: 06/May/19  Updated: 29/Aug/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Linda Qin Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
Related
Assigned Teams:
Query Optimization
Participants:
Case:

 Description   

When we run a query on a sharded cluster without specifying a batch size, each shard returns its results using the default size-based batch limit (16MB). For a sharded cluster with a large number of shards (e.g. 80 shards), this could cause high memory usage on the mongos (16MB * 80 = 1.28GB).
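
The arithmetic above can be checked with a short sketch. The function name `worst_case_buffer_bytes` is made up for illustration, and the model assumes the mongos holds one full batch per shard at the same time, which is the worst case described here rather than a measured figure.

```python
MB = 1024 * 1024

def worst_case_buffer_bytes(num_shards: int, batch_bytes: int = 16 * MB) -> int:
    """Upper bound on buffered bytes if every shard returns one full batch.

    Assumption: all shard batches are held in memory simultaneously.
    """
    return num_shards * batch_bytes

total = worst_case_buffer_bytes(80)
print(f"{total // MB} MB")  # 80 shards * 16 MB each
```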

Currently we can specify the batch size only in terms of the number of documents. If the average document size is already known, we can choose a batch size that reduces the memory usage in this case. However, in some cases, specifying the batch size by the number of documents is not optimal:

  • The document size varies for the collection.
  • Projection is being used in the query, so it's hard to predict how much data will be returned without first querying the document.
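
As a rough illustration of the current workaround, a document-count batch size can be derived from a target byte budget and an assumed average document size. The helper `docs_per_batch` and the example numbers are hypothetical; the estimate is only as good as the average, which is exactly what breaks down in the two cases listed above.

```python
KB = 1024
MB = 1024 * KB

def docs_per_batch(target_batch_bytes: int, avg_doc_bytes: int) -> int:
    """Estimate a document-count batch size from a byte budget.

    Assumption: avg_doc_bytes is representative of the documents returned.
    Always returns at least 1 so the cursor can make progress.
    """
    if avg_doc_bytes <= 0:
        raise ValueError("avg_doc_bytes must be positive")
    return max(1, target_batch_bytes // avg_doc_bytes)

# Example: aim for roughly 1 MB per shard batch with ~2 KB documents.
batch_size = docs_per_batch(1 * MB, 2 * KB)
print(batch_size)
```

The resulting number would then be passed as the query's `batchSize`. If the real documents turn out larger than the assumed average, the per-shard batch exceeds the budget by the same factor.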

It would be nice to allow specifying the batch size in terms of the data size. It would also be nice if the size-based batch limit (the default 16MB) could be customized with a `setParameter`.
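
To make the requested semantics concrete, here is a minimal sketch of size-based batching over already-serialized documents. It is not server code and does not reflect any existing MongoDB option. The function `batches_by_size` and its parameter `max_batch_bytes` are hypothetical names, and letting a single oversized document form its own batch is one possible design choice among several.

```python
from typing import Iterable, Iterator, List

def batches_by_size(docs: Iterable[bytes], max_batch_bytes: int) -> Iterator[List[bytes]]:
    """Group serialized documents into batches capped by total byte size.

    A batch is closed before adding a document that would push it over the
    cap. A document larger than the cap is still emitted, alone in its batch.
    """
    batch: List[bytes] = []
    batch_bytes = 0
    for doc in docs:
        if batch and batch_bytes + len(doc) > max_batch_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += len(doc)
    if batch:
        yield batch

# Example with varying document sizes and a 1000-byte cap.
docs = [b"a" * 400, b"b" * 400, b"c" * 400, b"d" * 1500]
for batch in batches_by_size(docs, max_batch_bytes=1000):
    print(len(batch), sum(len(d) for d in batch))
```

With a cap like this, the memory held per shard is bounded by the cap (plus at most one oversized document) regardless of how document sizes vary or how a projection shrinks them.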


Generated at Thu Feb 08 04:56:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.