Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Aggregation Framework
Labels:
- asya
- neweng
- optimization
- performance

Assigned Teams:

Query Optimization
Backwards Compatibility:
Fully Compatible
Sprint:
QuInt 8 08/28/15, Query 2019-06-17, Query 2019-07-01, Query 2019-07-15, Query 2019-07-29, Query 2019-08-12
Case:
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

In a sharded environment that uses early grouping, it would be nice, in addition to using an index, to be able to skip the mongos regrouping step.

Let me explain with an example. Suppose each shard returns these partial groups:

  * result_node1: [
      {
        id: "value1",
        totalcount: 50
      },
      {
        id: "value2",
        totalcount: 100
      }
    ]
  * result_node2: [
      {
        id: "value1",
        totalcount: 60
      }
    ]

The real results (after the mongos regroup) must look like:

  [
    {
      id: "value1",
      totalcount: 110
    },
    {
      id: "value2",
      totalcount: 100
    }
  ]
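
The regroup step above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how mongos is implemented: the function name `merge_partial_groups` is hypothetical, and the partial results are assumed to be plain lists of dicts with the `id` and `totalcount` fields used in this ticket.

```python
from collections import defaultdict


def merge_partial_groups(shard_results):
    """Sum totalcount per group id across the partial results of all shards.

    Hypothetical helper for illustration; it mimics what a merging
    $group has to do when the same id can come back from several shards.
    """
    totals = defaultdict(int)
    for partial in shard_results:
        for group in partial:
            totals[group["id"]] += group["totalcount"]
    # Sort by id only to make the output order deterministic.
    return [{"id": key, "totalcount": count} for key, count in sorted(totals.items())]


result_node1 = [
    {"id": "value1", "totalcount": 50},
    {"id": "value2", "totalcount": 100},
]
result_node2 = [
    {"id": "value1", "totalcount": 60},
]

merged = merge_partial_groups([result_node1, result_node2])
print(merged)
```

Every partial group has to be looked up and accumulated a second time here, which is the work this ticket asks to avoid when it is not needed.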

In some cases, however, the mongos regrouping step is pointless: when the grouping key is the same as the shard key, the same group key never comes back from different shards.

In that case, the prior example looks like this:

  * result_node1: [
      {
        id: "value1",
        totalcount: 110
      }
    ]
  * result_node2: [
      {
        id: "value2",
        totalcount: 100
      }
    ]

The real results must look like:

  [
    {
      id: "value1",
      totalcount: 110
    },
    {
      id: "value2",
      totalcount: 100
    }
  ]

So the point is that the mongos regrouping step is a waste of time when you group by the same key as the shard key.
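
As a rough sketch of the proposed shortcut, assuming each group id comes from exactly one shard, the final result is just the concatenation of the per-shard results. The helper name `concat_disjoint_groups` is hypothetical, and the overlap check is included only to make the assumption explicit; it is not something the ticket describes.

```python
from itertools import chain


def concat_disjoint_groups(shard_results):
    """Concatenate per-shard groups without re-aggregating them.

    Hypothetical helper for illustration. It is only correct when no
    group id appears on more than one shard, so it raises otherwise.
    """
    seen = set()
    for partial in shard_results:
        for group in partial:
            if group["id"] in seen:
                raise ValueError(f"group {group['id']!r} returned by more than one shard")
            seen.add(group["id"])
    return list(chain.from_iterable(shard_results))


result_node1 = [{"id": "value1", "totalcount": 110}]
result_node2 = [{"id": "value2", "totalcount": 100}]

combined = concat_disjoint_groups([result_node1, result_node2])
print(combined)
```

No per-group accumulation happens in this path: each shard's groups are already final, so the merging side only needs to pass them through.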

depends on

SERVER-42160 $group stages that use a DISTINCT_SCAN do not use a SHARDING_FILTER on sharded collections

Closed

SERVER-72748 Enable feature flag

Closed

SERVER-41750 Refactor renamedPaths() helpers to support renames in either direction

Closed

is duplicated by

SERVER-22912 Query Optimizer

Closed

is related to

SERVER-27115 Track fields renamed by $project in aggregation for index consideration

Closed

SERVER-28942 sort by shard key or prefix of shard key may not require merge before group

Closed

SERVER-55200 DISTINCT_SCAN not used for $sort+$match+$group+$first on sharded collection

Closed

related to

SERVER-92457 Push-down $group on shard key to shards

Closed

SERVER-56583 Push $setWindowFields to shards when shards contain whole partitions

Backlog

Assignee: [DO NOT USE] Backlog - Query Optimization
Reporter: Daniel Pasette (Inactive)
Participants: [DO NOT USE] Backlog - Query Optimization, Asya Kamsky, Charlie Swanson, Daniel Pasette, Ian Whalen, Jon Rangel, Samuel García Martínez
Votes: 11
Watchers: 23

Created: Apr 02 2012 05:46:00 PM UTC
Updated: Jan 24 2025 10:07:47 AM UTC
Confidence Status Last Update: 09/Aug/19 8:21 PM
