Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Querying
Labels: None
Assigned Teams: Query Execution
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Both the Spark and Hadoop connectors have custom code to partition the data in a collection so that it can be processed externally in parallel.

This requires either the splitVector command on non-sharded systems or access to query the config database on sharded systems. The permissions needed to determine the partitions may not be available in a sharded or hosted MongoDB setup.

Adding a command that returns the min and max query bounds for splitting a collection into multiple parts would allow any external framework to query each partition separately and process the partitions in parallel.
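
To illustrate what a consumer could do with such bounds, here is a minimal sketch in Python. It is not an existing server command or connector API: the helper names `bounds_from_split_points` and `range_filter` are hypothetical, and the sketch assumes the proposed command would hand back a list of split keys on a single field (here `_id`). It only shows how those keys could be turned into half-open [min, max) range filters, one per partition.

```python
from typing import Any, Dict, List, Optional, Tuple

# A bound is a (min, max) pair; None means "unbounded" on that side.
Bound = Tuple[Optional[Any], Optional[Any]]


def bounds_from_split_points(split_points: List[Any]) -> List[Bound]:
    """Turn N split points into N + 1 contiguous [min, max) ranges.

    Hypothetical helper: assumes the split points are mutually comparable
    values of the partitioning field.
    """
    edges = [None] + sorted(split_points) + [None]
    return list(zip(edges[:-1], edges[1:]))


def range_filter(field: str, bound: Bound) -> Dict[str, Any]:
    """Build a query filter matching documents with min <= field < max."""
    lo, hi = bound
    cond: Dict[str, Any] = {}
    if lo is not None:
        cond["$gte"] = lo
    if hi is not None:
        cond["$lt"] = hi
    # An unbounded range on both sides matches the whole collection.
    return {field: cond} if cond else {}


if __name__ == "__main__":
    # Each filter could be handed to a separate worker to query in parallel.
    for bound in bounds_from_split_points([100, 200]):
        print(range_filter("_id", bound))
```

Because the ranges are half-open and contiguous, every document falls into exactly one partition, so workers can run the filters independently without coordinating.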

is duplicated by

SERVER-25289 Make it possible to select a subset of documents based on the shard key (Closed)

is related to

SERVER-28667 Provide a way for the Aggregation framework to query against intervals of a hashed index (Backlog)

SERVER-33998 Remove the parallelCollectionScan command (Closed)

Assignee: [DO NOT USE] Backlog - Query Execution
Reporter: Ross Lawley
Participants: [DO NOT USE] Backlog - Query Execution, Andrew Doumaux, Charlie Swanson, Geert Bosch, Ross Lawley
Votes: 2
Watchers: 24

Created: May 24 2016 04:23:31 PM UTC
Updated: Dec 06 2022 04:24:59 AM UTC
