[SERVER-27239] Remove the usage of distributed collection lock during map/reduce Created: 30/Nov/16  Updated: 06/Dec/22  Resolved: 24/Jul/21

Status: Closed
Project: Core Server
Component/s: MapReduce, Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Done Votes: 0
Labels: remove-distributed-lock-fallout
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Participants:

 Description   

Map/reduce with a sharded output collection performs an optimization where the final reduce step is omitted and the previous reduce step is done in parallel on all shards. This is done by creating an empty sharded collection and spreading the chunks so they are co-located with the data to be reduced based on the shard key, writing all output locally to an empty temporary collection and then renaming the temporary collection to the name of the output collection.

This process only works if no chunks of the output collection move around while the output is being written and is protected through the usage of the collection distributed lock.

This task is to get rid of this reliance on the collection distributed lock.



 Comments   
Comment by Kaloian Manassiev [ 24/Jul/21 ]

asya, this is correct, but not because it's aggregation, but because it never creates a new sharded collection. Closing as Gone Away.

Comment by Asya Kamsky [ 23/Jul/21 ]

kaloian.manassiev this can probably be closed since MR is now aggregation?

Comment by Randolph Tan [ 11/Dec/18 ]

I also realized that the dist lock also serves another purpose besides making sure that chunks don't move around. It serializes concurrent mapReduce outputting to the same sharded collection. Because of how merge uses a temp collection to perform the reduce and drop the original collection, mapReduce outputting to the same collection will step on each other's toes if they are allowed to run at the same time.

Generated at Thu Feb 08 04:14:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.