[SERVER-33538] mapReduce "replace" on a sharded output collection can lead to UUIDCatalog inconsistencies Created: 28/Feb/18  Updated: 29/Oct/23  Resolved: 22/May/18

Status: Closed
Project: Core Server
Component/s: MapReduce, Sharding
Affects Version/s: 3.6.4
Fix Version/s: 3.6.6, 4.0.0-rc1, 4.1.1

Type: Bug Priority: Major - P3
Reporter: Maria van Keulen Assignee: Janna Golden
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Documented
is documented by DOCS-11731 Docs for SERVER-33538: mapReduce "rep... Closed
Duplicate
is duplicated by SERVER-33599 Out collection not in UUID catalog wh... Closed
is duplicated by SERVER-35425 After a map reduce an exception Names... Closed
Related
related to SERVER-34539 Re-enable sharded mapReduce concurren... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Requested:
v4.0
Sprint: Storage NYC 2018-03-26, Sharding 2018-05-21, Sharding 2018-06-04
Participants:
Linked BF Score: 60

 Description   

mapReduce with a sharded output collection assigns the UUID obtained from the config server to the final output collection. mapReduce "replace" will drop the existing output collection, which has the same UUID as the new output collection. Two-phase-drop may cause the dropCollection to finish after the renameCollection finishes, erroneously removing the UUIDCatalog entry for the output collection.
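
As a rough illustration (database, collection, and field names here are made up for the example, not taken from this ticket), the affected pattern is a mapReduce whose sharded output collection already exists and is replaced:

    // Hypothetical repro sketch: "mr_out" already exists and is sharded.
    // mapReduce "replace" drops the old collection (two-phase) and renames
    // the new result into place with the UUID obtained from the config
    // server, which is the race described above.
    db.source.mapReduce(
        function() { emit(this.key, this.value); },           // map
        function(key, values) { return Array.sum(values); },  // reduce
        { out: { replace: "mr_out", sharded: true } }
    );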



 Comments   
Comment by Githook User [ 20/Jun/18 ]

Author:

{'username': 'jannaerin', 'name': 'jannaerin', 'email': 'golden.janna@gmail.com'}

Message: SERVER-33538 Fix UUID inconsistencies in mapReduce on a sharded output collection
Branch: v3.6
https://github.com/mongodb/mongo/commit/a68b33ebd431c64315fa77f7c67914cb4b24a04a

Comment by Esha Maharishi (Inactive) [ 11/Jun/18 ]

rribeiro, I looked at the logs you attached on SERVER-35425 and responded there.

Comment by Rui Ribeiro [ 11/Jun/18 ]

esha.maharishi 

I understand your solution, but I am not using the option out: "replace"; I use the option out: "reduce".

In my situation I do:

    Pick up 3 collections (sharded)
                  ¦
    Map Reduce (out: reduce, sharded: true)
                  ¦
    Output Collection (MR 1)
                  ¦
    Map Reduce (out: reduce, sharded: true)
                  ¦
    Output Collection (MR 2)
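
A rough shell sketch of this pipeline (collection names are placeholders, not from this ticket):

    // Each stage folds its results into an existing sharded output
    // collection via out: { reduce: ..., sharded: true }, so the output
    // collection's existing UUID is reused rather than a new one created.
    var mapFn = function() { emit(this.key, this.value); };
    var reduceFn = function(key, values) { return Array.sum(values); };

    // MR 1: three sharded source collections into one sharded output.
    db.coll1.mapReduce(mapFn, reduceFn, { out: { reduce: "mr_out_1", sharded: true } });
    db.coll2.mapReduce(mapFn, reduceFn, { out: { reduce: "mr_out_1", sharded: true } });
    db.coll3.mapReduce(mapFn, reduceFn, { out: { reduce: "mr_out_1", sharded: true } });

    // MR 2: the intermediate output into a second sharded output collection.
    db.mr_out_1.mapReduce(mapFn, reduceFn, { out: { reduce: "mr_out_2", sharded: true } });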

 

Maybe my issue is a variation of this bug:

https://jira.mongodb.org/browse/SERVER-35425?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=1912268#comment-1912268

 

Comment by Esha Maharishi (Inactive) [ 11/Jun/18 ]

rribeiro, I am not sure (kelsey.schubert?), but if you are running into this problem, you could run a collection drop on the sharded output collection before each mapReduce.

This way, the mapReduce will create a new sharded output collection with a new UUID, rather than re-using the UUID from the existing sharded output collection, so you should not hit this bug.
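
For example (names below are placeholders), each run would become something like:

    // Hypothetical sketch of the suggested workaround: drop the sharded
    // output collection first so the following mapReduce creates it with a
    // new UUID instead of reusing the UUID of the existing collection.
    db.mr_out.drop();
    db.source.mapReduce(
        function() { emit(this.key, this.value); },
        function(key, values) { return Array.sum(values); },
        { out: { reduce: "mr_out", sharded: true } }
    );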

Comment by Rui Ribeiro [ 11/Jun/18 ]

esha.maharishi 

Thank you for the quick answer.

Do you have any idea when the next 3.6.x will be released?

Right now, I am facing this problem, and without this fix I can't use 3.6.4.

Cheers

 

Comment by Esha Maharishi (Inactive) [ 11/Jun/18 ]

rribeiro, the plan is to backport the fix to the next 3.6 dot release.

Comment by Rui Ribeiro [ 11/Jun/18 ]

Hi,

Is there a way to apply this fix to version 3.6.4, or will it only be released in version 4.0.0?

 

Thank you

 

Comment by Esha Maharishi (Inactive) [ 06/Jun/18 ]

janna.golden, could you please review this table? My comments on the CR for the backport are based on this understanding:

3.4 mongos:
  • Just sends whether the output coll is sharded

3.6.x mongos (pre-backport):
  • If output coll is sharded, sends 'finalOutputCollIsSharded'
  • If found UUID (cluster is in FCV 3.6), sends 'shardedOutputCollUUID'

3.6.y mongos (with backport):
  • If merge/reduce and unsharded collection exists, fails.
  • If merge/reduce and sharded collection exists and is empty, or if replace:
       - drops and re-shards the collection
       - sends 'finalOutputCollIsSharded'
       - sends 'shardedOutputCollUUID' with the new sharded collection's UUID

4.0 mongos:
  • If merge/reduce and unsharded collection exists, fails.
  • If merge/reduce and sharded collection exists and is empty, or if replace:
       - drops and re-shards the collection
       - sends 'shardedOutputCollUUID' with the new sharded collection's UUID

3.4 shard:
  • Doesn't expect any UUID

3.6.x shard:
  • If received 'finalOutputCollIsSharded' and in FCV 3.6, expects to have received 'shardedOutputCollUUID'
  • If received 'shardedOutputCollUUID', expects the UUID to match the local collection's UUID (if a local collection exists)
  • Uses the sent UUID for the tmp collection (does not matter if rename happens)

3.6.y shard:
  • If received 'finalOutputCollIsSharded' and is in FCV 3.6, expects to have received 'shardedOutputCollUUID'
  • Uses the sent UUID for the tmp collection (assuming rename will happen)

4.0 shard:
  • Uses the sent UUID for the tmp collection (assuming rename will happen)
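
For illustration only, roughly how these fields travel from mongos to the shards; the command name and surrounding shape are my assumption, only 'finalOutputCollIsSharded' and 'shardedOutputCollUUID' come from the table above:

    // Hypothetical sketch of the mongos -> shard "sharded finish" request.
    // Only the two field names are taken from the table above; everything
    // else here is assumed for illustration.
    var shardedFinishCmd = {
        "mapreduce.shardedfinish": { mapreduce: "source", out: "mr_out" },      // original command (abridged)
        "finalOutputCollIsSharded": true,                                       // sent when the output collection is sharded
        "shardedOutputCollUUID": UUID("0123456789abcdef0123456789abcdef")       // UUID the shard applies to the tmp/output collection
    };
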
Comment by Githook User [ 25/May/18 ]

Author:

{'username': 'jannaerin', 'name': 'jannaerin', 'email': 'golden.janna@gmail.com'}

Message: SERVER-33538 Fix UUID inconsistencies in mapReduce on a sharded output collection

(cherry picked from commit ff092947da81890ff92c427f50623d36d084e58c)
Branch: v4.0
https://github.com/mongodb/mongo/commit/29116784f5da27db0232dd91af056bfd646c109c

Comment by Githook User [ 23/May/18 ]

Author:

{'username': 'jannaerin', 'name': 'jannaerin', 'email': 'golden.janna@gmail.com'}

Message: SERVER-33538 Fix UUID inconsistencies in mapReduce on a sharded output collection
Branch: master
https://github.com/mongodb/mongo/commit/ff092947da81890ff92c427f50623d36d084e58c

Comment by Githook User [ 23/May/18 ]

Author:

{'username': 'jannaerin', 'name': 'jannaerin', 'email': 'golden.janna@gmail.com'}

Message: SERVER-33538 Fix UUID inconsistencies in mapReduce on a sharded output collection

(cherry picked from commit b69e6725325aaaae4fcca7563bf6428837ab7767)
Branch: v4.0
https://github.com/mongodb/mongo/commit/6851604955969132ccf8521c1a6cfac7ec7ae2f7

Comment by Janna Golden [ 23/May/18 ]

This was committed with the wrong server ticket number; the commit is above.

Comment by Janna Golden [ 23/May/18 ]

Author:

{'username': 'jannaerin', 'name': 'jannaerin', 'email': 'golden.janna@gmail.com'}

Message: SERVER-33639 Fix UUID inconsistencies in mapReduce on a sharded output collection
Branch: master
https://github.com/mongodb/mongo/commit/b69e6725325aaaae4fcca7563bf6428837ab7767

Comment by Esha Maharishi (Inactive) [ 08/May/18 ]

Thanks asya.

Also, I just discussed with Asya that dropping the output collection from the cluster before starting the second phase adds a "window" where queries on the output collection can see a mix of

  • the old collection's data
  • empty results
  • the new collection's data

(because each shard will return data from one of these categories). Without this change, seeing "empty results" from any shard was not possible.

We discussed that this is preferable to leaving the crash in the UUIDCatalog.

Comment by Asya Kamsky [ 08/May/18 ]

> unless mapReduce with sharded out can only have _id as the shard key?

Yes, we only allow it (and it only works correctly) when the output collection is sharded by _id.
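
So if the sharded output collection is created ahead of time (names below are placeholders), it has to be sharded on _id, e.g.:

    // The sharded mapReduce output collection must be keyed on _id.
    sh.enableSharding("mrdb");
    sh.shardCollection("mrdb.mr_out", { _id: 1 });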

 

Comment by Esha Maharishi (Inactive) [ 20/Apr/18 ]

Though, the shardCollection logic in dropAndShardCollection would need to avoid creating the collection on the primary shard, which may be problematic because it won't create the shard key index... unless mapReduce with sharded out can only have _id as the shard key?

Comment by Esha Maharishi (Inactive) [ 20/Apr/18 ]

One sharding solution could be:

  • add a "dropAndShardCollection" command on the config server. This command will hold distlocks across dropCollection and shardCollection logic, and return the new sharded collection's UUID. Mongos can send this new UUID in its sharded finish command, so that the asynchronous drop from dropAndShardCollection won't conflict with the rename from the sharded finish command.
  • for $out with replace, make the sharded finish command fail if the output collection already exists

We could take this one step further:

  • make mongos hold the distlock across "dropAndShardCollection" as well as the sharded finish phase, which fixes related UUID bug SERVER-31716.

As schwerin noted, doing either of these and backporting them does not guarantee that the crash will not occur in a mixed-version v3.6/v4.0 cluster (unless all the v3.6 nodes have been upgraded to the 3.6 dot release that has the fix).

Comment by Maria van Keulen [ 19/Apr/18 ]

I am assigning this to Sharding so they can investigate potential sharding-level fixes for this bug.

Comment by Maria van Keulen [ 28/Feb/18 ]

One fix for this problem is to use immediate (not two-phase) collection drops during the "replace" stage and restore the UUIDCatalog entry for the output collection at the end of the "replace" stage before releasing the lock.
