[SERVER-44477] Map reduce with mode "merge" to an existing sharded collection may drop and recreate the target if no docs exist on the primary shard Created: 07/Nov/19  Updated: 29/Oct/23  Resolved: 08/Jan/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.6.15, 4.0.13, 4.2.1
Fix Version/s: 3.6.17, 4.2.3, 4.0.15

Type: Bug Priority: Major - P3
Reporter: Nicholas Zolnierz Assignee: Nicholas Zolnierz
Resolution: Fixed Votes: 0
Labels: qopt-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-42511 Remove query knob internalQueryUseAgg... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6
Sprint: Query 2019-12-02, Query 2019-12-16, Query 2019-12-30, Query 2020-01-13
Participants:

Comments
Comment by Githook User [ 14/Jan/20 ]

Author:

{'name': 'Nicholas Zolnierz', 'email': 'nicholas.zolnierz@mongodb.com', 'username': 'nzolnierzmdb'}

Message: SERVER-44477 Use correct collection count in cluster MR when determining whether to drop and reshard target
Branch: v3.6
https://github.com/mongodb/mongo/commit/4f329c0b056c75d67567577773039da8f3114cf1

Comment by Githook User [ 13/Jan/20 ]

Author:

{'name': 'Nicholas Zolnierz', 'email': 'nicholas.zolnierz@mongodb.com', 'username': 'nzolnierzmdb'}

Message: SERVER-44477 Use correct collection count in cluster MR when determining whether to drop and reshard target
Branch: v4.0
https://github.com/mongodb/mongo/commit/0460c5964375a50df62811e36a40fc5abf8c7a5a

Comment by Githook User [ 08/Jan/20 ]

Author:

{'name': 'Nicholas Zolnierz', 'email': 'nicholas.zolnierz@mongodb.com', 'username': 'nzolnierzmdb'}

Message: SERVER-44477 Use correct collection count in cluster MR when determining whether to drop and reshard target
Branch: v4.2
https://github.com/mongodb/mongo/commit/da7de3e73ea35a7c56606ef53cd2069658d02f08

Comment by Esha Maharishi (Inactive) [ 08/Nov/19 ]

Context on how this bug came to be:

If the final output collection is empty on a shard, then at the end of the second phase, that shard will apply this optimization to drop the output collection and rename the temp collection into place. This would be a problem if the output collection is supposed to be sharded, because then the output collection on each shard would end up having a different UUID (each temp collection's UUID would be preserved across the rename).
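As a rough Python-flavored sketch of that per-shard decision (the names finalize_map_reduce_on_shard, rename_preserving_uuid, and merge_into are illustrative stand-ins for this explanation, not the server's actual C++ code):

{code:python}
# Hedged sketch of the per-shard post-processing step described above.
# "shard" and its methods are hypothetical helpers, not mongos/mongod APIs.

def finalize_map_reduce_on_shard(shard, temp_ns, output_ns):
    if shard.count(output_ns) == 0:
        # Optimization: the output collection is empty on this shard, so drop
        # it and rename the temp collection into place. The rename keeps the
        # temp collection's UUID, so when the output is sharded each shard can
        # end up with a different UUID for output_ns.
        shard.drop(output_ns)
        shard.rename_preserving_uuid(temp_ns, output_ns)
    else:
        # Otherwise fold the temp collection's results into the existing
        # output collection (merge/reduce), which keeps its original UUID.
        shard.merge_into(temp_ns, output_ns)
        shard.drop(temp_ns)
{code}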

To work around this, I made the router check up front whether the output collection is sharded and empty and, if so, send the UUID to use for the temp collection. This way, the output collection would end up having the same UUID on each shard. (It was known and accepted that this workaround would not work if writes to the final output collection were happening concurrently with the mapReduce.)

When I implemented this, I didn't consider that the output collection may be empty on some shards but not others, and I didn't catch it because I accidentally made the router run the count only against the primary shard instead of against all shards.
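A minimal sketch of that counting mistake, using hypothetical shards.primary()/shards.owning() helpers rather than the real mongos internals; the "fixed" variant mirrors the intent of the commits linked above ("Use correct collection count in cluster MR when determining whether to drop and reshard target"):

{code:python}
# Illustrative only: these helpers are not mongos code.

def target_looks_empty_pre_fix(shards, output_ns):
    # Bug: only the primary shard was consulted, so a target that was empty
    # on the primary but populated on other shards was treated as empty.
    return shards.primary().count(output_ns) == 0

def target_looks_empty_post_fix(shards, output_ns):
    # Fix: sum the counts from every shard that owns data for the output
    # namespace before deciding to drop and reshard it.
    return sum(s.count(output_ns) for s in shards.owning(output_ns)) == 0
{code}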

As a result, there are two bugs:

1. (Tracked by this ticket) If the output collection is empty on the primary shard but has data on other shards, the data on those other shards will be lost: the router drops and recreates the output collection in order to generate a new config.collections entry with a new UUID to send for the temp collections. (The recreate uses a special option to shardCollection so that the config server generates the UUID recorded in config.collections and ignores the one generated by the primary shard.) A rough reproduction sketch follows after this list.

2. (Tracked by SERVER-44527) If the output collection has data on the primary shard but is empty on some other shard, the router will not send a UUID, so the empty shard will apply the optimization and end up with an output collection whose UUID does not match the UUID on the config server or primary shard.
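For bug 1, a reproduction sketch against a two-shard cluster might look like the following. The mongos address, the "test" database, and the shard names are assumptions, not taken from the ticket; it also assumes shard01 is the primary shard for "test" (swap the names otherwise).

{code:python}
from bson import Code
from pymongo import MongoClient

# Assumptions (not from the ticket): a mongos at localhost:27017 in front of
# two shards, "shard01" (primary for the "test" database) and "shard02".
client = MongoClient("mongodb://localhost:27017")
admin = client.admin
db = client.test

admin.command("enableSharding", "test")
admin.command("shardCollection", "test.target", key={"_id": 1})

# Give the sharded target one document and move its chunk to shard02, so the
# target is empty on the primary shard but has data elsewhere.
db.target.insert_one({"_id": 1, "value": 42})
admin.command("moveChunk", "test.target", find={"_id": 1}, to="shard02")

# Run mapReduce with out: {merge: ..., sharded: true} into that target. On
# affected versions, mongos counts only the primary shard, concludes the
# target is empty, and drops/reshards it, losing the document on shard02.
db.source.insert_one({"key": 1})
db.command(
    "mapReduce",
    "source",
    map=Code("function() { emit(this.key, 1); }"),
    reduce=Code("function(k, vals) { return Array.sum(vals); }"),
    out={"merge": "target", "sharded": True},
)

print(db.target.count_documents({"_id": 1}))  # expect 1; affected versions print 0
{code}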

Comment by Nicholas Zolnierz [ 08/Nov/19 ]

CC esha.maharishi

Comment by Nicholas Zolnierz [ 07/Nov/19 ]

Filing this for posterity; it will be fixed by the new implementation.
