[SERVER-68094] Resharding with custom generated _id fails with projection error Created: 15/Jul/22  Updated: 29/Oct/23  Resolved: 13/Sep/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 5.0.13, 6.0.2, 6.1.0-rc2, 6.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Rachita Dhawan Assignee: Nandini Bhartiya
Resolution: Fixed Votes: 0
Labels: sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File image-2022-07-15-12-43-20-321.png     File reshard_coll.js    
Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Backport Requested:
v6.1, v6.0, v5.0
Sprint: Sharding 2022-08-08, Sharding 2022-08-22, Sharding 2022-09-05, Sharding 2022-09-19
Participants:
Story Points: 3

 Description   

Problem: Resharding a collection on fields of a client-generated _id (instead of a regular ObjectId) fails with a projection error.

Details:

The error seems to come from how we build the pipeline (specifically the $project stage) in ReshardingSplitPolicy::createRawPipeline. We hit the code path below and append ("_id", 0) to the $project specification. It seems the shardKey.hasId() check doesn't work as expected. (I removed this check and resharding worked.)

// Do not project _id if it's not part of the shard key.
if (!shardKey.hasId()) {
    projectValBuilder.append("_id", 0);
}

The pipeline that is created:

[{"$sample":{"size":0}},{"$project":{"_id._id1":{"$ifNull":["$_id._id1",null]},"_id._id2":{"$ifNull":["$_id._id2",null]},"_id._id3":{"$ifNull":["$_id._id3",null]},"_id":0}},{"$sort":{"_id._id1":1,"_id._id2":1,"_id._id3":1}}] 
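To illustrate why this $project specification is rejected, here is a minimal sketch in plain JavaScript. It is a simplified model of the rule behind "Path collision at _id" (a projection may not name both a path and a dotted sub-path of it), not the server's actual validation code. The function name findPathCollision is hypothetical.

```javascript
// Simplified model of projection path-collision detection.
// This is an illustration only, NOT the server's implementation.
// findPathCollision is a hypothetical helper name.
function findPathCollision(projectSpec) {
  const paths = Object.keys(projectSpec);
  for (const a of paths) {
    for (const b of paths) {
      // A collision exists when `a` is a strict dotted prefix of `b`,
      // e.g. "_id" and "_id._id1".
      if (a !== b && b.startsWith(a + ".")) {
        return a;
      }
    }
  }
  return null;
}

// The $project specification from the failing pipeline above.
const failingProject = {
  "_id._id1": { $ifNull: ["$_id._id1", null] },
  "_id._id2": { $ifNull: ["$_id._id2", null] },
  "_id._id3": { $ifNull: ["$_id._id3", null] },
  "_id": 0,
};

console.log(findPathCollision(failingProject)); // prints "_id"
```

Under this model, dropping the "_id": 0 entry removes the collision, which matches the observation that resharding worked once the check was removed.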

 Steps to reproduce:

1. The customer created a collection and sharded it on:

{"_id.id1":1,"_id.id2":1} 
db.createCollection("colltest4");
sh.shardCollection("dbtest4.colltest4",{"_id.id1":1,"_id.id2":1})

2. They then inserted a few documents as follows:

mongos> db.colltest4.insert({_id:{_id1:1,_id2:1,_id3:1},a:1,b:1})
WriteResult({ "nInserted" : 1 })
mongos> db.colltest4.insert({_id:{_id1:2,_id2:2,_id3:1},a:1,b:1})
WriteResult({ "nInserted" : 1 })
mongos> db.colltest4.insert({_id:{_id1:2,_id2:2,_id3:2},a:1,b:1})
WriteResult({ "nInserted" : 1 })
mongos> db.colltest4.insert({_id:{_id1:2,_id2:2,_id3:2},a:1,b:1})

3. They issued the following reshardCollection command, which failed:

mongos> db.adminCommand({reshardCollection: "dbtest4.colltest4",key: {"_id.id1":1,"_id.id2":1,"_id.id3":1}})
 
{"ok" : 0,"errmsg" : "Invalid $project :: caused by :: Path collision at _id","code" : 31250,"codeName" : "Location31250","$clusterTime" : {"clusterTime" : Timestamp(1657698648, 13),"signature" : {"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),"keyId" : NumberLong(0)} 

Note: Attaching a JS test to reproduce: reshard_coll.js

 



 Comments   
Comment by Githook User [ 14/Sep/22 ]

Author:

{'name': 'nandinibhartiyaMDB', 'email': 'nandini.bhartiya@mongodb.com', 'username': 'nandinibhartiyaMDB'}

Message: SERVER-68094: Use $replaceRoot instead of $project

(cherry picked from commit 015dc2badcafc3238845b0eec3d6084fdff2545c)
Branch: v5.0
https://github.com/mongodb/mongo/commit/8cca6f685d9e430f7470ee5dae96d1c5fcb77036

Comment by Githook User [ 14/Sep/22 ]

Author:

{'name': 'nandinibhartiyaMDB', 'email': 'nandini.bhartiya@mongodb.com', 'username': 'nandinibhartiyaMDB'}

Message: SERVER-68094: Use $replaceRoot instead of $project

(cherry picked from commit 015dc2badcafc3238845b0eec3d6084fdff2545c)
Branch: v6.0
https://github.com/mongodb/mongo/commit/fadf5a97702162cda1642a5d6a4d47f0ece43994

Comment by Githook User [ 08/Sep/22 ]

Author:

{'name': 'nandinibhartiyaMDB', 'email': 'nandini.bhartiya@mongodb.com', 'username': 'nandinibhartiyaMDB'}

Message: SERVER-68094: Use $replaceRoot instead of $project

(cherry picked from commit 015dc2badcafc3238845b0eec3d6084fdff2545c)
Branch: v6.1
https://github.com/mongodb/mongo/commit/f83bf6c4bee428f183e8962c9a2a3fe58fe46f62

Comment by Githook User [ 06/Sep/22 ]

Author:

{'name': 'nandinibhartiyaMDB', 'email': 'nandini.bhartiya@mongodb.com', 'username': 'nandinibhartiyaMDB'}

Message: SERVER-68094: Use $replaceRoot instead of $project
Branch: master
https://github.com/mongodb/mongo/commit/015dc2badcafc3238845b0eec3d6084fdff2545c

Comment by Nandini Bhartiya [ 29/Aug/22 ]

PR: https://github.com/10gen/mongo/pull/7076

Comment by Max Hirschhorn [ 02/Aug/22 ]

The implementation plan is to switch the $project stage in ReshardingSplitPolicy::createRawPipeline() to $replaceRoot. Note that this will likely mean removing the call to dotted_path_support::extractElementsBasedOnTemplate() from ReshardingSplitPolicy::_appendSplitPointsFromSample() because the newRoot object will have already extracted the fields as dotted paths. The $arrayToObject aggregation expression will likely be useful for building up the newRoot object.
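As a rough illustration of this plan, the sketch below builds a $replaceRoot stage whose newRoot is assembled with $arrayToObject, one key/value pair per shard key field. It is plain JavaScript that only constructs the stage document. The helper name buildReplaceRootStage is hypothetical, the actual C++ change may differ, and it is an assumption here that the server accepts dotted key names in the $arrayToObject output.

```javascript
// Sketch only: builds the stage document, does not talk to a server.
// buildReplaceRootStage is a hypothetical helper, not the server's code.
function buildReplaceRootStage(shardKeyPattern) {
  // One {k, v} pair per shard key field. $literal keeps the key name from
  // being parsed as an expression; $ifNull fills missing fields with null,
  // mirroring the original $project stage.
  const pairs = Object.keys(shardKeyPattern).map((field) => ({
    k: { $literal: field },
    v: { $ifNull: ["$" + field, null] },
  }));
  // $arrayToObject takes a single argument, so the array literal of pairs
  // is wrapped in an outer array.
  return { $replaceRoot: { newRoot: { $arrayToObject: [pairs] } } };
}

const stage = buildReplaceRootStage({ "_id._id1": 1, "_id._id2": 1, "_id._id3": 1 });
console.log(JSON.stringify(stage, null, 2));
```

Because the new root is built explicitly, there is no separate "_id": 0 entry to collide with the "_id.*" paths.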

We should add C++ test cases to initial_split_policy_test.cpp which exercise dotted paths.

Generated at Thu Feb 08 06:09:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.