[SERVER-69864] ShardCollection allows nested patterns but updateOne does not Created: 21/Sep/22  Updated: 12/Dec/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Frederic Vitzikam Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: cs-subteam1, sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Cluster Scalability
Sprint: Sharding NYC 2023-01-23
Participants:
Story Points: 3

 Description   

Problem

In the following scenario, it seems impossible to update the shard key:

 

use test
sh.enableSharding("test")
sh.shardCollection( "test.coll",  { a: 1, "a.b": 1 })
db.coll.insert({a: { b: 1}})
 
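// Attempt 1: filter on the parent field "a" only.
db.coll.updateOne({a: { b: 1}}, {$set: {"a": { "b": 2}}}, { upsert: false })
// Fails: "Shard key update is not allowed without specifying the full shard key in the query"

// Attempt 2: filter on both shard key fields, "a" and "a.b".
db.coll.updateOne({a: { b: 1}, "a.b": 1}, {$set: {"a": { "b": 2}}}, { upsert: false })
// Fails with the same error.

// Attempt 3: filter on the nested field "a.b" only.
db.coll.updateOne({"a.b": 1}, {$set: {"a": { "b": 2}}}, { upsert: false })
// Fails with the same error.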

Solution

It is not clear to me whether shardCollection should reject such a pattern, or whether the logic updateOne uses to validate the filter mishandles the case where one shard key field contains the other.

Impact

This was found while designing REP-160 (support for unlike topologies in cluster-to-cluster replication).

Acceptance Criteria

Either shardCollection should reject nested patterns or updateOne should support them. There might be other APIs to fix too (e.g. find).



 Comments   
Comment by Brett Nawrocki [ 24/Jan/23 ]

The issue here isn't nested paths on their own, but rather a compound shard key with overlapping paths. Of the following categories, only the final one is a problem:

  • Nested: { "a.b": 1 }
  • Compound: { "a": 1, "x": 1 }
  • Nested and compound: { "a.b": 1, "x": 1 }
  • Overlapping nested and compound: { "a": 1, "a.b": 1 }

The above test does work if the shard key is merely nested:

mongos> use test
switched to db test
mongos> sh.enableSharding("test")
{
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1674597829, 4),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1674597829, 2)
}
mongos> sh.shardCollection( "test.coll",  { "a.b": 1 })
{
        "collectionsharded" : "test.coll",
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1674597836, 29),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1674597836, 29)
}
mongos> db.coll.insert({a: { b: 1}})
WriteResult({ "nInserted" : 1 })
mongos> db.coll.updateOne({a: { b: 1}}, {$set: {"a": { "b": 2}}}, { upsert: false })
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
mongos> db.coll.find({})
{ "_id" : ObjectId("63d055d1326f27392ea62380"), "a" : { "b" : 2 } }
mongos> 

It seems to me that we should either reject shard keys that have overlapping components like this, or maybe reduce them to their simplified forms.
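
As a rough illustration of the "reduce them to their simplified forms" option, here is a minimal sketch in plain JavaScript. The helper names `isPathPrefix` and `reduceShardKey` are hypothetical and not part of the server or the shell. The reduction rule (drop any field whose dotted ancestor is also in the key) is only one possible reading of "simplified form", and the sketch ignores hashed fields entirely.

```javascript
// Hypothetical helper, not a server or shell API.
// True when `prefix` is a strict dotted ancestor of `path`:
// "a" is an ancestor of "a.b", but not of "ab" and not of "a" itself.
function isPathPrefix(prefix, path) {
    return path.startsWith(prefix + ".");
}

// Hypothetical helper, not a server or shell API.
// ASSUMPTION: a field is redundant when one of its ancestors is also part
// of the shard key, so { a: 1, "a.b": 1 } reduces to { a: 1 }.
// The original field order of the remaining fields is preserved.
function reduceShardKey(keyPattern) {
    const fields = Object.keys(keyPattern);
    const reduced = {};
    for (const field of fields) {
        const coveredByAncestor = fields.some(other => isPathPrefix(other, field));
        if (!coveredByAncestor) {
            reduced[field] = keyPattern[field];
        }
    }
    return reduced;
}
```

Under this rule the three unproblematic categories (nested, compound, nested and compound) pass through unchanged, and only the overlapping pattern is rewritten. Whether silently rewriting a user-supplied key is acceptable behavior is a separate design question.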

Comment by Rachita Dhawan [ 07/Nov/22 ]

Re-estimate if the filter logic from update_stage.cpp isn't enough to filter for ShardCollection.

Re-triage if there are existing tests that might break due to changing shardCollection logic.

Comment by Ratika Gandhi [ 28/Oct/22 ]

We will reject nested patterns in shardCollection, so that users who are sharding a collection do not run into this problem.
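
A minimal sketch of what such a rejection check could look like, written in plain JavaScript for illustration only. The function name `assertNoOverlappingShardKeyPaths` and the error text are hypothetical; this is not the server's validation code. It rejects only patterns where one shard key field is a dotted ancestor of another, which is the case that fails in the reproduction above.

```javascript
// Hypothetical client-side check, not the server's shardCollection logic.
// Throws when one shard key field is a dotted ancestor of another,
// e.g. "a" and "a.b". Fields such as "a" and "ab" do not overlap.
function assertNoOverlappingShardKeyPaths(keyPattern) {
    const fields = Object.keys(keyPattern);
    for (const ancestor of fields) {
        for (const descendant of fields) {
            if (descendant.startsWith(ancestor + ".")) {
                throw new Error(
                    `shard key fields "${ancestor}" and "${descendant}" overlap`);
            }
        }
    }
}

// Possible use in a shell session before sharding (commented out because
// it needs a running cluster):
// const key = { a: 1, "a.b": 1 };
// assertNoOverlappingShardKeyPaths(key);   // throws for this pattern
// sh.shardCollection("test.coll", key);
```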
