[SERVER-14854] Updates that modify the shard key might not produce errors Created: 11/Aug/14  Updated: 22/May/19  Resolved: 22/May/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Alexander Komyagin Assignee: Janna Golden
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Sprint: Sharding 2019-06-03
Participants:

 Description   

Because of the way we route updates, an update can be routed to a shard that has no documents matching the 'find' portion of the update. In this case the update operation becomes a successful no-op:

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
	"_id" : 1,
	"version" : 4,
	"minCompatibleVersion" : 4,
	"currentVersion" : 5,
	"clusterId" : ObjectId("53e904dbe1c9b456f8737f94")
}
  shards:
	{  "_id" : "shard0000",  "host" : "AD-MAC10G.local:27031" }
	{  "_id" : "shard0001",  "host" : "AD-MAC10G.local:27032" }
  databases:
	{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
	{  "_id" : "my_test_db",  "partitioned" : true,  "primary" : "shard0000" }
		my_test_db.my_test_coll
			shard key: { "a" : 1, "b" : 1 }
			chunks:
				shard0001	1
				shard0000	2
			{ "a" : { "$minKey" : 1 }, "b" : { "$minKey" : 1 } } -->> { "a" : 1, "b" : 1000 } on : shard0001 Timestamp(2, 0)
			{ "a" : 1, "b" : 1000 } -->> { "a" : 1, "b" : 2000 } on : shard0000 Timestamp(2, 2)
			{ "a" : 1, "b" : 2000 } -->> { "a" : { "$maxKey" : 1 }, "b" : { "$maxKey" : 1 } } on : shard0000 Timestamp(2, 3)
	{  "_id" : "test",  "partitioned" : false,  "primary" : "shard0001" }
 
mongos> db.getSisterDB(mydb).getCollection(coll).find()
{ "_id" : ObjectId("53e90503c100dda4430d40fb"), "a" : 1, "b" : 1, "c" : 1, "d" : 1 }
mongos> db.getSisterDB(mydb).getCollection(coll).update({a:1}, {a:2, b:1, c:2, d:2})
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })

This behavior is highly misleading: shard keys are supposed to be immutable, so any application that generates such 'bad' updates is likely to have a serious bug in it and should receive an error.



 Comments   
Comment by Janna Golden [ 22/May/19 ]

As of PM-1163, shard key fields are no longer immutable so this is a non-issue.

Comment by Scott Hernandez (Inactive) [ 11/Aug/14 ]

After talking to Greg I better understand the current targeting behavior here.

Since we have to support "save" semantics, where the whole document is replaced (and must contain the full shard key), we use the replacement document, with its full shard key, for targeting instead of the query (which may contain only the _id, for example). As a result, the update does not hit the shard that holds the current document.
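The targeting mismatch can be illustrated with a small standalone sketch in plain JavaScript. It is not mongos code: the chunk table is copied from the sh.status() output in the description, MinKey/MaxKey are modelled as -Infinity/Infinity as a simplification, and `compareKeys` and `targetShard` are hypothetical helpers introduced only for this illustration.

```javascript
// Chunk ranges copied from the sh.status() output in the description.
// MinKey/MaxKey are modelled as -Infinity/Infinity (a simplification).
const chunks = [
  { min: [-Infinity, -Infinity], max: [1, 1000], shard: "shard0001" },
  { min: [1, 1000], max: [1, 2000], shard: "shard0000" },
  { min: [1, 2000], max: [Infinity, Infinity], shard: "shard0000" },
];

// Lexicographic comparison of two [a, b] shard key tuples.
function compareKeys(x, y) {
  for (let i = 0; i < x.length; i++) {
    if (x[i] < y[i]) return -1;
    if (x[i] > y[i]) return 1;
  }
  return 0;
}

// Hypothetical helper: return the shard owning the chunk with min <= key < max.
function targetShard(doc) {
  const key = [doc.a, doc.b];
  const chunk = chunks.find(
    (c) => compareKeys(c.min, key) <= 0 && compareKeys(key, c.max) < 0
  );
  return chunk ? chunk.shard : null;
}

const existingDoc = { a: 1, b: 1, c: 1, d: 1 };    // document currently stored
const replacementDoc = { a: 2, b: 1, c: 2, d: 2 }; // replacement from the update

console.log("document lives on:", targetShard(existingDoc));
console.log("update is routed to:", targetShard(replacementDoc));
```

Under these assumptions the stored document and the replacement fall into chunks on different shards, which is consistent with the nMatched: 0 result shown in the description.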

One solution discussed would be to target based on both the shard key and the replacement doc to see if they result in different shards. A difference would indicate a bad user update, which will probably result in something the user wouldn't expect and should be stopped with an error at the mongos instance.
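A rough sketch of such a check follows, in plain JavaScript rather than actual mongos code. It assumes that "the shard key" here means the shard key values given in the query, that those values are plain equality matches (no operators such as $gt), and that MinKey/MaxKey can be modelled as -Infinity/Infinity. All function names are hypothetical. When the query does not contain the full shard key (for example, only _id), this sketch cannot compare targets and lets the update through.

```javascript
const SHARD_KEY_FIELDS = ["a", "b"];

// Chunk ranges copied from the sh.status() output in the description.
const chunks = [
  { min: [-Infinity, -Infinity], max: [1, 1000], shard: "shard0001" },
  { min: [1, 1000], max: [1, 2000], shard: "shard0000" },
  { min: [1, 2000], max: [Infinity, Infinity], shard: "shard0000" },
];

// Lexicographic comparison of two shard key tuples.
function compareKeys(x, y) {
  for (let i = 0; i < x.length; i++) {
    if (x[i] < y[i]) return -1;
    if (x[i] > y[i]) return 1;
  }
  return 0;
}

// Shard owning the chunk with min <= key < max, or null if none matches.
function shardFor(key) {
  const chunk = chunks.find(
    (c) => compareKeys(c.min, key) <= 0 && compareKeys(key, c.max) < 0
  );
  return chunk ? chunk.shard : null;
}

// Full shard key tuple of a document or equality query, or null if incomplete.
function extractShardKey(doc) {
  const key = SHARD_KEY_FIELDS.map((field) => doc[field]);
  return key.every((value) => value !== undefined) ? key : null;
}

// Hypothetical check: reject a replacement that targets a different shard
// than the query does.
function checkReplacementUpdate(query, replacement) {
  const replacementKey = extractShardKey(replacement);
  if (replacementKey === null) {
    return { ok: false, reason: "replacement lacks the full shard key" };
  }
  const queryKey = extractShardKey(query);
  if (queryKey === null) {
    return { ok: true, reason: "query lacks the full shard key; not compared" };
  }
  const from = shardFor(queryKey);
  const to = shardFor(replacementKey);
  return from === to
    ? { ok: true, reason: "query and replacement target " + to }
    : { ok: false, reason: "query targets " + from + ", replacement targets " + to };
}

console.log(checkReplacementUpdate({ a: 1, b: 1 }, { a: 2, b: 1, c: 2, d: 2 }));
```

Note that the update from the description, whose query is just {a:1}, would pass this sketch unchecked because the query does not pin down a single shard key value. Catching that case would need a different comparison, such as checking the shard key fields of the query against those of the replacement directly.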

Comment by Scott Hernandez (Inactive) [ 11/Aug/14 ]

How is this a bug at all since it is consistent with updates on a single instance:

> db.x.find()
{_id:2, a:2}
> db.x.update({_id:1}, {_id:2}) // will error if upsert:true, or with an existing document with _id:1
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })

Sounds like you want static analysis of the update statement independent of the data in the collection... which would be a new feature.

Generated at Thu Feb 08 03:36:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.