[SERVER-50454] Avoiding sending the "keyValue" field to drivers on duplicate key error Created: 21/Aug/20  Updated: 29/Oct/23  Resolved: 20/May/21

Status: Closed
Project: Core Server
Component/s: Querying, Write Ops
Affects Version/s: None
Fix Version/s: 5.1.0-rc0, 4.4.18, 5.0.13

Type: Improvement
Priority: Major - P3
Reporter: David Storch
Assignee: Denis Grebennicov
Resolution: Fixed
Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-62166 Duplicate key violation causes 'Unabl... Closed
Related
related to SERVER-60298 Explain can include ICU collation key... Closed
related to SERVER-57052 Remove `duplicate_key_error` multiver... Closed
is related to SERVER-46810 Broken E11000 duplicate key error whe... Closed
is related to SERVER-37124 Retry full upsert path when duplicate... Closed
Backwards Compatibility: Minor Change
Backport Requested:
v5.0, v4.4, v4.2
Sprint: Query 2020-11-30, Query 2020-12-14, Query 2020-12-28, Query 2021-01-11, Query 2021-01-25, Query Execution 2021-04-05, Query Execution 2021-05-03, Query Execution 2021-05-17, Query Execution 2021-05-31
Participants:
Case:

 Description   

When the server raises a duplicate key error, it attaches the value of the duplicate key to the exception internally. When this exception is subsequently propagated across the wire, either to another node in the cluster or to the driver, the duplicate key is serialized in a field called keyValue.

The value of the duplicate key is consumed for the purposes of the duplicate key error retry logic implemented in the server in SERVER-37124. However, because the value of the key is implemented using the server's generic ErrorExtraInfo, it also ends up being unnecessarily serialized in the error response sent to the driver. This was considered harmless when first implemented, since the driver was expected to simply ignore the keyValue field.

However, sending the keyValue field across the wire to drivers is not completely benign. If the index has a non-simple collation, then the key may contain ICU collation keys, which are binary strings and may not be valid UTF-8. Consider the following example in the shell:

MongoDB Enterprise > db.c.drop()
true
MongoDB Enterprise > db.createCollection("c", {collation: {locale: "en", strength: 2}})
{ "ok" : 1 }
MongoDB Enterprise > db.runCommand({insert: "c", documents: [{_id: "sample/1"}]})
{ "n" : 1, "ok" : 1 }
MongoDB Enterprise > db.runCommand({insert: "c", documents: [{_id: "sample/1"}]})
{
	"n" : 0,
	"writeErrors" : [
		{
			"index" : 0,
			"code" : 11000,
			"keyPattern" : {
				"_id" : 1
			},
			"keyValue" : {
				"_id" : "M)AG?1\n�\u0014\u0001\f"
			},
			"errmsg" : "E11000 duplicate key error collection: test.c index: _id_ collation: { locale: \"en\", caseLevel: false, caseFirst: \"off\", strength: 2, numericOrdering: false, alternate: \"non-ignorable\", maxVariable: \"punct\", normalization: false, backwards: false, version: \"57.1\" } dup key: { _id: \"0x4d2941473f310a8614010c\" }"
		}
	],
	"ok" : 1
}

Note that here, the keyValue's _id contains illegal UTF-8, which the shell has handled by substituting the Unicode replacement character (U+FFFD). Some drivers, however, may throw an exception when the response contains invalid UTF-8 rather than substituting the replacement character. This could cause the operation to fail in an unexpected way from the application's perspective: the application would see a failure to decode BSON due to illegal UTF-8 instead of a normal duplicate key error.
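The difference between the two decoding behaviors can be reproduced outside the server. The following is a minimal Python sketch, not driver code: it assumes the hex string shown in the errmsg above is the raw byte content of the key, and the names RAW_KEY, decode_strict and decode_lenient are hypothetical helpers introduced only for this illustration.

```python
# Bytes of the duplicate key, copied from the hex string in the errmsg above.
# Assumption: that hex string is the raw content of keyValue._id.
RAW_KEY = bytes.fromhex("4d2941473f310a8614010c")


def decode_strict(data: bytes) -> str:
    """Decode as UTF-8 and raise UnicodeDecodeError on any invalid byte."""
    return data.decode("utf-8")


def decode_lenient(data: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for each invalid byte."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Lenient decoding mirrors what the shell displayed.
    print(repr(decode_lenient(RAW_KEY)))

    # Strict decoding fails on the stray 0x86 byte, which is how a
    # strictly validating decoder could turn a duplicate key error
    # into a decoding error.
    try:
        decode_strict(RAW_KEY)
    except UnicodeDecodeError as exc:
        print(f"strict decode failed at byte offset {exc.start}: {exc.reason}")
```

The lenient result matches the string the shell printed for keyValue, while the strict path raises before the application ever sees error code 11000.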



 Comments   
Comment by Githook User [ 30/Sep/22 ]

Author: Denis Grebennicov <denis.grebennicov@mongodb.com> (denis631)

Message: SERVER-50454 Avoiding sending the "keyValue" field to drivers on duplicate key error
Branch: v4.4
https://github.com/mongodb/mongo/commit/1f59fbd2131b00d5a3dd403306d52da6c6ac5805

Comment by Githook User [ 16/Sep/22 ]

Author: Denis Grebennicov <denis.grebennicov@mongodb.com> (denis631)

Message: SERVER-50454 Avoiding sending the "keyValue" field to drivers on duplicate key error
Branch: v5.0
https://github.com/mongodb/mongo/commit/3ed1e1368861952f6215dad0ee71a4ce64f0aebc

Comment by Githook User [ 20/May/21 ]

Author: Denis Grebennicov <denis.grebennicov@mongodb.com> (denis631)

Message: SERVER-50454 Avoiding sending the "keyValue" field to drivers on duplicate key error
Branch: master
https://github.com/mongodb/mongo/commit/6a731da34ade9ba693879c5d586cfe9d9117250d
