  1. Core Server
  2. SERVER-50454

Avoiding sending the "keyValue" field to drivers on duplicate key error

    • Minor Change
    • v5.0, v4.4, v4.2
    • Query 2020-11-30, Query 2020-12-14, Query 2020-12-28, Query 2021-01-11, Query 2021-01-25, Query Execution 2021-04-05, Query Execution 2021-05-03, Query Execution 2021-05-17, Query Execution 2021-05-31

      When the server raises a duplicate key error, it attaches the value of the duplicate key to the exception internally. When this exception is subsequently propagated across the wire, either to another node in the cluster or to the driver, the duplicate key is serialized in a field called keyValue.

      The value of the duplicate key is consumed by the duplicate key error retry logic implemented in the server in SERVER-37124. However, because the key value is attached to the exception through the server's generic ErrorExtraInfo mechanism, it also ends up being serialized, unnecessarily, in the error response sent to the driver. This was considered harmless when first implemented, since the driver was expected to simply ignore the keyValue field.

      However, sending the keyValue field across the wire to drivers is not completely benign. If the index has a non-simple collation, then the key may contain ICU collation keys, which are not necessarily valid UTF-8. Consider the following example in the shell:

      MongoDB Enterprise > db.c.drop()
      true
      MongoDB Enterprise > db.createCollection("c", {collation: {locale: "en", strength: 2}})
      { "ok" : 1 }
      MongoDB Enterprise > db.runCommand({insert: "c", documents: [{_id: "sample/1"}]})
      { "n" : 1, "ok" : 1 }
      MongoDB Enterprise > db.runCommand({insert: "c", documents: [{_id: "sample/1"}]})
      {
      	"n" : 0,
      	"writeErrors" : [
      		{
      			"index" : 0,
      			"code" : 11000,
      			"keyPattern" : {
      				"_id" : 1
      			},
      			"keyValue" : {
      				"_id" : "M)AG?1\n�\u0014\u0001\f"
      			},
      			"errmsg" : "E11000 duplicate key error collection: test.c index: _id_ collation: { locale: \"en\", caseLevel: false, caseFirst: \"off\", strength: 2, numericOrdering: false, alternate: \"non-ignorable\", maxVariable: \"punct\", normalization: false, backwards: false, version: \"57.1\" } dup key: { _id: \"0x4d2941473f310a8614010c\" }"
      		}
      	],
      	"ok" : 1
      }
      

      Note that the keyValue's _id here contains invalid UTF-8, which the shell has handled by substituting the Unicode replacement character. Some drivers, however, may throw an exception when a response contains invalid UTF-8 rather than substituting the replacement character. From the application's perspective the operation would then fail in an unexpected way: it would see a BSON decoding failure caused by invalid UTF-8 instead of a normal duplicate key error.
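
The difference between the two decoding behaviors can be reproduced outside the server. The following is a minimal sketch, assuming a Node.js runtime (for `Buffer` and the global `TextDecoder`); it is not the decoding path of any particular driver. The bytes are copied from the hex string in the errmsg above, and the names `keyBytes`, `decodeLenient`, and `tryDecodeStrict` are hypothetical, introduced only for illustration.

```javascript
// Raw bytes of the collation key, taken from the "dup key" hex string
// in the errmsg of the shell example (0x4d2941473f310a8614010c).
const keyBytes = Buffer.from("4d2941473f310a8614010c", "hex");

// Lenient decoding: invalid byte sequences are replaced with U+FFFD,
// which matches what the shell output above displays.
function decodeLenient(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// Strict decoding: with { fatal: true } the decoder throws on invalid
// UTF-8 instead of substituting the replacement character.
function tryDecodeStrict(bytes) {
  try {
    const value = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return { ok: true, value };
  } catch (err) {
    return { ok: false, error: err };
  }
}

// The byte 0x86 is a UTF-8 continuation byte with no preceding lead byte,
// so the strict decoder rejects the key while the lenient one replaces it.
console.log(JSON.stringify(decodeLenient(keyBytes)));
console.log(tryDecodeStrict(keyBytes).ok);
```

A client that decodes strictly hits the error branch before it can inspect the error code 11000, which is the unexpected failure mode described above.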

            Assignee: Denis Grebennicov (denis.grebennicov@mongodb.com)
            Reporter: David Storch (david.storch@mongodb.com)
            Votes: 0
            Watchers: 7
