[SERVER-40477] mongocryptd should error when to-be-encrypted element's type does not match schema Created: 04/Apr/19  Updated: 29/Oct/23  Resolved: 21/May/19

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 4.1.12

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: David Storch
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-12732 Docs for SERVER-40477: mongocryptd sh... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2019-05-06, Query 2019-05-20, Query 2019-06-03
Participants:

 Description   

In order to ensure that the FLE system works transparently for equality predicates against encrypted fields, we must implement the following constraints:

  • Any field encrypted with the deterministic algorithm must specify exactly one BSON type. This was implemented under SERVER-40627.
  • Queries can only contain equality predicates against an encrypted field if the field is encrypted with the deterministic algorithm. This was implemented under SERVER-40378.
  • The BSON type of the constant for an equality predicate must match the BSON type specified in the JSON Schema. Implementing this final constraint is the work tracked by this ticket.

Taken together, these restrictions prevent a situation where users can issue a query against an encrypted field such as {ssn: {$eq: NumberInt(12345678)}} and expect matches where ssn can be any of the types {int, long, double, decimal}. One cannot build an application using FLE which queries mixed-type encrypted fields. Instead, when using deterministic encryption to ensure queryability, users must define a schema which names exactly one type for the encrypted field. Furthermore, they must write the query so that any constant in an equality predicate against the encrypted field has a matching type. For instance, if a user creates a schema specifying that ssn is a deterministically encrypted int, they may not run an equality query such as {ssn: {$eq: NumberLong(12345678)}}, since "long" is not the type specified in the schema.
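
The third constraint can be illustrated with a small sketch in plain JavaScript. This is not mongocryptd's implementation: the helper name checkEqualityConstant and the {bsonType, value} wrapper for constants are assumptions made for illustration, since plain JavaScript numbers do not carry a BSON type.

```javascript
// Hypothetical helper, not mongocryptd's actual code.
// `encryptSpec` mirrors the "encrypt" keyword of the JSON Schema, already
// reduced to a single type name. `constant` is modeled as {bsonType, value}.
function checkEqualityConstant(encryptSpec, constant) {
  const deterministic = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic";
  if (encryptSpec.algorithm !== deterministic) {
    throw new Error("equality predicates require the deterministic algorithm");
  }
  if (constant.bsonType !== encryptSpec.bsonType) {
    throw new Error(
      `constant has type "${constant.bsonType}" but schema requires "${encryptSpec.bsonType}"`
    );
  }
  return true;
}

const ssnSpec = {
  algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic",
  bsonType: "int",
};

// The constant's type matches the schema type, so the check passes.
checkEqualityConstant(ssnSpec, { bsonType: "int", value: 12345678 });

// A NumberLong constant against an "int" schema is rejected.
try {
  checkEqualityConstant(ssnSpec, { bsonType: "long", value: 12345678 });
} catch (e) {
  console.log(e.message);
}
```

The comparison is on the type name only, never on the numeric value, which is why an int 12345678 and a long 12345678 are treated as a mismatch here.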

Original description

BSONElement equality semantics involve a logical comparison function rather than byte-wise equality. Therefore, two equal BSONElements may result in unequal ciphertext after encryption, even with the "Deterministic" encryption algorithm. If we want FLE equality to work transparently, the client should encrypt a KeyString encoding. Decryption would similarly be a two-step process in which we decrypt and then decode the KeyString.

The simplest example of this is integers of different types. The integer 42 can be BSON-encoded as a NumberDouble, a NumberInt, or a NumberLong. The actual bytes inside the BSONElement are different for all three cases, yet all three are considered equal.
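
The byte-level difference can be seen with a short Node.js sketch. It encodes only the value payload of each numeric type (little-endian, as BSON stores numbers) and omits the element's type byte and field name. The helper names are made up for this illustration.

```javascript
// Value payloads for the number 42 in three BSON numeric types.
// BSON stores int32, int64, and double values in little-endian byte order.
function encodeInt32(n) {
  const buf = Buffer.alloc(4);
  buf.writeInt32LE(n);
  return buf;
}

function encodeInt64(n) {
  const buf = Buffer.alloc(8);
  buf.writeBigInt64LE(BigInt(n));
  return buf;
}

function encodeDouble(n) {
  const buf = Buffer.alloc(8);
  buf.writeDoubleLE(n);
  return buf;
}

const asInt = encodeInt32(42).toString("hex");     // "2a000000"
const asLong = encodeInt64(42).toString("hex");    // "2a00000000000000"
const asDouble = encodeDouble(42).toString("hex"); // "0000000000004540"

console.log({ asInt, asLong, asDouble });
```

Because a deterministic cipher maps identical plaintext bytes to identical ciphertext, these three distinct byte strings would encrypt to three different ciphertexts even though the values compare as equal.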



 Comments   
Comment by David Storch [ 10/Jul/19 ]

ravind.kumar, the encrypt.bsonType keyword does still accept an array of strings. As described by SERVER-40627, exactly one bsonType is required in combination with deterministic encryption, but it can still be specified within a singleton array. Here's an example, running a command through a shell connected directly to mongocryptd:

MongoDB Enterprise > db.runCommand({
...     insert: "collection",
...     documents: [{_id: 1, ssn: "000-00-0000"}],
...     isRemoteSchema: false,
...     jsonSchema: {
...         type: "object",
...         properties: {
...             ssn: {
...                 encrypt: {
...                     algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic",
...                     keyId: [UUID()],
...                     bsonType: ["string"] // <====== singleton array
...                 }
...             }
...         }
...     }
... });
{
	"hasEncryptionPlaceholders" : true,
	"schemaRequiresEncryption" : true,
	"result" : {
		"insert" : "collection",
		"documents" : [
			{
				"_id" : 1,
				"ssn" : BinData(6,"ADgAAAAQYQABAAAABWtpABAAAAAEzrPVITDpS4acm0yi8thS1QJ2AAwAAAAwMDAtMDAtMDAwMAAA")
			}
		],
		"lsid" : {
			"id" : UUID("29f5af41-86ca-45fb-8997-5e32d1e8ff35")
		}
	},
	"ok" : 1
}

More importantly, mixed-type schemas are perfectly legal for the random encryption algorithm. Note the following schema's use of an array for encrypt.bsonType in combination with the random algorithm:

MongoDB Enterprise > db.runCommand({
...     insert: "collection",
...     documents: [{_id: 1, ssn: "000-00-0000"}],
...     isRemoteSchema: false,
...     jsonSchema: {
...         type: "object",
...         properties: {
...             ssn: {
...                 encrypt: {
...                     algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Random",
...                     keyId: [UUID()],
...                     bsonType: ["string", "number"] // <====== array of multiple types
...                 }
...             }
...         }
...     }
... });
{
	"hasEncryptionPlaceholders" : true,
	"schemaRequiresEncryption" : true,
	"result" : {
		"insert" : "collection",
		"documents" : [
			{
				"_id" : 1,
				"ssn" : BinData(6,"ADgAAAAQYQACAAAABWtpABAAAAAE+A3mnyM6Smiq/0eGf3zMngJ2AAwAAAAwMDAtMDAtMDAwMAAA")
			}
		],
		"lsid" : {
			"id" : UUID("29f5af41-86ca-45fb-8997-5e32d1e8ff35")
		}
	},
	"ok" : 1
}
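
As a rough illustration of the rule shown by the two commands above (this is a sketch, not the server's validation code; the function name validateEncryptTypes is hypothetical), the check on encrypt.bsonType could be modeled as:

```javascript
// Hypothetical model of the schema rule: bsonType may be a string or an
// array of strings, but the deterministic algorithm needs exactly one type.
function validateEncryptTypes(encryptSpec) {
  const raw = encryptSpec.bsonType;
  const types = raw === undefined ? [] : Array.isArray(raw) ? raw : [raw];
  const isDeterministic = encryptSpec.algorithm.endsWith("-Deterministic");
  if (isDeterministic && types.length !== 1) {
    throw new Error("deterministic encryption requires exactly one bsonType");
  }
  return types;
}

// A singleton array is accepted with the deterministic algorithm.
const detTypes = validateEncryptTypes({
  algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic",
  bsonType: ["string"],
});

// Multiple types are accepted with the random algorithm.
const randTypes = validateEncryptTypes({
  algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Random",
  bsonType: ["string", "number"],
});

console.log(detTypes, randTypes);
```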

Comment by Ravind Kumar (Inactive) [ 08/Jul/19 ]

david.storch just to confirm - I'm looking at the JSONSchema definition in the specs repo compared to the Spec Document  - just wanted to confirm that bsonType no longer accepts an array of strings (specifically that there's no other server ticket we've missed somewhere that added in support for that).

Comment by Githook User [ 21/May/19 ]

Author:

{'email': 'david.storch@10gen.com', 'name': 'David Storch', 'username': 'dstorch'}

Message: SERVER-40477 Fail on type mismatch when producing intent-to-encrypt markings.

Inserts and updates will fail if the value being written to
the server does not comply with the 'bsonType'
specification. Equality comparison to an encrypted field in
reads require deterministic encryption and for exactly one
BSON type to be specified. Mongocryptd will similarly fail
reads when the type of the constant in the query differs
from the type in the schema.
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/572c8fa1da73f28882b5c864fe9c0f5e05d2d68b

Comment by Githook User [ 15/May/19 ]

Author:

{'email': 'david.storch@10gen.com', 'name': 'David Storch', 'username': 'dstorch'}

Message: SERVER-40477 Make BSON type set available in schema metadata tree.

This is precursor work which will be built upon in a
subsequent patch. By making the BSON type information
available in the schema tree, future code can validate that
comparisons and writes to encrypted fields are sensible with
respect to BSON type, and can return an error to the client
in type mismatch scenarios.

Prior to this change, we were overloading a single type,
EncryptionMetadata, in order to represent both the parsed
version of what users can specify inside the
'encryptMetadata' keyword and to represent the resolution of
the 'encryptMetadata' chain. This patch separates those
concepts into distinct types called EncryptionMetadata and
ResolvedEncryptionInfo respectively.

The resolved encryption info differs from the encryption
metadata variant in that the 'algorithm' and 'keyId' fields
are non-optional. Furthermore, the resolved metadata, unlike
the heritable metadata, may contain a set of BSON types.
Finally, EncryptionMetadata is an IDL type that has a
serialization to and from BSON, whereas
ResolvedEncryptionInfo has no BSON representation.
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/73fabb1d9f80ba54186466fc5cdfa3814fb02139

Comment by Githook User [ 15/May/19 ]

Author:

{'name': 'David Storch', 'username': 'dstorch', 'email': 'david.storch@10gen.com'}

Message: SERVER-40477 Make BSON type set available in schema metadata tree.
Branch: master
https://github.com/mongodb/mongo/commit/8dfd014022f0b9cc136e8a22d788089b743422f5

Comment by Nicholas Zolnierz [ 12/Apr/19 ]

david.storch I've filed SERVER-40627 to handle the restrictions on possible bsonTypes for deterministic encryption, I propose rewording the title of this ticket to indicate that queries over numeric types will fail in mongocryptd if they don't match the bsonType of the schema. Does that sound reasonable? Was there anything I missed for equality semantics?

Generated at Thu Feb 08 04:55:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.