-
Type: Bug
-
Resolution: Unresolved
-
Priority: Critical - P2
-
None
-
Component/s: Client Side Encryption
-
None
-
Needed
-
Summary
When explicitly encrypting a text-indexed field with Queryable Encryption via ClientEncryption.encrypt, the case sensitivity setting (and possibly the diacritic sensitivity setting) is not honoured. This appears to be an issue in libmongocrypt rather than in the drivers themselves.
As an example, I have a DEK that is "b7csKuW8B1zoGeA+JLg3puwpBiMMig/Pk/k707SgFmNa5pQmW5pHT8JKKShQ8Myl7jZ5Hzy2l3oCqqSUgmUDRCxcp2/j7Y7GT/F55dTEjeu5tf4WCZuBZ5qBcBQ7FW1X" and I perform the following:
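# Imports as exposed by PyMongo (pymongo.encryption / pymongo.encryption_options).
from pymongo.encryption import Algorithm
from pymongo.encryption_options import TextOpts

# Prefix-query options with case and diacritic folding requested.
prefix_text_opts = TextOpts(
    prefix={
        "strMinQueryLength": 2,
        "strMaxQueryLength": 6,
    },
    case_sensitive=False,
    diacritic_sensitive=False,
)
# client_encryption is an existing ClientEncryption; firstname_dek is the DEK document.
ciphertext_firstname = client_encryption.encrypt(
    "Sarah",
    algorithm=Algorithm.TEXTPREVIEW,
    key_id=firstname_dek["_id"],
    contention_factor=0,
    text_opts=prefix_text_opts,
)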
The b.e.s field in the resulting FLE2InsertUpdatePayloadV2 should be a5b6c1ffb119c01f194ead53674ee15d0df0862212c2800a6c4debecbc7543a0, but instead it is 146704c5b43a7cabfd312b62512b55993ddbb0209b55d6a08d8daa4524ddbff1. If I encrypt the string sarah instead, the b.e.s field is correct, which shows that case_sensitive=False is not honoured. Similarly, the b.e.d field should be 56949428f63f5647ccd0821705ddceeee6dc5456c0670e76cfdc5001dece3221, but it is b5b0a72993be17500ccd1a39b46722828f3524b275627172baba25f712919207 unless I lowercase the input string myself.
I believe the same issue occurs for the prefix tokens (b.p.s and b.p.d), and I assume it also occurs for the other text-indexed query types.
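Until this is fixed, a caller could fold the value before passing it to ClientEncryption.encrypt, as I did by hand with sarah. The sketch below is only an illustration: fold_for_text_index is a hypothetical helper, and the Unicode rules it uses (NFD decomposition, dropping combining marks, str.casefold) are my assumption, not a confirmed description of the folding libmongocrypt applies, so tokens may still differ for inputs outside plain ASCII.

```python
import unicodedata


def fold_for_text_index(value: str, case_sensitive: bool, diacritic_sensitive: bool) -> str:
    """Hypothetical pre-encryption folding; NOT libmongocrypt's actual algorithm."""
    if not diacritic_sensitive:
        # Decompose accented characters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", value)
        stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        value = unicodedata.normalize("NFC", stripped)
    if not case_sensitive:
        # Assumption: simple Unicode case folding stands in for the real rule.
        value = value.casefold()
    return value
```

With both settings disabled, fold_for_text_index("Sarah", False, False) returns "sarah", which is the input that produced the expected b.e.s and b.e.d tokens above.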
Motivation
Who is the affected end user?
Any developer using explicit encryption with text-indexed fields is affected: queries will not return documents that should match. A query that uses explicit encryption will only match the explicitly encrypted documents whose original value already had the same case as the search term before encryption. The same applies when the query side uses automatic encryption.
How does this affect the end user?
The end user will not receive all the documents they expect when performing a $match with $encStrNormalizedEq or the other QE text search operators. Correct results here are critical for end-user confidence in the feature.
How likely is it that this problem or use case will occur?
It will occur for anyone using explicit encryption with text-indexed fields.
If the problem does occur, what are the consequences and how severe are they?
Incorrect documents will be returned or there will be missing documents from the query.
Is this issue urgent?
I recommend this be fixed before GA.
Is this ticket required by a downstream team?
Unknown
Is this ticket only for tests?
No, but I recommend creating new tests to ensure that token creation is correct.
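One possible shape for such a test is sketched below. It assumes the FLE2InsertUpdatePayloadV2 has already been decoded into a nested mapping with the b.e.s and b.e.d fields described in the summary; the decoding step and the helper names (exact_match_tokens, assert_case_insensitive) are hypothetical and not part of any driver API.

```python
from typing import Any, Mapping


def _as_hex(token: Any) -> str:
    # Tokens may arrive as raw bytes or as hex strings; compare them as hex.
    return token.hex() if isinstance(token, (bytes, bytearray)) else str(token)


def exact_match_tokens(payload: Mapping[str, Any]) -> dict:
    """Return the b.e.s and b.e.d tokens of an already-decoded payload."""
    exact = payload["b"]["e"]
    return {name: _as_hex(exact[name]) for name in ("s", "d")}


def assert_case_insensitive(payload_mixed: Mapping[str, Any],
                            payload_lower: Mapping[str, Any]) -> None:
    """Fail if mixed-case and lower-case inputs produced different tokens."""
    mixed = exact_match_tokens(payload_mixed)
    lower = exact_match_tokens(payload_lower)
    assert mixed == lower, f"tokens differ: {mixed} != {lower}"
```

A regression test would encrypt "Sarah" and "sarah" with case_sensitive=False and contention_factor=0, decode both payloads, and pass them to assert_case_insensitive. With the values reported in the summary, this check currently fails.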
Acceptance Criteria
Explicit encryption must adhere to the settings in the TextOpts and ideally to the Encrypted Fields Map that the end collection has set.
I have code in Python and Go that demonstrates this issue that I can provide if required.
split to:
- CDRIVER-6327 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- CSHARP-6030 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- CXX-3489 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- GODRIVER-3904 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- JAVA-6196 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- NODE-7578 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- PHPLIB-1846 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- PYTHON-5820 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- RUBY-3875 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)
- RUST-2421 QE - Case and diacritic sensitivity not honoured for explicit encryption (Blocked)