[SERVER-81246] FLE WriteConcernError behavior unclear Created: 20/Sep/23 Updated: 06/Feb/24 Resolved: 05/Jan/24 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.3.0-rc0, 7.0.6 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Vishnu Kaushik | Assignee: | Erwin Pe |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Assigned Teams: | Server Security |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v7.0 |
| Sprint: | Security 2023-11-13, Security 2023-11-27, Security 2023-12-11, Security 2023-12-25, Security 2024-01-08 |
| Participants: |
| Description |
|
This seems to happen on both mongos and mongod. Here is an insert (non-FLE) that encounters a WriteConcernError (WCE). Note that n: 1 because the write went through, and the WCE is reported in the writeConcernError field.
When using FLE, the WCE is instead placed into the writeErrors field, and I'm not sure how drivers would interpret that error. Note that the write is also reported as not having gone through (n: 0).
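To make the difference concrete, here is a minimal Python sketch of how a client might read the two reply shapes described above. The reply dictionaries are illustrative assumptions built from the field names in this description (`n`, `writeErrors`, `writeConcernError`), not captured server output, and `summarize` is a hypothetical helper rather than any driver API. Error code 100 is assumed to correspond to UnsatisfiableWriteConcern.

```python
# Illustrative reply shapes only: values are assumptions, not real server output.
UNSATISFIABLE_WRITE_CONCERN = 100  # assumed code for UnsatisfiableWriteConcern

# Non-FLE insert: the write is counted and the WCE has its own field.
non_fle_reply = {
    "n": 1,
    "writeConcernError": {"code": UNSATISFIABLE_WRITE_CONCERN, "errmsg": "..."},
    "ok": 1,
}

# FLE insert as reported here: the WCE shows up as a write error and n is 0.
fle_reply = {
    "n": 0,
    "writeErrors": [
        {"index": 0, "code": UNSATISFIABLE_WRITE_CONCERN, "errmsg": "..."}
    ],
    "ok": 1,
}


def summarize(reply):
    """Hypothetical helper: what a client can infer from a write reply."""
    return {
        "applied": reply.get("n", 0),
        "write_error_codes": [e["code"] for e in reply.get("writeErrors", [])],
        "has_wce": "writeConcernError" in reply,
    }


non_fle_summary = summarize(non_fle_reply)
fle_summary = summarize(fle_reply)
```

In this sketch a client looking only at `writeConcernError` would not notice the write concern failure in the FLE reply, and would conclude from `n` that nothing was written.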
This led me to wonder what happens when an actual write error, like DuplicateKeyError, shows up along with a WCE. The result is that the WCE is hidden (this is basically the bug from
I found this while looking into the existing behavior in preparation for implementing FLE + bulkWrite + WCE handling on mongos. |
| Comments |
| Comment by Githook User [ 23/Jan/24 ] |
|
Author: Erwin Pe <erwin.pe@mongodb.com> (username: erwee) Message: (cherry picked from commit 8dfda69bd26722b56656d5cfe31772c28818dc8a) GitOrigin-RevId: cb3b77e0fc9581ca13f110bad6d85fc6d6b899df |
| Comment by Githook User [ 04/Jan/24 ] |
|
Author: Erwin Pe <erwin.pe@mongodb.com> (username: erwee) Message: GitOrigin-RevId: 8dfda69bd26722b56656d5cfe31772c28818dc8a |
| Comment by Erwin Pe [ 07/Dec/23 ] |
|
Thanks for the clarification, Lingzhi & Vishnu. The fact that the write and the commit went through, yet we report n: 0, is indeed a serious issue, on top of the write concern error being reported in writeErrors. Re-opening this ticket for a closer look. |
| Comment by Vishnu Kaushik [ 07/Dec/23 ] |
|
Thanks for explaining Lingzhi. Here's a series of steps that demonstrates why the current behavior is wrong: |
| Comment by Lingzhi Deng [ 07/Dec/23 ] |
|
I don't think the internal transaction API can abort the transaction on writeConcern errors. The internal transaction API acts like a driver: it automatically retries the commit statement on retryable errors (including retryable writeConcern errors). But by that time it is already too late, because the internal transaction API will already have issued the "commit". At that point the write is done, and the only thing we know is that it didn't satisfy the specified writeConcern. So if we return that writeConcern error as a writeError and claim nothing was done, that's not necessarily true: a subsequent read could see the previous write even though it failed writeConcern. I agree with Vishnu that we should return any writeConcern error as a writeConcernError, not a write error. |
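The ordering argument above can be illustrated with a toy Python model. This is only a sketch under stated assumptions: `commit_then_check` is a hypothetical function, not the internal transaction API, and the `members` and `w` parameters stand in for the replica set size and the requested write concern.

```python
def commit_then_check(collection, doc, members, w):
    """Toy model: the commit applies the write, then the write concern is checked."""
    # Commit: from this point the write is visible to later reads.
    collection.append(doc)
    reply = {"n": 1, "ok": 1}
    if w > members:
        # The write is not rolled back; only the requested guarantee is unmet.
        reply["writeConcernError"] = {"codeName": "UnsatisfiableWriteConcern"}
    return reply


collection = []
reply = commit_then_check(collection, {"_id": 1}, members=3, w=5)
visible_after_failure = {"_id": 1} in collection
```

In this model the document stays readable after the write concern failure, which is why reporting `n: 0` and a write error would misdescribe what happened.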
| Comment by Erwin Pe [ 07/Dec/23 ] |
|
As currently designed, Queryable Encryption write operations handle the write concern through the transaction API during the internal transaction commit. The transaction API honors the write concern on commit, but if it cannot be satisfied, it aborts the transaction and returns an UnsatisfiableWriteConcern in the effective commit status. The FLE CRUD code, in turn, adds this error to the `writeErrors` section of the response. This is very different from how write concern is handled in regular write operations, where the service entry point waits for the write concern and attaches any WCEs to the final command reply. Normally, writes can still succeed (n > 0) with a WCE, but in FLE2 a write concern error is treated as a write error because the entire write is always aborted if the WC can't be satisfied. |
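The two reply-building paths contrasted in this comment can be sketched as follows. Both functions are hypothetical toy models written for illustration, not server code; the error codes (100 for UnsatisfiableWriteConcern, 11000 for DuplicateKey) are assumptions used as sample data.

```python
def regular_write_reply(n, write_errors, wce=None):
    """Toy model of the regular path: the WCE is attached last and
    leaves n and writeErrors untouched."""
    reply = {"n": n, "ok": 1}
    if write_errors:
        reply["writeErrors"] = list(write_errors)
    if wce is not None:
        reply["writeConcernError"] = wce
    return reply


def fle2_reply_as_described(wce):
    """Toy model of the FLE2 behavior described above: the WCE is
    reported as a write error and no write is counted."""
    return {"n": 0, "ok": 1, "writeErrors": [dict(wce, index=0)]}


wce = {"code": 100, "codeName": "UnsatisfiableWriteConcern"}
dup = {"index": 1, "code": 11000, "codeName": "DuplicateKey"}

regular = regular_write_reply(1, [dup], wce)
fle2 = fle2_reply_as_described(wce)
```

In the regular model a genuine write error and a WCE can coexist in one reply, while the FLE2 model has no separate slot for the WCE.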