[SERVER-72838] Prevent concurrent direct writes from unsetting kPendingDirectWrite flag
Created: 13/Jan/23  Updated: 29/Oct/23  Resolved: 10/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 6.3.0-rc0
Fix Version/s: 7.0.0-rc0, 6.3.0-rc1

Type: Bug Priority: Major - P3
Reporter: Dan Larkin-York Assignee: Fausto Leyva (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-73533 Refactor the BucketState class to rem... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.3
Sprint: Execution Team 2023-02-20, Execution Team 2023-02-06
Participants:

 Description   

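If two direct writes to a single time series bucket race with each other, the one that finishes first can unset the pending-direct-write flag while the other is still in flight. The bucket is then released for inserts earlier than expected, and an insert can corrupt the bucket contents.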

Consider the following sequence (BC is the bucket catalog):

| Update1               | Update2                         | Insert1                         | Insert2                           |
|-----------------------|---------------------------------|---------------------------------|-----------------------------------|
| Write to storage      |                                 |                                 |                                   |
| Set kDirectWriteStart |                                 |                                 |                                   |
|                       | Write to storage                |                                 |                                   |
|                       | Set kDirectWriteStart           |                                 |                                   |
|                       | Abort + unset kDirectWriteStart |                                 |                                   |
|                       |                                 | Reopen/fetch the bucket into BC |                                   |
|                       |                                 | ...                             | Insert into the same bucket in BC |
|                       |                                 |                                 |                                   |
| Commit                |                                 |                                 |                                   |
|                       |                                 |                                 | Read doc updated by Update1       |
|                       |                                 |                                 | Write the stale diff              |             
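
The sequence above can be replayed in a small model. This is only a sketch in Python, not the server's C++ code: `FlagBucket`, `K_PENDING_DIRECT_WRITE`, and `replay_race` are hypothetical names, and the single bit stands in for the flag named in the table. It shows why a plain bit cannot record that two writers are in flight.

```python
# Hypothetical stand-in for the pending-direct-write bit; not the server's definition.
K_PENDING_DIRECT_WRITE = 1 << 0


class FlagBucket:
    """Toy bucket whose state is a set of bit flags."""

    def __init__(self) -> None:
        self.state = 0

    def begin_direct_write(self) -> None:
        # Setting a bit is idempotent, so a second writer leaves no extra trace.
        self.state |= K_PENDING_DIRECT_WRITE

    def end_direct_write(self) -> None:
        # Clearing the bit cannot tell whose direct write is finishing.
        self.state &= ~K_PENDING_DIRECT_WRITE

    def insertable(self) -> bool:
        return (self.state & K_PENDING_DIRECT_WRITE) == 0


def replay_race() -> bool:
    """Replay the table up to the point where Insert1 reopens the bucket."""
    bucket = FlagBucket()
    bucket.begin_direct_write()  # Update1 writes to storage and sets the flag
    bucket.begin_direct_write()  # Update2 does the same
    bucket.end_direct_write()    # Update2 aborts and unsets the flag
    # Update1 has not committed yet, but the bucket already looks free.
    return bucket.insertable()
```

In this model `replay_race()` returns `True`: the bucket is reported as insertable while Update1 is still pending, which is the window in which Insert2 can write a stale diff.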



 Comments   
Comment by Githook User [ 06/Mar/23 ]

Author: Faustoleyva54 <fausto.leyva@mongodb.com>

Message: SERVER-72838 Prevent concurrent direct writes from unsetting kPendingDirectWrite flag

(cherry picked from commit 32220e43865e7e4a8a47093d55b89312d0a6f4af)
Branch: v6.3
https://github.com/mongodb/mongo/commit/f8765617a69d45a925d720e9e97a001a50b3cd62

Comment by Githook User [ 10/Feb/23 ]

Author: Faustoleyva54 <fausto.leyva@mongodb.com>

Message: SERVER-72838 Prevent concurrent direct writes from unsetting kPendingDirectWrite flag
Branch: master
https://github.com/mongodb/mongo/commit/32220e43865e7e4a8a47093d55b89312d0a6f4af

Comment by Yuhong Zhang [ 13/Jan/23 ]

After chatting with Dan, here is a proposed solution that takes some potential future changes into account:

  • We can use the operation id instead of a bit flag to represent the state of a bucket pending a direct write. The kPendingDirectWrite state is not, and will not be, set together with any other state. The BucketState can be converted into a variant of BucketStateFlag and OperationId.
  • When we initiate a direct write to a bucket, check its state:
    • If it is not set with an operation id, set it to the current operation id.
    • Otherwise, if the bucket is set with the current operation id, proceed; if it is set with a different operation id, throw a WriteConflictException (a different thread is already trying to write to this bucket).
  • Unset the operation id on RecoveryUnit commit/abort.
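
The proposal in this comment can be sketched as follows. This is a Python model under stated assumptions, not the committed fix: `OwnedBucket`, `WriteConflict`, and `replay_race_with_owner` are hypothetical names, `WriteConflict` stands in for the server's WriteConflictException, and operation ids are modeled as plain integers. The ownership check in `end_direct_write` is an extra defensive assumption that the comment does not spell out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class WriteConflict(Exception):
    """Stand-in for the server's WriteConflictException."""


@dataclass
class OwnedBucket:
    # None means no direct write is pending; otherwise the owning operation id.
    pending_op_id: Optional[int] = None

    def begin_direct_write(self, op_id: int) -> None:
        if self.pending_op_id is None:
            self.pending_op_id = op_id
        elif self.pending_op_id != op_id:
            raise WriteConflict(
                f"bucket is pending a direct write by operation {self.pending_op_id}"
            )
        # Same operation id: nothing to do, proceed.

    def end_direct_write(self, op_id: int) -> None:
        # Meant to run on commit or abort. Only the owner clears the id
        # (assumption: a non-owner never reaches this point anyway).
        if self.pending_op_id == op_id:
            self.pending_op_id = None

    def insertable(self) -> bool:
        return self.pending_op_id is None


def replay_race_with_owner() -> Tuple[bool, bool]:
    """Replay the racing updates; return (update2_conflicted, bucket_insertable)."""
    bucket = OwnedBucket()
    bucket.begin_direct_write(op_id=1)      # Update1
    try:
        bucket.begin_direct_write(op_id=2)  # Update2 races with Update1
        conflicted = False
    except WriteConflict:
        conflicted = True
    return conflicted, bucket.insertable()
```

Here Update2 fails before it registers anything, so it has nothing to unset when it aborts, and the bucket stays closed to inserts until operation 1 commits or aborts.
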
Generated at Thu Feb 08 06:22:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.