[SERVER-56262] Fix _cappedFirstRecord usage for capped collections Created: 22/Apr/21  Updated: 29/Oct/23  Resolved: 30/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.0-rc0

Type: Task Priority: Major - P3
Reporter: Gregory Wlodarek Assignee: Gregory Wlodarek
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-80302 capped_large_docs.js is not resilient... Closed
related to SERVER-16049 Replicate capped collection deletes e... Closed
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2021-05-03, Execution Team 2021-05-17
Participants:

 Description   

capped_large_docs.js and create_capped_collection_maxdocs.js both use the requires_non_retryable_writes tag due to a rare occurrence where the capacity of the capped collection is exceeded by one. We should investigate the cause of this and fix it.



 Comments   
Comment by Githook User [ 30/Apr/21 ]

Author:

{'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'}

Message: SERVER-56262 Fix _cappedFirstRecord usage for capped collections
Branch: master
https://github.com/mongodb/mongo/commit/bba00a925c257e4ae8c5a6ec13b66890f8433302

Comment by Geert Bosch [ 27/Apr/21 ]

Just following up on some approaches to fix the cached first record. It absolutely should be possible to maintain the cached first record accurately. I would consider it data corruption if we delete record 11 but not record 10. I thought that we serialized writes using the capped lock. Regardless, while one transaction is in the critical section of deleting a record and has not yet committed the delete, it should prevent others from going ahead and committing transactions. One way to do that is marking the start of the critical section with some atomic flag or special value of the cached first record and only resetting it in the onCommit handler. Any conflicting operation can find out in its OpObserver and abort with a WriteConflictException.

In general, we should not change shared data and undo those changes in onRollback as it can break isolation and leak a partial transaction.
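
For illustration only, the "special value" approach described above could be sketched roughly as follows. This is not the server code or the committed fix: UnitOfWork, WriteConflict, CappedFirstRecordCache, kBusy and beginDelete are hypothetical stand-ins, and releasing the marker on rollback is an added assumption so that an aborted delete does not leave the cache stuck. The point it tries to show is that other operations only ever observe either a committed first record or the busy marker, never a speculative value.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for the server's WriteConflictException.
struct WriteConflict : std::runtime_error {
    WriteConflict() : std::runtime_error("write conflict") {}
};

// Hypothetical, minimal unit of work that runs registered handlers
// when the transaction commits or rolls back.
class UnitOfWork {
public:
    void onCommit(std::function<void()> fn) { _commit.push_back(std::move(fn)); }
    void onRollback(std::function<void()> fn) { _rollback.push_back(std::move(fn)); }
    void commit() { finish(_commit); }
    void rollback() { finish(_rollback); }

private:
    void finish(std::vector<std::function<void()>>& handlers) {
        for (auto& fn : handlers) fn();
        _commit.clear();
        _rollback.clear();
    }
    std::vector<std::function<void()>> _commit;
    std::vector<std::function<void()>> _rollback;
};

// Hypothetical cache of the oldest record id in a capped collection.
class CappedFirstRecordCache {
public:
    // Sentinel meaning "a delete of the first record is in flight".
    static constexpr std::int64_t kBusy = -1;

    explicit CappedFirstRecordCache(std::int64_t first) : _first(first) {}

    std::int64_t first() const { return _first.load(); }

    // Enter the critical section for deleting the current first record.
    // Returns the id to delete, or throws WriteConflict if another
    // transaction has already entered and not yet finished.
    std::int64_t beginDelete(UnitOfWork& uow, std::int64_t nextFirst) {
        std::int64_t current = _first.load();
        if (current == kBusy || !_first.compare_exchange_strong(current, kBusy)) {
            throw WriteConflict();
        }
        // The new first record becomes visible only once the delete commits.
        uow.onCommit([this, nextFirst] { _first.store(nextFirst); });
        // Assumption: on rollback, release the marker by restoring the
        // last committed value; no uncommitted value was ever published.
        uow.onRollback([this, current] { _first.store(current); });
        return current;
    }

private:
    std::atomic<std::int64_t> _first;
};
```

In this sketch a second transaction that calls beginDelete while the first is still uncommitted gets a WriteConflict and would retry, so record 11 cannot be deleted while the delete of record 10 is still pending.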

Generated at Thu Feb 08 05:38:48 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.