[SERVER-55230] Investigate using ranged truncation for capped collection deletion Created: 16/Mar/21  Updated: 06/Dec/22  Resolved: 18/Mar/21

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Gregory Wlodarek Assignee: Backlog - Storage Execution Team
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-55156 Move capped collection responsibiliti... Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

SERVER-55156 removes the capped collection truncation optimization in an effort to simplify the work needed for SERVER-16049. A ranged truncation may still prove useful for workloads where inserting one document requires deleting multiple documents.

A few things that come to mind that we'd need to make this a reality:

  • Implement a new record store function to do ranged truncation on a collection.
    • Similar to cappedTruncateAfter().
  • Create a new oplog entry type for replicating these truncation operations.
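
As a rough illustration of the two items above, here is a minimal sketch of a ranged truncation over a toy in-memory store. Everything in it is hypothetical: ToyRecordStore, truncateRange(), enforceCap() and TruncateRangeEntry are names invented for this example and are not part of the server's record store interface or oplog format. The struct only hints at what a replicated truncation entry might need to carry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical toy model: an ordered map stands in for a capped collection,
// keyed by insertion-ordered record ids. This is NOT the real RecordStore API.
using RecordId = std::int64_t;
using ToyRecordStore = std::map<RecordId, std::string>;

// Hypothetical payload for a single replicated truncation: one entry
// describes the whole removed range instead of one entry per document.
struct TruncateRangeEntry {
    RecordId first;           // first removed id (inclusive)
    RecordId last;            // last removed id (inclusive)
    std::size_t docsRemoved;  // number of records actually removed
};

// Removes every record with first <= id <= last in a single call.
TruncateRangeEntry truncateRange(ToyRecordStore& store, RecordId first, RecordId last) {
    assert(first <= last);
    auto begin = store.lower_bound(first);
    auto end = store.upper_bound(last);
    auto removed = static_cast<std::size_t>(std::distance(begin, end));
    store.erase(begin, end);
    return {first, last, removed};
}

// Trims the oldest records so that at most maxDocs remain, using one
// ranged truncation rather than one delete per excess document.
TruncateRangeEntry enforceCap(ToyRecordStore& store, std::size_t maxDocs) {
    if (store.size() <= maxDocs) {
        return {0, 0, 0};  // nothing to remove
    }
    auto excess = static_cast<std::ptrdiff_t>(store.size() - maxDocs);
    auto lastToRemove = std::next(store.begin(), excess - 1);
    return truncateRange(store, store.begin()->first, lastToRemove->first);
}
```

In this sketch, a store holding ids 1 through 10 with a cap of 6 would be trimmed by a single call covering ids 1 through 4, which is the case the description has in mind: one insertion triggering several deletions.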


 Comments   
Comment by Louis Williams [ 16/Mar/21 ]

We don't have any tests that exercise capped collection truncation, but this comment provides an explanation for when we expect it to be exercised:

// For a capped collection, the number of documents that can be removed directly, rather than via a
// truncate.  The value has been determined somewhat by experimentation, but there's no clear win
// for all situations.  Setting it to a lower number makes individual remove calls happen, rather
// than truncate, only when small numbers of documents are inserted at a time. Making it larger
// makes larger chunks of documents inserted at time follow the remove path in preference to the
// truncate path.  Using direct removes is more likely to be a benefit when inserts are spread over
// many capped collections, since avoiding a truncate avoids having to get a second cursor, which
// may not be already cached in the current session. The benefit becomes less pronounced if the
// capped collections are more actively used, or are used in small number of sessions, as multiple
// cursors will be available in the needed session caches.
static int kCappedDocumentRemoveLimit = 3;
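
To make the threshold concrete, the choice the comment describes could be sketched roughly as follows. This is an assumption-laden illustration, not the storage engine's code: CappedDeleteStrategy, chooseStrategy() and kRemoveLimit are hypothetical names, and whether the real comparison is inclusive is also an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for the two paths the quoted comment contrasts.
enum class CappedDeleteStrategy { kDirectRemove, kTruncate };

// Mirrors the value of kCappedDocumentRemoveLimit quoted above.
constexpr std::size_t kRemoveLimit = 3;

// Assumption: up to `limit` documents are removed one by one; anything
// larger goes through the truncate path. The real boundary may differ.
CappedDeleteStrategy chooseStrategy(std::size_t docsToRemove,
                                    std::size_t limit = kRemoveLimit) {
    return docsToRemove <= limit ? CappedDeleteStrategy::kDirectRemove
                                 : CappedDeleteStrategy::kTruncate;
}
```

Under these assumptions, raising the limit pushes larger batches onto the direct-remove path, which matches the trade-off the comment outlines.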

Despite losing out on the truncation benefits, we believe that explicitly deleting documents will provide a smoother capped collection experience overall. Primaries and secondaries will always perform the same operations, and both will continue to serialize capped operations.
