[SERVER-19551] Keep "milestones" against WT oplog to efficiently remove old records Created: 23/Jul/15  Updated: 29/Jun/21  Resolved: 11/Sep/15

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.1.8

Type: Improvement Priority: Major - P3
Reporter: Martin Bligh Assignee: Max Hirschhorn
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-17033 Improve performance for bulk insert i... Closed
Related
related to SERVER-20529 WiredTiger allows capped collection o... Closed
related to SERVER-20738 Oplog stones does not enforce ascendi... Closed
related to SERVER-55821 remove next_random_sample_size=1000 c... Closed
Backwards Compatibility: Minor Change
Sprint: Quint Iteration 7, QuInt 8 08/28/15, Quint 9 09/18/15
Participants:

 Description   

Keep "milestones" against the oplog to efficiently remove the old records using WT_SESSION::truncate() when the collection grows beyond its desired maximum size. AKA oplog stones.

The stones represent logical markers against the oplog that are used as truncation points. When a record is inserted, its size is added to the stone being filled. If the size of the stone exceeds the threshold, then a new stone is cut. If the number of stones exceeds its threshold (between 10 and 100), then the background thread for the oplog is signaled to delete the records represented by the oldest stone. The thresholds are determined based on the size of the oplog.
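The bookkeeping described above can be sketched as a small toy model. This is an illustration only, not the server's implementation: the class, field, and parameter names (`OplogStones`, `stone_bytes_threshold`, `max_stones`) are hypothetical, and the background thread is reduced to a boolean "signal" return value.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Stone:
    last_record_id: int  # truncation point: newest record covered by this stone
    num_records: int
    num_bytes: int


class OplogStones:
    """Toy model of stone bookkeeping; every name here is hypothetical."""

    def __init__(self, stone_bytes_threshold, max_stones):
        self.stone_bytes_threshold = stone_bytes_threshold
        self.max_stones = max_stones
        self.stones = deque()  # oldest stone on the left
        self._filling_records = 0
        self._filling_bytes = 0

    def on_insert(self, record_id, size):
        """Account for one insert; return True if reclaim should be signaled."""
        self._filling_records += 1
        self._filling_bytes += size
        if self._filling_bytes > self.stone_bytes_threshold:
            # The stone being filled exceeded its threshold: cut a new stone.
            self.stones.append(
                Stone(record_id, self._filling_records, self._filling_bytes)
            )
            self._filling_records = 0
            self._filling_bytes = 0
        return len(self.stones) > self.max_stones

    def pop_oldest(self):
        """Return the oldest stone; a real store would truncate up to it."""
        return self.stones.popleft()


stones = OplogStones(stone_bytes_threshold=100, max_stones=2)
signals = [stones.on_insert(record_id, 40) for record_id in range(1, 10)]
oldest = stones.pop_oldest()  # covers records 1..3 in this example
```

In this sketch the caller that sees `True` would wake whatever component performs the truncation; the range to remove ends at `oldest.last_record_id`.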

The stones are not persisted, so new stones are chosen at startup based on the records in the oplog. For small oplogs, or those not containing many records, the entire oplog is scanned to compute the stones. Records are packed into a stone in order until its size threshold is exceeded, at which point the next stone is started.
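A minimal sketch of the full-scan case, assuming the oplog can be iterated as `(record_id, size)` pairs in order. The function name and the tuple layout are hypothetical and chosen only for illustration.

```python
def stones_from_scan(records, stone_bytes_threshold):
    """Pack records into stones in oplog order.

    records: iterable of (record_id, size) pairs, oldest first.
    Returns (stones, leftover): stones is a list of
    (last_record_id, num_records, num_bytes) tuples, and leftover is the
    (num_records, num_bytes) of the stone still being filled.
    """
    stones = []
    cur_records = 0
    cur_bytes = 0
    for record_id, size in records:
        cur_records += 1
        cur_bytes += size
        if cur_bytes > stone_bytes_threshold:
            # Threshold exceeded: close this stone at the current record.
            stones.append((record_id, cur_records, cur_bytes))
            cur_records = 0
            cur_bytes = 0
    return stones, (cur_records, cur_bytes)
```

The leftover counts would seed the stone that subsequent inserts continue to fill.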

For larger oplogs or those with many records (>20,000), records are sampled at random from the oplog using a WiredTigerRecordStore::RandomCursor, oversampling by a factor of 10. Samples are then chosen such that each is expected to lie near the right boundary of a logical section of the oplog. As the oplog is truncated, the error in this estimate shrinks because the actual sizes of newly created stones are known with greater certainty.
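One way to picture the sampling approach is the sketch below. It rests on assumptions the description does not spell out: samples are drawn uniformly with replacement, every tenth sorted sample is taken as a stone boundary, and each stone is credited with an equal share of the records and bytes. The function and parameter names are hypothetical, and `random.Random` stands in for the random cursor.

```python
import random


def stones_from_samples(record_ids, total_bytes, num_stones,
                        oversample=10, rng=None):
    """Estimate stone boundaries from random samples of the oplog.

    record_ids: indexable sequence of record ids in the oplog.
    Returns a list of (boundary_record_id, est_records, est_bytes).
    """
    rng = rng or random.Random()
    num_samples = oversample * num_stones
    # Assumption: uniform sampling with replacement.
    samples = sorted(rng.choice(record_ids) for _ in range(num_samples))

    # Assumption: each stone gets an equal share of records and bytes.
    est_records = len(record_ids) // num_stones
    est_bytes = total_bytes // num_stones

    stones = []
    for i in range(1, num_stones + 1):
        # The (i * oversample)-th smallest sample is expected to fall near
        # the right edge of the i-th of num_stones equal sections.
        boundary = samples[i * oversample - 1]
        stones.append((boundary, est_records, est_bytes))
    return stones


estimated = stones_from_samples(range(1, 100001), total_bytes=10_000_000,
                                num_stones=10, rng=random.Random(0))
```

Taking more samples than stones and keeping only every tenth one reduces the variance of each boundary compared with drawing one sample per stone.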

Changing the size of a record in the live oplog is no longer supported.



 Comments   
Comment by Githook User [ 11/Sep/15 ]

Author: Max Hirschhorn (visemet) <max.hirschhorn@mongodb.com>

Message: SERVER-19551 Oplog stones.

Keep "milestones" against the oplog to efficiently remove the old
records using WT_SESSION::truncate() when the collection grows beyond
its desired maximum size.
Branch: master
https://github.com/mongodb/mongo/commit/188a6021d72322f1f45adbbe0d0973686deaf8f8

Comment by Githook User [ 04/Aug/15 ]

Author: Martin Bligh (martinbligh) <mbligh@mongodb.com>

Message: SERVER-19551: Revert sizeStorer updates only on committed - the capped collection code relies on this being incremented pre-commit
Branch: master
https://github.com/mongodb/mongo/commit/84182ff1575cbe868a89e7209f12ca665f4bda19

Comment by Githook User [ 04/Aug/15 ]

Author: Martin Bligh (martinbligh) <mbligh@mongodb.com>

Message: SERVER-19551: Fix up the sizeStorer to update only on commit, not opportunistically, then revert on rollback
Branch: master
https://github.com/mongodb/mongo/commit/769c713f854427f0feec79d2ef5960c5cb6ff49c
