[SERVER-18875] Oplog performance on WT degrades over time after accumulation of deleted items
Created: 08/Jun/15  Updated: 22/Sep/15  Resolved: 13/Jul/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: 3.0.5, 3.1.6

Type: Bug
Priority: Critical - P2
Reporter: Martin Bligh
Assignee: Martin Bligh
Resolution: Done
Votes: 0
Labels: WTcc, mms-s
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File master-oplog-truncates-timeseries.png     PNG File patched-oplog-truncates-timeseries.png     PNG File sad_oplog.png    
Issue Links:
Depends
is depended on by WT-1973 MongoDB changes for WiredTiger 2.7.0 Closed
Duplicate
is duplicated by SERVER-19031 Wired Tiger Insert performance drop o... Closed
is duplicated by SERVER-19136 oplog seems to slow down everything Closed
Related
related to SERVER-19178 In WiredTiger capped collection trunc... Closed
is related to SERVER-18674 Very low throughput during portion of... Closed
is related to SERVER-18677 Throughput drop during transaction pi... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Sprint: Quint Iteration 5, Quint Iteration 6
Participants:

 Description   
Issue Status as of Jul 14, 2015

ISSUE SUMMARY AND IMPACT
Capped collection handling in WiredTiger is inefficient because of the way that WiredTiger tracks and expires documents in the capped collection.

WiredTiger uses a specific internal cursor to find the "beginning of the capped collection". Combined with asynchronous deletion of expired capped collection records, this is inefficient for collections with a high number of inserts, because requests have to process a large number of expired documents.
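
The effect described above can be illustrated with a toy model. This is a minimal sketch, not WiredTiger's implementation: the class `ToyCappedStore`, its fields, and the helper `scan_cost` are hypothetical names invented for illustration, and a plain Python list with tombstone flags stands in for the real storage structures. It assumes only what the summary states: expired records stay in place for a while, and each request locates the start of the collection by walking forward from the beginning.

```python
class ToyCappedStore:
    """Toy model only: an append-only list with tombstones (hypothetical)."""

    def __init__(self, max_docs):
        self.max_docs = max_docs
        self.records = []   # each entry is [doc, expired_flag]
        self.live = 0       # number of unexpired records
        self.steps = 0      # cursor steps spent skipping expired records

    def _first_live_index(self):
        # Models a cursor opened at the start of the collection: it has to
        # walk past every expired record that has not yet been removed.
        i = 0
        while i < len(self.records) and self.records[i][1]:
            i += 1
            self.steps += 1
        return i

    def insert(self, doc):
        self.records.append([doc, False])
        self.live += 1
        while self.live > self.max_docs:
            i = self._first_live_index()
            self.records[i][1] = True   # mark expired; removal is deferred
            self.live -= 1


def scan_cost(n_inserts, max_docs=10):
    """Total cursor steps spent skipping expired records over n_inserts."""
    store = ToyCappedStore(max_docs)
    for n in range(n_inserts):
        store.insert(n)
    return store.steps
```

In this model, doubling the number of inserts more than quadruples the total number of skipped records, so the per-insert cost keeps growing as expired records accumulate.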

USER IMPACT
Capped collection performance degrades over time. Note that the oplog is a capped collection, so users running replica sets with WiredTiger may be impacted by this issue even if no other capped collections are used.

RESOLUTION DETAILS
WiredTiger now caches the current "first" unexpired document in a capped collection. This change improves performance for all capped collections, but is particularly important for the performance of replication because the oplog depends on capped collection performance.
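
The idea of caching the "first" unexpired document can be sketched with a toy model. This is an illustration under stated assumptions, not the actual patch: `ToyCappedStoreCached` and its fields are hypothetical names, and a Python list with tombstone flags stands in for the real storage structures. The point it shows is that remembering where the oldest live record sits removes the need to walk over expired records on every insert.

```python
class ToyCappedStoreCached:
    """Toy model only: caches the index of the oldest unexpired record."""

    def __init__(self, max_docs):
        self.max_docs = max_docs
        self.records = []     # each entry is [doc, expired_flag]
        self.live = 0         # number of unexpired records
        self.first_live = 0   # cached index of the oldest unexpired record
        self.steps = 0        # cursor steps spent skipping expired records

    def insert(self, doc):
        self.records.append([doc, False])
        self.live += 1
        while self.live > self.max_docs:
            # Start from the cached position instead of the collection start.
            i = self.first_live
            # Guard loop: in this toy the cached position is always live,
            # so no expired records are ever skipped here.
            while self.records[i][1]:
                i += 1
                self.steps += 1
            self.records[i][1] = True   # mark expired; removal is deferred
            self.live -= 1
            self.first_live = i + 1     # next-oldest record becomes "first"

    def oldest_live_doc(self):
        """Return the oldest unexpired document, or None if the store is empty."""
        if self.live == 0:
            return None
        return self.records[self.first_live][0]
```

With the cached position, the work per insert in this model stays constant no matter how many expired records are still waiting to be removed.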

AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.4.

FIX VERSION
The fix is included in the 3.0.5 production release.

Original description

Over the course of 2-3 days, running a simple insert workload with hammer, the performance of inserts degrades from about 20K/s to about 4K/s.
This was a single-node replica set, writing locally to the master, using --nojournal to avoid intertwining any other potential issues.

Perf indicates this:

  • 99.99% __curfile_next mongo::WiredTigerRecordStore::cappedDeleteAsNeeded_inlock(mongo::OperationContext*, mongo::RecordId const&)


 Comments   
Comment by Githook User [ 28/Aug/15 ]

Author: Keith Bostic (keithbostic) <keith@wiredtiger.com>

Message: SERVER-18875: clean up some comments associated with cefeb2f, reorder a
test to do less work during checkpoint reconciliation.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/1685742d02f4f39f47b5d4a855674fd6dd49f097

Comment by Githook User [ 29/Jun/15 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-18875 Don't keep deleted pages during a checkpoint. Sync transaction up with WiredTiger 2.6 to ease back-porting, including picking up WT-1912.
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/3751941f5339b257d9fd9c19879f1a901facfbb6

Comment by Githook User [ 25/Jun/15 ]

Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>

Message: Merge pull request #2028 from wiredtiger/deleted-leaf-leak

SERVER-18875 Clean up deleted pages
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/8f7da9ac596ed295228a792383bd5c03da205843

Comment by Githook User [ 25/Jun/15 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-18875 Track the checkpoint's session ID (rather than the transaction ID): changing visibility broke an assertion when releasing the checkpoint's transaction snapshot.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/35eece43f483f1a715a26556fd1f15a9a0a90ab3

Comment by Githook User [ 25/Jun/15 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-18875 During a checkpoint, don't keep data alive if the checkpoint is the only reader.

In particular, if we see pages marked deleted at the beginning of a capped collection, don't cache those pages if there are no possible readers other than the checkpoint itself.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/847262185e60610a91e77e76ace5016c1cb8e736

Comment by Githook User [ 25/Jun/15 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-18875 Don't keep deleted pages around during a checkpoint if the checkpoint is the only potential reader. Also, keep internal pages with deleted-but-not-yet-freed children dirty, so deleted leaf pages can't accumulate.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/cefeb2f6449460aff2146c85341438173143e7fa
