  1. Core Server
  2. SERVER-18875

Oplog performance on WT degrades over time after accumulation of deleted items

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Fix Version/s: 3.0.5, 3.1.6
    • Affects Version/s: None
    • Component/s: WiredTiger
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Quint Iteration 5, Quint Iteration 6

      Issue Status as of Jul 14, 2015

      ISSUE SUMMARY AND IMPACT
      Capped collection handling in WiredTiger is inefficient because of the way WiredTiger tracks and expires documents in the capped collection.

      WiredTiger uses a specific internal cursor to find the "beginning of the capped collection". Combined with asynchronous deletion of expired capped collection records, this is inefficient for collections with a high volume of inserts, because requests have to process a large number of expired documents.
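
The cost described above can be illustrated with a toy model. This is only a sketch, not WiredTiger code: the function names, the list-based storage, and the `reclaim_every` parameter are all hypothetical. It assumes expired records stay in place as markers until a periodic reclaim pass, so a cursor starting from the beginning must step over every one of them.

```python
def steps_to_first_live(records):
    """Walk from the start, as a fresh cursor would, and count the
    expired entries visited before reaching the first live record."""
    steps = 0
    for _key, expired in records:
        if not expired:
            return steps
        steps += 1
    return steps


def worst_scan(total_inserts, cap, reclaim_every):
    """Hypothetical model: a record pushed past the cap is only *marked*
    expired; marked records are physically removed every
    `reclaim_every` inserts. Returns the longest scan observed."""
    records = []  # [key, expired] pairs in insertion order
    live = 0
    worst = 0
    for key in range(total_inserts):
        records.append([key, False])
        live += 1
        if live > cap:
            # Expired records always form a prefix, so the scan length
            # is also the index of the oldest live record.
            steps = steps_to_first_live(records)
            worst = max(worst, steps)
            records[steps][1] = True
            live -= 1
        if (key + 1) % reclaim_every == 0:
            records = [r for r in records if not r[1]]
    return worst
```

In this model, the less often reclamation runs relative to the insert rate, the longer each scan for the oldest live record becomes.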

      USER IMPACT
      Capped collection performance degrades over time. Note that the oplog is a capped collection, so users running replica sets with WiredTiger may be impacted by this issue even if no other capped collections are used.

      RESOLUTION DETAILS
      WiredTiger now caches the current "first" unexpired document in a capped collection. This change improves performance for all capped collections, but is particularly important for the performance of replication because the oplog depends on capped collection performance.
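
The idea of caching the first unexpired document can be sketched with a toy model. This is not the actual server change: `CappedLog`, `cache_first`, and `compare` are hypothetical names, and the model assumes expired records are never physically reclaimed, which exaggerates the effect. It counts how many expired entries a scan steps over with and without a remembered start position.

```python
class CappedLog:
    """Toy capped collection; expired records are marked, never removed."""

    def __init__(self, cap, cache_first=False):
        self.cap = cap
        self.cache_first = cache_first
        self.records = []    # [key, expired] pairs in insertion order
        self.live = 0
        self.first_live = 0  # cached index of the oldest unexpired record
        self.steps = 0       # total steps spent skipping expired entries

    def _find_first_live(self):
        # Without the cache, every search restarts at the beginning.
        i = self.first_live if self.cache_first else 0
        while self.records[i][1]:
            i += 1
            self.steps += 1
        return i

    def insert(self, key):
        self.records.append([key, False])
        self.live += 1
        while self.live > self.cap:
            i = self._find_first_live()
            self.records[i][1] = True  # expire the oldest live record
            self.live -= 1
            self.first_live = i + 1    # remember where live data begins


def compare(total_inserts, cap):
    """Return (steps without cache, steps with cache) for one workload."""
    slow = CappedLog(cap)
    fast = CappedLog(cap, cache_first=True)
    for key in range(total_inserts):
        slow.insert(key)
        fast.insert(key)
    return slow.steps, fast.steps
```

In this model the uncached variant does work that grows quadratically with the number of expirations, while the cached variant never revisits an expired entry.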

      AFFECTED VERSIONS
      MongoDB 3.0.0 through 3.0.4.

      FIX VERSION
      The fix is included in the 3.0.5 production release.

      Original description

      Over the course of 2-3 days of running a simple insert workload with hammer, insert throughput degrades from about 20K/s to about 4K/s.
      This was a single-node replica set, writing locally to the master, using --nojournal to avoid intertwining any other potential issues.

      Perf indicates this:

      • 99.99% __curfile_next mongo::WiredTigerRecordStore::cappedDeleteAsNeeded_inlock(mongo::OperationContext*, mongo::RecordId const&)

        1. master-oplog-truncates-timeseries.png (94 kB)
        2. patched-oplog-truncates-timeseries.png (98 kB)
        3. sad_oplog.png (45 kB)

            Assignee: Martin Bligh (martin.bligh)
            Reporter: Martin Bligh (martin.bligh)
            Votes: 0
            Watchers: 20
