Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-7373

Improve slow random cursor operations on oplog

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2.14, 4.4.6, WT10.0.1, 5.0.0-rc0
    • Component/s: None
    • Labels:
      None
    • Case:
    • Backport Requested:
      v4.4, v4.2

      Description

      Random cursors can be quite slow on a multi-GB table with very large records.

      A customer has experienced slow startup times in 4.2 that they didn't see in 4.0.  Based on their logs mongod is spending a lot of time iterating a random cursor though the oplog — in one case it takes 15 minutes to perform 993 cursor->next() calls on a ~26GB oplog.  The oplog had only 62590 records, so the average record size is 100s of KB.  

      A simple python script, which I'll attach, shows similar behavior on a large file.  I populate a table with 60K 500 KB records, assigning them increasing keys to mimic the oplog which adds records with increasing timestamps.  When that's done, I create a random cursor and call next() 1000 times.  It takes almost 3 minutes on my virtual workstation.

      Additional notes:

      Oddly increasing the WT cache size from the default (100MB, I think) to 4GB made the cursor iteration increase from ~130 sec to ~170 sec.

      This isn't nearly as slow on a larger machine.  On ocelot-aws (16 cores, 64GB RAM) it only takes 20 sec.

      I've been reproducing this on the develop branch and haven't tried running specifically against 4.2.

      WT-5068 was a change to the random cursor implementation that went into 4.2.  So it's possible this has something to do with that change. (UPDATE: See my comment, below.  WT-5068 definitely introduced this.)

      There is a lot of variability in the different cursor->next() calls.  Some take a couple milliseconds.  Some take half a second. I guess it's not surprising that a random cursor behaves unpredictably.

        Attachments

        1. 00755353.tgz
          8.09 MB
        2. Logs_758841.tgz
          5.82 MB
        3. test_foo.py
          1 kB
        4. test_foo2.py
          3 kB

          Issue Links

            Activity

              People

              Assignee:
              keith.bostic Keith Bostic
              Reporter:
              keith.smith Keith Smith
              Votes:
              1 Vote for this issue
              Watchers:
              32 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: