Core Server / SERVER-16921

WT oplog bottleneck on secondary

Details

• Type: Bug
• Status: Closed
• Priority: Major - P3
• Resolution: Fixed
• Affects Version/s: 2.8.0-rc5
• Fix Version/s: 3.0.0-rc7
• Component/s: Replication, WiredTiger
• Labels: None
• Backwards Compatibility: Fully Compatible
• Operating System: ALL

Description

We converted one secondary to 2.8.0-rc5 at approximately 16:00 UTC on 2015-01-16. The throughput observed during the "oplog catch-up" phases of the initial sync was truly impressive!

The secondary continued to perform as expected until approximately 11:00 UTC on 2015-01-17, at which point its throughput dropped enough that it could no longer keep up with the 2.6.5 primary. The replication lag is increasing steadily but linearly: the secondary's throughput dropped to some fraction X% of the primary's load, and that X% has held steady since 11:00 rather than continuing to fall. (A constant apply rate below the primary's rate produces exactly this kind of linear lag growth.)
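
For reference, the lag trend is easy to watch from the mongo shell; rs.printSlaveReplicationInfo() is the standard replica-set lag helper in shells of this vintage. A minimal polling loop as a sketch (the host matches the shell prompts below; the interval is arbitrary):

      # Poll the secondary's replication lag once a minute (sketch).
      while true; do
          mongo --quiet --host mms-db-1.ny1 --eval "rs.printSlaveReplicationInfo()"
          sleep 60
      done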

There is nothing obvious in the logs (will attach them anyway) other than a gradual increase in the frequency of "serverStatus was very slow" messages; serverStatus appears to be slow in the oplog phase. (Presumably this is a symptom rather than the cause, but it is worth mentioning.)

      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T03" | wc
            0       0       0
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T04" | wc
            0       0       0
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T05" | wc
            0       0       0
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T06" | wc
            0       0       0
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T07" | wc
            1      61     421
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T08" | wc
            3     183    1266
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T09" | wc
           12     732    5076
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T10" | wc
           53    3233   22402
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T11" | wc
          147    8967   62167
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T12" | wc
          154    9394   65086
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T13" | wc
          151    9211   63832
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T14" | wc
          142    8662   60027
      [cnelson@mms-db-1.ny1 /var/log/mongod-mmsconfig1-prod]$ grep "was very slow" mongod.log | grep "2015-01-17T15" | wc
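
The same hourly histogram can be produced in one pass. A sketch, assuming the default ISO-8601 timestamp prefix on each log line (so the first 13 characters are "YYYY-MM-DDTHH"):

      # Count "was very slow" messages per hour in a single pass.
      grep "was very slow" mongod.log | cut -c1-13 | sort | uniq -c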

Attachments

1. capped_as_neeed.png (122 kB)
2. capped_truncate_parallel.js (0.8 kB)
3. capped_truncate.js (1 kB)
4. mmsconfig1.png (313 kB)
5. mmsconfig1-2015-01-18.png (365 kB)
6. mms-db-1.ny1.10gen.cc-snapshot.png (123 kB)
7. mms-db-1.ny1-small-oplog.png (87 kB)
8. mms-db-1-repl-progress.png (79 kB)
9. pri.png (310 kB)
10. sec.png (146 kB)


People

• Votes: 0
• Watchers: 21
