Core Server / SERVER-18908

Secondaries unable to keep up with primary under WiredTiger

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.0.3, 3.0.4
    • Fix Version/s: 3.4.0-rc2
    • Component/s: Performance, WiredTiger
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Quint Iteration 7, QuInt A (10/12/15), QuInt C (11/23/15), Repl 2016-08-29, Repl 2016-09-19, Repl 2016-10-10, Repl 2016-10-31

      Description

      • hardware: 24 CPUs, 64 GB memory, SSD (all mongods and clients on same machine)
      • start 3-member replica set with following options:

        mongod --oplogSize 50 --storageEngine wiredTiger --nojournal --replSet ...
        

        Note: the problem also reproduces with the journal enabled; these runs used --nojournal to rule the journal out as a cause.

      • Simple small-document insert workload: 5-16 threads (the exact number doesn't matter to the repro), each inserting small documents {_id:..., x:0} in batches of 10k
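The insert workload described above could be sketched roughly as follows. This is a minimal illustration, not the reporter's actual script: `run_workload`, `make_batch`, the default thread and batch counts, and the use of sequential integers for `_id` are all assumptions made for the example. `collection` is expected to be any object with a PyMongo-style `insert_many(documents, ordered=...)` method.

```python
import itertools
import threading

BATCH_SIZE = 10_000  # batch size mentioned in the report


def make_batch(ids, batch_size=BATCH_SIZE):
    """Build one batch of small documents shaped like {_id: ..., x: 0}.

    Sequential integer _id values are an assumption; the report elides them.
    """
    return [{"_id": next(ids), "x": 0} for _ in range(batch_size)]


def run_workload(collection, threads=8, batches_per_thread=10,
                 batch_size=BATCH_SIZE):
    """Insert batches from several threads; return the number of docs sent."""
    ids = itertools.count()
    ids_lock = threading.Lock()

    def worker():
        for _ in range(batches_per_thread):
            # Draw ids under a lock so every document gets a unique _id.
            with ids_lock:
                batch = make_batch(ids, batch_size)
            collection.insert_many(batch, ordered=False)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return threads * batches_per_thread * batch_size
```

With PyMongo installed, this would be called with something like `run_workload(MongoClient(...)["test"]["c"])` against the primary of the replica set; the database and collection names are placeholders.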

      Replica lag grows without bound because the secondaries process ops at only about 50-80% of the primary's rate.
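To see why the lag is unbounded, a back-of-the-envelope model helps. Assuming (as a simplification, not something measured here) that both rates are constant, a secondary applying ops at a fraction r of the primary's rate has, after t seconds, only applied the first r·t seconds of the primary's writes, so the lag is (1 - r)·t and keeps growing linearly. The function name below is hypothetical.

```python
def replication_lag_seconds(elapsed_seconds, secondary_to_primary_ratio):
    """Estimated lag after `elapsed_seconds` of sustained load.

    Assumes constant rates on both nodes; `secondary_to_primary_ratio` is
    the secondary's apply rate divided by the primary's write rate.
    """
    if not 0 < secondary_to_primary_ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return elapsed_seconds * (1.0 - secondary_to_primary_ratio)


# The 50-80% range from the report, after ten minutes of load.
for ratio in (0.5, 0.8):
    lag = replication_lag_seconds(600, ratio)
    print(f"ratio {ratio:.0%}: about {lag:.0f} s behind after 600 s")
```

Under this model the secondary never catches up while the load continues, and with a small oplog (50 MB in this repro) it would eventually fall off the end of the oplog.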

      Some stats of note:

      [chart: primary]

      [chart: secondary]

      • the op rate on the secondary is maybe half that on the primary
      • the number of ops in flight (i.e. active queues) is much less steady on the secondary, although that isn't reflected in the reported op rates
      • the secondary is executing far more search near calls, about one per document, vs. what appears to be about one per 100 documents on the primary
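The search near observation can be expressed as a ratio computed from two `serverStatus` snapshots taken on the same node. The sketch below is an illustration under stated assumptions: it assumes the counter is exposed as `wiredTiger.cursor["cursor search near calls"]` and that inserted documents are counted under `opcounters.insert` on the primary and `opcountersRepl.insert` on a secondary. Field names can differ between server versions, so check them against real output first. The function name is hypothetical.

```python
def search_near_per_insert(before, after, opcounter_section="opcounters"):
    """Search near calls per inserted document between two snapshots.

    `before` and `after` are serverStatus-style dicts from the same node.
    Pass opcounter_section="opcountersRepl" for a secondary (assumed name).
    """
    def near(snapshot):
        return snapshot["wiredTiger"]["cursor"]["cursor search near calls"]

    def inserts(snapshot):
        return snapshot[opcounter_section]["insert"]

    inserted = inserts(after) - inserts(before)
    if inserted <= 0:
        raise ValueError("no inserts between the two snapshots")
    return (near(after) - near(before)) / inserted
```

A value near 1 would match what is reported for the secondary, and a value near 0.01 would match the primary.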

      Will get stack traces.

        Attachments

        1. insert-3.0.4.patch (3 kB, Bruce Lucas)
        2. insert-3.1.4.patch (2 kB, Bruce Lucas)
        3. lag-02.html (991 kB, Bruce Lucas)
        4. lag-03.html (496 kB, Bruce Lucas)
        5. lag-04.html (769 kB, Bruce Lucas)
        6. lagC-3.0.4.patch (18 kB, Bruce Lucas)
        7. lagC-3.1.4.patch (17 kB, Bruce Lucas)
        8. lagD-3.0.4.patch (20 kB, Bruce Lucas)
        9. log.png (60 kB, Bruce Lucas)
        10. pri.png (100 kB, Bruce Lucas)
        11. search_near.png (237 kB, Bruce Lucas)
        12. sec.png (104 kB, Bruce Lucas)

              People

              Assignee: Mathias Stearn (redbeard0531)
              Reporter: Bruce Lucas (bruce.lucas)
              Votes: 11
              Watchers: 55
