Core Server / SERVER-26171

Gradual Degradation of Performance over days

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 3.0.12
    • Component/s: None
    • Labels: None
    • Environment:
      Linux on AWS - Sharded Replica Set - 2 Shards, 3-node Replica Sets

      WiredTiger performance degrades over time:

      I've attached a graph of CPU "idle" time.

      The first 2/3 of the attached graph shows our database primaries working harder each day. The actual number of DB Ops is generally stable over this time period.
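      A minimal sketch of one way to put a number on this trend, assuming the CPU idle readings and operation rates have been exported as plain numbers. The function names and the sample values are hypothetical and do not come from the attached logs.

```python
from statistics import mean


def idle_trend_per_day(samples):
    """Least-squares slope of CPU idle % versus time in days.

    samples: list of (day, idle_pct) pairs; day may be fractional.
    A negative slope means idle time is shrinking, i.e. the host
    is working harder as time passes.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    days = [d for d, _ in samples]
    idle = [v for _, v in samples]
    d_bar, v_bar = mean(days), mean(idle)
    denom = sum((d - d_bar) ** 2 for d in days)
    if denom == 0:
        raise ValueError("samples must cover more than one point in time")
    return sum((d - d_bar) * (v - v_bar) for d, v in samples) / denom


def cpu_cost_per_op(idle_pct, ops_per_sec):
    """Busy CPU percentage divided by the operation rate.

    If this rises while ops_per_sec stays flat, each operation is
    getting more expensive, which matches the symptom described.
    """
    if ops_per_sec <= 0:
        raise ValueError("ops_per_sec must be positive")
    return (100.0 - idle_pct) / ops_per_sec


if __name__ == "__main__":
    # Made-up daily averages for illustration only.
    daily_idle = [(0, 90), (1, 88), (2, 86), (3, 84)]
    print(idle_trend_per_day(daily_idle))  # idle % gained or lost per day
    print(cpu_cost_per_op(80, 1000))       # busy CPU % per (op/s)
```

      Comparing the slope before and after a restart, or between primaries and secondaries, would show whether the degradation resets and how quickly it returns.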

      On September 10th, I bounced the primaries over to a separate set of servers. These new servers also experienced gradual degradation, and the previous primaries (now secondaries) continued to have less idle time than both the new primaries and the other set of secondaries.

      On September 13th, I upgraded the servers to larger instances and rebooted a number of them. After the reboot, all servers started behaving normally, but the new primaries are now starting to slow down.

      (1) Has anyone seen something like this before?
      (2) How can I help track this down?
      -Mike

        1. mongo.txt
          1 kB
        2. Screen Shot 2016-09-19 at 11.05.49 AM.png
          88 kB
        3. Screen Shot 2016-10-06 at 7.10.12 PM.png
          209 kB
        4. Screen Shot 2016-10-06 at 7.10.36 PM.png
          78 kB
        5. shard1iostat.log
          80.19 MB
        6. shard1ss.tar.gz
          51.95 MB
        7. shard2iostat.log
          80.05 MB
        8. shard2ss.tar.gz
          49.19 MB

            Assignee:
            kelsey.schubert@mongodb.com Kelsey Schubert
            Reporter:
            tewner Michael Tewner
            Votes:
            1
            Watchers:
            8

              Created:
              Updated:
              Resolved: