  1. Core Server
  2. SERVER-34027

Production issue under high load.

    Details

    • Type: Bug
    • Status: Investigating
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: 3.6.1
    • Fix Version/s: Backlog
    • Component/s: Stability
    • Operating System:
      ALL
    • Sprint:
      Dev Tools 2019-05-06, Dev Tools 2019-04-22

      Description

      We had an issue in production yesterday:

      • At ~16:31 UTC, unidentified queries started to read cold data intensively. Reads from disk increased greatly, as did eviction from mongo's page cache. This unexpected load lasted until ~17:12.
        The load by itself is not remarkable enough to report here, but it led to the following bad behaviour:
      • At 16:42:44 mongod literally froze. It did not respond to anything, wrote nothing to the log, and sent no statistics until 16:43:06 (about 22 seconds).
        I believe this is a bug.
        I'm attaching a diagnostic data file that covers that period, and monitoring screenshots for the long period (16:20-17:30) and for a short period focused on the issue (16:40-16:45).

      Environment: AWS i3.16xlarge; the mongo data path is placed on an LVM volume over two NVMe devices.

        Attachments

        1. decommit.png
          146 kB
        2. diagnostic.tar.gz
          10.05 MB
        3. MongoStat-2018-03-21-long.png
          457 kB
        4. MongoStat-2018-03-21-short.png
          391 kB
        5. SystemStat-2018-03-21-long.png
          492 kB
        6. SystemStat-2018-03-21-short.png
          325 kB

              People

              • Votes:
                1
              • Watchers:
                17
