Core Server / SERVER-9500

Operations hang waitingForLock for hours with no yields

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.4.1
    • Environment: MongoDB hosted on an Azure-based Windows Server 2012 machine; queries issued through the C# driver; map-reduce run from a remote mongo.exe
    • Operating System: Windows
    • Steps to reproduce: Can't reproduce outside of our production env, but this seems to happen each time the long MR operation is executed.

    Description

      While running a long mapReduce (on a collection of ~2M records, 8GB on disk), mongo enters a catatonic state: the map-reduce operation stops using local resources (no CPU or disk activity, although both were high when the MR started) but never completes.

      db.currentOp() shows 129 ops running, all waiting for a global write lock, never completing and never yielding. Other queries work, but these ops are "stuck" and stay that way until the mongo service is restarted. The first op, with the longest run time, is the map op; the rest are unrelated ops against the DB from other apps.
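
      The check described above can be sketched as follows. This is a minimal sketch, assuming the `inprog`, `waitingForLock`, `secs_running` and `numYields` fields as they appear in db.currentOp() output; `stuckOps` is a hypothetical helper name, not part of the shell.

```javascript
// Hypothetical helper: list the operations in a db.currentOp() result that
// are blocked waiting for a lock, longest-running first.
// Assumes the result has an `inprog` array whose entries carry
// `waitingForLock`, `secs_running` and `numYields`.
function stuckOps(currentOpResult) {
  return (currentOpResult.inprog || [])
    .filter(function (op) { return op.waitingForLock === true; })
    .sort(function (a, b) {
      return (b.secs_running || 0) - (a.secs_running || 0);
    })
    .map(function (op) {
      // Keep only the fields that matter when looking for a stalled op.
      return {
        opid: op.opid,
        op: op.op,
        ns: op.ns,
        secs_running: op.secs_running,
        numYields: op.numYields
      };
    });
}

// In the mongo shell (not executed here):
//   printjson(stuckOps(db.currentOp()));
```

      In a situation like the one reported, the first entry of the result would be expected to be the map op, followed by the unrelated ops queued behind it.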

      Notes:

      • I waited 20 minutes, ran db.currentOp() again, and diffed the two outputs: no field changed except 'secs_running'. Most importantly, 'numYields' did not change.
      • The server log shows nothing in this state (only MMS connections). There are no errors around the time the MR appears to have stopped working; the last entry relating to the MR is: Mon Apr 29 08:14:36.921 [conn11315] M/R: (1/3) Emit Progress: 52700/87827 60%
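
      The snapshot comparison from the first note can be sketched like this. It is an illustration only, assuming two saved db.currentOp() results with an `inprog` array keyed by `opid`; `changedFields` is a hypothetical name, and comparing values via JSON.stringify is a simplification that relies on nested fields being serialized in the same order in both snapshots.

```javascript
// Hypothetical helper: for every op present in both snapshots, report which
// top-level fields differ between the earlier and the later db.currentOp() result.
function changedFields(before, after) {
  var byOpid = {};
  (before.inprog || []).forEach(function (op) { byOpid[op.opid] = op; });

  var report = [];
  (after.inprog || []).forEach(function (op) {
    var prev = byOpid[op.opid];
    if (!prev) { return; } // op started after the first snapshot
    var changed = Object.keys(op).filter(function (key) {
      // Simplified deep comparison via serialization (see assumption above).
      return JSON.stringify(op[key]) !== JSON.stringify(prev[key]);
    });
    report.push({ opid: op.opid, changed: changed });
  });
  return report;
}

// In the mongo shell (not executed here):
//   var first = db.currentOp();
//   // ... wait, then ...
//   var second = db.currentOp();
//   printjson(changedFields(first, second));
```

      An op whose `changed` list contains only `secs_running`, and not `numYields`, matches the behaviour described in the note.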

      Attached are:

      • db.currentOp() output (anonymized with ***).
      • db.serverStatus() output
      • Screenshot of MMS
      • mapReduce code (somewhat obfuscated)

      Attachments

        1. currentOp anon.txt (147 kB)
        2. MMS 1.PNG (108 kB)
        3. MMS 2.PNG (90 kB)
        4. MR_anon.js (2 kB)
        5. serverStatus 11.52.txt (12 kB)

          People

            Assignee: Unassigned
            Reporter: guy pitelko (gpgemini)
            Votes: 0
            Watchers: 7
