Operations hang waitingForLock for hours with no yields

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.4.1
    • Component/s: None
    • Environment:
      MongoDB hosted on an Azure-based Windows Server 2012 machine; queries issued through the C# driver, map-reduce run from a remote mongo.exe shell
    • Windows
    • Can't reproduce outside of our production env, but this seems to happen each time the long MR operation is executed.

      While running a long mapReduce (on a collection of ~2M records, 8 GB on disk), MongoDB enters an unresponsive state: the map-reduce operation stops using local resources (no CPU or disk activity) but never completes. When the MR started, CPU and disk usage were high.

      db.currentOp() shows 129 running ops, all waiting for a global write lock, never completing and never yielding. Other queries work, but these ops are "stuck" and remain so until the MongoDB service is restarted. The first op, which has the longest run time, is the map op; the rest are unrelated ops issued against the DB by other apps.
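
A minimal sketch of how the db.currentOp() output described above could be summarized, assuming the usual `inprog` array with `opid`, `secs_running` and `waitingForLock` fields. The function name `summarizeWaitingOps` and the sample data are hypothetical and not taken from the attached files.

```javascript
// Hypothetical helper: summarize a db.currentOp()-style document.
// Written in ES5 syntax so it should also load in an older mongo shell.
function summarizeWaitingOps(currentOp) {
  var ops = currentOp.inprog || [];

  // Ops that report they are blocked on a lock.
  var waiting = ops.filter(function (op) {
    return op.waitingForLock === true;
  });

  // The op with the largest secs_running (the oldest one still in progress).
  var longest = ops.reduce(function (best, op) {
    var secs = op.secs_running || 0;
    return best === null || secs > (best.secs_running || 0) ? op : best;
  }, null);

  return {
    total: ops.length,
    waiting: waiting.length,
    longestOpid: longest ? longest.opid : null,
    longestSecs: longest ? longest.secs_running || 0 : 0
  };
}

// Made-up sample shaped like currentOp output; values are illustrative only.
var sampleCurrentOp = {
  inprog: [
    { opid: 101, op: "query", ns: "mydb.$cmd", secs_running: 7200, waitingForLock: true, numYields: 0 },
    { opid: 102, op: "update", ns: "mydb.events", secs_running: 5400, waitingForLock: true, numYields: 0 },
    { opid: 103, op: "insert", ns: "mydb.events", secs_running: 5390, waitingForLock: true, numYields: 0 }
  ]
};
var summary = summarizeWaitingOps(sampleCurrentOp);
```

In the mongo shell, something like `printjson(summarizeWaitingOps(db.currentOp()))` would give the total and waiting counts plus the opid of the longest-running op.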

      Notes:

      • I waited 20 minutes, ran db.currentOp() again, and diffed the two outputs: no field changed except 'secs_running'. In particular, 'numYields' did not change.
      • The server log shows nothing in this state (only MMS connections). There are no errors around the time the MR appears to have stopped working; the last log entry relating to the MR is: Mon Apr 29 08:14:36.921 [conn11315] M/R: (1/3) Emit Progress: 52700/87827 60%
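
The comparison of two db.currentOp() snapshots described in the notes could be automated roughly as follows. This is only a sketch: `diffSnapshots` is a hypothetical helper, the sample snapshots are invented, and it assumes ops can be matched by `opid` and that fields compare reliably through `JSON.stringify`.

```javascript
// Hypothetical helper: compare two db.currentOp()-style snapshots.
// For every op present in both, list the fields whose values changed
// (skipping the names in `ignored`) and the change in numYields.
// Fields that exist only in the earlier snapshot are not examined.
function diffSnapshots(before, after, ignored) {
  var previousByOpid = {};
  (before.inprog || []).forEach(function (op) {
    previousByOpid[op.opid] = op;
  });

  var report = [];
  (after.inprog || []).forEach(function (op) {
    var prev = previousByOpid[op.opid];
    if (!prev) {
      return; // op started after the first snapshot was taken
    }
    var changed = Object.keys(op).filter(function (key) {
      return ignored.indexOf(key) === -1 &&
        JSON.stringify(op[key]) !== JSON.stringify(prev[key]);
    });
    report.push({
      opid: op.opid,
      changed: changed,
      yieldDelta: (op.numYields || 0) - (prev.numYields || 0)
    });
  });
  return report;
}

// Invented snapshots taken 20 minutes (1200 s) apart; only secs_running moves.
var snapshotBefore = {
  inprog: [
    { opid: 101, secs_running: 7200, waitingForLock: true, numYields: 0 },
    { opid: 102, secs_running: 5400, waitingForLock: true, numYields: 0 }
  ]
};
var snapshotAfter = {
  inprog: [
    { opid: 101, secs_running: 8400, waitingForLock: true, numYields: 0 },
    { opid: 102, secs_running: 6600, waitingForLock: true, numYields: 0 }
  ]
};
var stuckReport = diffSnapshots(snapshotBefore, snapshotAfter, ["secs_running"]);
```

An op that is making progress would normally show a non-empty `changed` list or a positive `yieldDelta`; an entry with neither matches the "stuck" behavior reported here.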

      Attached are:

      • db.currentOp() output (anonymized with ***).
      • db.serverStatus() output
      • Screenshot of MMS
      • mapReduce code (somewhat obfuscated)

        1. serverStatus 11.52.txt (12 kB)
        2. MR_anon.js (2 kB)
        3. MMS 2.PNG (90 kB)
        4. MMS 1.PNG (108 kB)
        5. currentOp anon.txt (147 kB)

            Assignee: Unassigned
            Reporter: guy pitelko
            Votes: 0
            Watchers: 7
