Inconsistency between the LockManager grantedList and grantedCount

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 3.0.12, 3.2.10, 3.4.0-rc0
    • Component/s: Stability
    • Operating System: ALL
    • Unpredictable; seems to reproduce on secondaries (not sure; seen 3 times on a secondary replica in two weeks)
    • Sprint: Sharding 2016-09-19, Sharding 2016-10-10, Sharding 2016-10-31

      All threads hang waiting on locks. The state of the LockHead indicates an inconsistency between the grantedList and grantedCounts: there are no granted requests, yet the granted counts are non-zero:

      (gdb) p $18
      $38 = {
        resourceId = {
          _fullHash = 2305843009213693953
        },
        grantedList = {
          _front = 0x0,
          _back = 0x0
        },
        grantedCounts = {0, 1, 0, 0, 0},
        grantedModes = 2,
        conflictList = {
          _front = 0x7f08028,
          _back = 0x5902cf628
        },
        conflictCounts = {0, 1490, 0, 0, 1},
        conflictModes = 18,
        partitions = {
          <std::_Vector_base<mongo::LockManager::Partition*, std::allocator<mongo::LockManager::Partition*> >> = {
            _M_impl = {
              <std::allocator<mongo::LockManager::Partition*>> = {
                <__gnu_cxx::new_allocator<mongo::LockManager::Partition*>> = {<No data fields>}, <No data fields>},
              members of std::_Vector_base<mongo::LockManager::Partition*, std::allocator<mongo::LockManager::Partition*> >::_Vector_impl:
              _M_start = 0x7470640,
              _M_finish = 0x7470640,
              _M_end_of_storage = 0x7470680
            }
          }, <No data fields>},
        conversionsCount = 0,
        compatibleFirstCount = 0
      }
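
      Below is a minimal sketch of the kind of consistency check that would catch this state. It is illustrative only, not the actual server code: the LockMode values, the field names (grantedList, grantedCounts, grantedModes) and the modeMask() helper are assumptions inferred from the gdb output above. Re-deriving the per-mode counts and the mode bitmask from the granted list fails for the dumped LockHead, because the list is empty while grantedCounts[MODE_IS] == 1 and grantedModes == (1 << MODE_IS).

      // Simplified, hypothetical model of the LockHead bookkeeping shown above.
      #include <array>
      #include <cassert>
      #include <cstdint>

      enum LockMode { MODE_NONE = 0, MODE_IS = 1, MODE_IX = 2, MODE_S = 3, MODE_X = 4 };
      const int kLockModesCount = 5;

      // Bitmask with the bit for the given mode set (assumed encoding:
      // grantedModes == 2 == 1 << MODE_IS in the dump above).
      inline uint32_t modeMask(LockMode mode) { return 1u << mode; }

      struct LockRequest {
          LockMode mode;
          LockRequest* next;
      };

      struct LockHead {
          LockRequest* grantedFront = nullptr;                     // grantedList._front
          std::array<uint32_t, kLockModesCount> grantedCounts{};   // per-mode granted counts
          uint32_t grantedModes = 0;                               // bitmask of granted modes

          // Walk the granted list, recount the modes, and verify that the
          // cached counts and bitmask agree with it. For the state in the
          // dump this fires: the list is empty, but grantedCounts[MODE_IS] is 1.
          void checkGrantedInvariants() const {
              std::array<uint32_t, kLockModesCount> counted{};
              for (const LockRequest* it = grantedFront; it != nullptr; it = it->next)
                  counted[it->mode]++;

              uint32_t mask = 0;
              for (int mode = 0; mode < kLockModesCount; mode++) {
                  assert(counted[mode] == grantedCounts[mode]);
                  if (grantedCounts[mode] > 0)
                      mask |= modeMask(static_cast<LockMode>(mode));
              }
              assert(mask == grantedModes);
          }
      };

      Running such a check on every grant and release would pinpoint the first operation after which the cached counts and the actual list diverge.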
      

      The attachments contain the stacks of all threads and the output of db.currentOp().

      Here is our cluster info:

      • 3 shards, each a 3-member replica set
      • using both range- and hash-based sharding
      • collection sizes from 50 GB to 500 GB

      Attachments:

        1. currentOp.out (910 kB, xiaost)
        2. gdb.withsymbols.out (6.60 MB, xiaost)
        3. LockMgrInvariants.diff (14 kB, Kaloian Manassiev)
        4. server_status.txt (24 kB, xihui he)
        5. stacks.txt (6.14 MB, xiaost)

            Assignee: Kaloian Manassiev
            Reporter: xiaost
            Votes: 1
            Watchers: 13
