Core Server / SERVER-59888

Deadlock between BucketCatalog::_statesMutex and BucketCatalog::_idleMutex when expiring idle buckets

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • Fix Version/s: 5.0.3, 5.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Fully Compatible
    • ALL
    • v5.0
    • Execution Team 2021-09-20

      Consider two concurrent time-series inserts: one which inserts into an existing bucket, and one which expires an idle bucket in order to allocate a new one.

      The former acquires BucketCatalog::_statesMutex inside of BucketCatalog::BucketAccess::_confirmStateForAcquiredBucket and then later acquires BucketCatalog::_idleMutex inside of BucketCatalog::_markBucketNotIdle.

      The latter acquires BucketCatalog::_idleMutex inside of BucketCatalog::_expireIdleBuckets and then later acquires BucketCatalog::_statesMutex inside of BucketCatalog::_removeBucket.

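      If each thread acquires its first mutex before the other requests its second, each ends up waiting on the mutex the other holds: deadlock.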
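
One general way to rule out this kind of lock-order inversion is to acquire both mutexes in a single step with a deadlock-avoiding primitive. The sketch below is illustrative only: `ToyCatalog`, `reuseBucket`, `expireIdleBucket` and `runBothPaths` are hypothetical names, the two members merely stand in for `_statesMutex` and `_idleMutex`, and nothing here is claimed to be the change that resolved this ticket. It assumes C++17 for `std::scoped_lock`.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-in for the catalog. The member names mirror the ticket,
// but this is not the server's real class.
struct ToyCatalog {
    std::mutex statesMutex;  // plays the role of _statesMutex
    std::mutex idleMutex;    // plays the role of _idleMutex
    int reuses = 0;          // only modified while both mutexes are held
    int expirations = 0;     // only modified while both mutexes are held

    // Path 1: insert into an existing bucket (states first, then idle).
    void reuseBucket() {
        // std::scoped_lock locks all of its arguments using a
        // deadlock-avoidance algorithm, so argument order does not matter.
        std::scoped_lock lk(statesMutex, idleMutex);
        ++reuses;
    }

    // Path 2: expire an idle bucket (idle first, then states).
    void expireIdleBucket() {
        std::scoped_lock lk(idleMutex, statesMutex);
        ++expirations;
    }
};

// Runs both paths concurrently and returns the total number of completed
// operations. With two separate nested lock_guards in opposite orders,
// this function could hang instead of returning.
int runBothPaths(int iterations) {
    ToyCatalog catalog;
    std::thread a([&] {
        for (int i = 0; i < iterations; ++i) catalog.reuseBucket();
    });
    std::thread b([&] {
        for (int i = 0; i < iterations; ++i) catalog.expireIdleBucket();
    });
    a.join();
    b.join();
    return catalog.reuses + catalog.expirations;
}
```

Calling `runBothPaths(n)` should return `2 * n`. The alternative of agreeing on one fixed acquisition order across all call sites gives the same guarantee, but it requires every path to be audited, which is harder when the two acquisitions sit in different functions as they do in the traces in this ticket.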

      The full BucketCatalog stack traces to reach this deadlock are:

      mongo::BucketCatalog::insert
      mongo::BucketCatalog::BucketAccess::BucketAccess
      mongo::BucketCatalog::BucketAccess::_findOpenBucketThenLock
      mongo::BucketCatalog::BucketAccess::_confirmStateForAcquiredBucket
      mongo::BucketCatalog::_markBucketNotIdle
      

      and

      mongo::BucketCatalog::insert
      mongo::BucketCatalog::BucketAccess::BucketAccess
      mongo::BucketCatalog::BucketAccess::_findOrCreateOpenBucketThenLock
      mongo::BucketCatalog::BucketAccess::_create
      mongo::BucketCatalog::_allocateBucket
      mongo::BucketCatalog::_expireIdleBuckets
      mongo::BucketCatalog::_removeBucket
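
The inversion in the two traces above can be detected mechanically by recording which mutex is already held when another is acquired. The following is a minimal sketch under stated assumptions: `LockOrderRecorder`, `recordPath` and `ticketPathsInvert` are hypothetical names, each path is modelled as a plain list of mutex names, and every earlier mutex in a path is assumed to still be held when a later one is acquired. It is not an existing server facility.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Records "held -> acquired" pairs across code paths and flags a path that
// acquires some pair in the opposite order to a previously recorded path.
class LockOrderRecorder {
public:
    // acquisitions: mutex names in the order one code path takes them.
    // Returns true if this path inverts an ordering seen earlier.
    bool recordPath(const std::vector<std::string>& acquisitions) {
        bool inverted = false;
        for (std::size_t i = 0; i < acquisitions.size(); ++i) {
            for (std::size_t j = i + 1; j < acquisitions.size(); ++j) {
                const std::string& held = acquisitions[i];
                const std::string& acquired = acquisitions[j];
                // An earlier path took `acquired` before `held`.
                if (_edges.count(std::make_pair(acquired, held)) > 0) {
                    inverted = true;
                }
                _edges.emplace(held, acquired);
            }
        }
        return inverted;
    }

private:
    std::set<std::pair<std::string, std::string>> _edges;
};

// Feeds in the two orderings described in this ticket. The first path is
// accepted and the second is flagged as an inversion.
bool ticketPathsInvert() {
    LockOrderRecorder recorder;
    // Reached via _confirmStateForAcquiredBucket -> _markBucketNotIdle.
    bool first = recorder.recordPath({"_statesMutex", "_idleMutex"});
    // Reached via _expireIdleBuckets -> _removeBucket.
    bool second = recorder.recordPath({"_idleMutex", "_statesMutex"});
    return !first && second;
}
```

A check of this shape only reports that two orderings conflict. It does not say which path should change.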
      

      Attached is a patch which reproduces the deadlock.

            Assignee:
            gregory.noma@mongodb.com Gregory Noma
            Reporter:
            gregory.noma@mongodb.com Gregory Noma
