  1. Core Server
  2. SERVER-61309

Fix time-series bucket lock reacquisition logic

Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • None
    • 5.2.0, 5.0.5, 5.1.1
    • None
    • Fully Compatible
    • ALL
    • v5.1, v5.0
    • Execution Team 2021-11-15
    • 159

    Description

      Currently we maintain a set of bucket pointers, and a WriteBatch refers to its bucket by pointer. This is not robust against pointer reuse by the memory allocator: a stale pointer can compare equal to the address of a newly allocated bucket. Because some situations can result in batches being used after their buckets have been released, we want to use the bucket OID, rather than the pointer, as the primary reference.
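
      To illustrate the idea of referring to a bucket by identifier rather than by address, here is a minimal sketch. It is not the server's implementation: ToyBucketCatalog, BucketId (an integer standing in for the bucket OID), and the open/remove/find methods are hypothetical names chosen for this example, and all locking is omitted. The point it shows is that a batch holding an ID must re-resolve the bucket on each use, so a released bucket is reported as missing instead of being silently confused with whatever now occupies the same memory.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the bucket OID: never reused within one catalog.
using BucketId = std::uint64_t;

struct Bucket {
    BucketId id;
    std::string metadata;
};

// A batch remembers its bucket by ID, never by raw pointer.
struct WriteBatch {
    BucketId bucketId;
};

// Toy catalog (assumption for illustration only; no locking, no expiry).
class ToyBucketCatalog {
public:
    // Opens a new bucket and returns a batch that refers to it by ID.
    WriteBatch open(const std::string& metadata) {
        const BucketId id = _nextId++;
        const bool inserted =
            _buckets.emplace(id, std::make_unique<Bucket>(Bucket{id, metadata})).second;
        assert(inserted);  // IDs are monotonically increasing, so this cannot collide.
        (void)inserted;
        return WriteBatch{id};
    }

    // Releases the bucket; batches that still hold its ID become stale.
    void remove(BucketId id) {
        _buckets.erase(id);
    }

    // Re-resolves the batch's bucket. Returns nullptr if it was released,
    // even if a newer bucket happens to live at the same address.
    Bucket* find(const WriteBatch& batch) const {
        auto it = _buckets.find(batch.bucketId);
        return it == _buckets.end() ? nullptr : it->second.get();
    }

private:
    BucketId _nextId = 1;
    std::unordered_map<BucketId, std::unique_ptr<Bucket>> _buckets;
};
```

      In this sketch, a batch opened before remove() gets nullptr from find() afterwards, regardless of where the allocator places any bucket opened later. A pointer-keyed set cannot make that distinction once an address has been reused.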

      Because bucket reacquisition happens in a number of places and the failures are non-deterministic, it is hard to identify the exact set of circumstances under which we would run into this issue. The linked BFs and HELP tickets should shed additional light on the possible symptoms, which include deadlock, inconsistent state (with similarities to use-after-free bugs), and crashes. Stack traces will likely show one or more of the following methods:

      mongo::BucketCatalog::_expireIdleBuckets
      mongo::BucketCatalog::BucketAccess::_findOpenBucketThenLock
      mongo::BucketCatalog::BucketAccess::rollover
      mongo::BucketCatalog::_removeBucket
      mongo::BucketCatalog::finish
      mongo::BucketCatalog::_waitToCommitBatch

            People

              dan.larkin-york@mongodb.com Dan Larkin-York
