  1. Core Server
  2. SERVER-40318

Condition variable wait in NamespaceSerializer::lock is not exception safe



    • Fully Compatible
    • ALL
    • v4.0
    • Sharding 2019-04-22
    • 5


      The NamespaceSerializer is essentially an in-memory cache of the distributed lock meant to synchronize sharded metadata operations that must run on the config server primary, like _configsvrCreateCollection and _configsvrDropCollection. Roughly, the class works like this:

      1. Threads wishing to lock a namespace call NamespaceSerializer::lock() which takes a class mutex.
      2. Inside, it looks up the namespace in a map whose entries each hold a condition variable, a waiters counter, and an inProgress boolean.
        1. If there is no entry, one is created with a new condition variable, a waiters counter of 1, and inProgress boolean of true.
        2. If there is one, the thread increments its waiters count and waits on its condition variable for inProgress to be false, setting it to true once it can proceed.
      3. After this, the method returns a ScopedLock object whose destructor decrements the waiters counter, sets inProgress to false, and calls notify_one() on the condition variable.
        1. If the waiters counter is 0, the entry for the namespace is removed from the map.
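
      The steps above can be sketched roughly as follows. This is a minimal illustration built on standard-library primitives, not the actual server code: the class name NamespaceSerializerSketch and the entryCount() helper are hypothetical, and the wait here is a plain, non-interruptible std::condition_variable wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the class described in the ticket.
class NamespaceSerializerSketch {
    struct Entry {
        std::condition_variable cv;
        int waiters = 1;         // the creating thread counts as the first waiter
        bool inProgress = true;  // the creating thread proceeds immediately
    };

public:
    class ScopedLock {
    public:
        ScopedLock(NamespaceSerializerSketch& owner, std::string ns)
            : _owner(&owner), _ns(std::move(ns)) {}
        ScopedLock(ScopedLock&& other) noexcept
            : _owner(other._owner), _ns(std::move(other._ns)) {
            other._owner = nullptr;  // the moved-from object releases nothing
        }
        ScopedLock(const ScopedLock&) = delete;
        ScopedLock& operator=(const ScopedLock&) = delete;

        // Step 3: decrement waiters, clear inProgress, wake one waiter,
        // and drop the entry once nobody references it.
        ~ScopedLock() {
            if (!_owner) return;
            std::lock_guard<std::mutex> lk(_owner->_mutex);
            auto it = _owner->_entries.find(_ns);
            Entry& entry = *it->second;
            --entry.waiters;
            entry.inProgress = false;
            entry.cv.notify_one();
            if (entry.waiters == 0) _owner->_entries.erase(it);
        }

    private:
        NamespaceSerializerSketch* _owner;
        std::string _ns;
    };

    ScopedLock lock(const std::string& ns) {
        std::unique_lock<std::mutex> lk(_mutex);  // step 1: class mutex
        auto it = _entries.find(ns);              // step 2: look up the entry
        if (it == _entries.end()) {
            _entries.emplace(ns, std::make_shared<Entry>());
        } else {
            std::shared_ptr<Entry> entry = it->second;
            ++entry->waiters;
            entry->cv.wait(lk, [&] { return !entry->inProgress; });
            entry->inProgress = true;
        }
        return ScopedLock(*this, ns);
    }

    // Hypothetical helper, used only to observe the map in examples.
    std::size_t entryCount() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _entries.size();
    }

private:
    std::mutex _mutex;
    std::map<std::string, std::shared_ptr<Entry>> _entries;
};
```

      In this sketch, taking and releasing the lock for a namespace with no contention creates the entry and then removes it again when the ScopedLock goes out of scope.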

      The waiters counter increment and the condition variable wait both happen before the ScopedLock object is created, and the wait is interruptible. A request with maxTimeMS (or one that is killed) may therefore throw after incrementing the counter, and because no ScopedLock exists yet, nothing decrements it again. The counter can then never reach 0, and the entry for the namespace will never be removed.

      Interestingly, the condition variable's predicate will still be correct once the ScopedLock that the interrupted request was waiting on is destructed (because inProgress is set to false), so the next attempt to lock the serializer should succeed without waiting. However, because the destructor uses notify_one, if more than one thread was waiting on the lock and the interrupted request was the one signaled, the other waiter(s) will hang.
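
      One possible way to make the wait exception safe is sketched below. It is an illustration under stated assumptions, not the fix that was applied to the server: the class name SafeSerializerSketch, the WaiterGuard helper, entryCount(), and the maxTime parameter are all hypothetical, and a timed wait that throws std::runtime_error stands in for the interruptible wait. The guard undoes the waiters increment if the wait throws, and passes the wakeup on so that a notify_one delivered to the interrupted thread is not lost.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>
#include <utility>

class SafeSerializerSketch {  // hypothetical name, standard-library primitives only
    struct Entry {
        std::condition_variable cv;
        int waiters = 1;
        bool inProgress = true;
    };
    // Rolls back the waiters increment unless the wait completed normally.
    struct WaiterGuard {
        Entry& entry;
        bool dismissed = false;
        ~WaiterGuard() {
            if (dismissed) return;
            --entry.waiters;
            entry.cv.notify_one();  // hand a possibly consumed wakeup to another waiter
        }
    };

public:
    class ScopedLock {
    public:
        ScopedLock(SafeSerializerSketch& owner, std::string ns)
            : _owner(&owner), _ns(std::move(ns)) {}
        ScopedLock(ScopedLock&& other) noexcept
            : _owner(other._owner), _ns(std::move(other._ns)) {
            other._owner = nullptr;
        }
        ScopedLock(const ScopedLock&) = delete;
        ScopedLock& operator=(const ScopedLock&) = delete;
        ~ScopedLock() {
            if (!_owner) return;
            std::lock_guard<std::mutex> lk(_owner->_mutex);
            auto it = _owner->_entries.find(_ns);
            Entry& entry = *it->second;
            --entry.waiters;
            entry.inProgress = false;
            entry.cv.notify_one();
            if (entry.waiters == 0) _owner->_entries.erase(it);
        }

    private:
        SafeSerializerSketch* _owner;
        std::string _ns;
    };

    // maxTime is an assumed stand-in for maxTimeMS; expiry models an interruption.
    ScopedLock lock(const std::string& ns, std::chrono::milliseconds maxTime) {
        std::unique_lock<std::mutex> lk(_mutex);
        auto it = _entries.find(ns);
        if (it == _entries.end()) {
            _entries.emplace(ns, std::make_shared<Entry>());
            return ScopedLock(*this, ns);
        }
        std::shared_ptr<Entry> entry = it->second;
        ++entry->waiters;
        WaiterGuard guard{*entry};  // destroyed before lk, so it runs under the mutex
        const auto deadline = std::chrono::steady_clock::now() + maxTime;
        while (entry->inProgress) {
            if (entry->cv.wait_until(lk, deadline) == std::cv_status::timeout)
                throw std::runtime_error("interrupted while waiting for namespace");
        }
        guard.dismissed = true;  // from here on ScopedLock owns the decrement
        entry->inProgress = true;
        return ScopedLock(*this, ns);
    }

    std::size_t entryCount() {  // hypothetical helper for observing the map
        std::lock_guard<std::mutex> lk(_mutex);
        return _entries.size();
    }

private:
    std::mutex _mutex;
    std::map<std::string, std::shared_ptr<Entry>> _entries;
};
```

      With this guard in place, a second lock attempt that times out while the namespace is held leaves the counter where it was, so the entry is still removed once the original holder releases its ScopedLock. Without the guard, the same sequence would leave the entry in the map indefinitely.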




            janna.golden@mongodb.com Janna Golden
            jack.mulrow@mongodb.com Jack Mulrow