Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-29703

Race condition when checking for callback cancellation in ReplicationCoordinator

    • Type: Icon: Bug Bug
    • Resolution: Works as Designed
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
    • Replication
    • ALL
    • 0

      There is a race condition that exists when checking for callback cancellation in ReplicationCoordinator (see _doMemberHeartbeat, for example). We check if the callback status is equal to CallbackCancelled, but this is not necessarily sufficient to see if 'cancel' was actually called on the callback handle already.

      Let’s say a callback, with CallbackState C, gets scheduled on the TaskExecutor, to be run at some time in the future. When it is time for that task to run, the ThreadPoolTaskExecutor::runCallback function will be run for C. The ‘runCallback’ function includes the following steps:

            1. If C.canceled==1, then set the status of CallbackArgs, ‘args’, for C to ErrorCodes::CallbackCancelled.
            2. Run task C with CallbackArgs ‘args’

      The problem arises when there is a separate thread that tries to call C.cancel() after step 1 has completed and before step 2 has started. The ThreadPoolTaskExecutor::cancel() method atomically changes C.canceled to a value of 1. But, if at the time of step 1’s execution, C.canceled==0, then the CallbackArgs status will not be set to ErrorCodes::CallbackCancelled. For example, consider the following scenario, with two separate threads:

            Initial State: C.canceled=0
            1. [runCallbackThread] If C.canceled==1, then set the status of CallbackArgs, ‘args’, for C to ErrorCodes::CallbackCancelled.
            2. [otherThread] Set C.canceled to 1
            3. [runCallbackThread] Run C with ‘args’

      This means that in the body of the callback C, checking if cbData.status==ErrorCodes::CallbackCanceled (as we do in ReplicationCoordinatorImpl::_doMemberHeartbeat, for example) is not sufficient, since it might indicate an OK status even if C.cancel() had already been called on the callback. The C.isCanceled() method is what should be checked to see if cancel() was actually called on C, since it checks the value of C.canceled directly.

      Cancellation is a tricky issue, but I think we could, generally, simplify how we check for cancellation to make our lives easier. The definition of a callback being “cancelled” should simply be whether or not the “cancel” method has been called on it some time in the past. To be more precise, it should be defined as the state of the “cancelled” field of CallbackState. Since this field is an atomic integer, all reads and writes to it are serialized, so there will never be an ambiguity about the state of that variable, regardless of how many threads try to access it. There is still the remaining issue of how we choose to make decisions based on that state, but that is a general problem when accessing state that may be read or modified by multiple threads.

      It would probably be prudent to review how we check for cancellation throughout the ReplicationCoordinator, to ensure that this issue does not appear in multiple locations.

            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            william.schultz@mongodb.com William Schultz
            0 Vote for this issue
            4 Start watching this issue