Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-59419

OperationContext::_ignoreInterrupts can be written to without synchronization by IgnoreInterruptionsGuard destructor

    • Service Arch
    • Fully Compatible
    • ALL
    • v6.3, v6.0, v5.0
    • Service Arch 2023-02-06, Service Arch 2023-02-20
    • 120

      Interruptible::IgnoreInterruptionsGuard calls Interruptible::popIgnoreInterrupts in its destructor, and Interruptible:: pushIgnoreInterrupts in its constructor, which, in the case of an OperationContext, results in  writes to OperationContext::_ignoreInterrupts that are not explicitly synchronized.

       

      The fundamental issue is that OperationContext: ignoreInterrupts can be written to by {pop,push}IgnoreInterrupts without any synchronization by the client-thread via runWithoutInterruption, but it can be read by other threads via getKillStatus. For example, _ignoreInterrupts can be concurrently read in CurOp::reportCurrentOpForClient, resulting in a data race. Since _ignoreInterrupts is a private member of opCtx and running without interruption is done by the Client thread, this should probably be synchronized via the Client lock. Callers of getKillStatus that are not the client thread (such as CurOp::reportCurrentOpForClient) do indeed take the Client lock before reading _ignoreInterrupts already.

      The general semantics for how the client lock protects these sorts of data races is that while the client-thread is entitled to _read certain opCtx data members lock-free, it still needs to hold the client-lock to write those data members. Conversely, non-client-threads can only read an opCtx's data when holding the client lock, and can never write these fields (the exception is _killCode, which other threads can write to, which has its own atomic synchronization).

      The reason there is a data race here is because this contract is broken for _ignoreInterrupts, as the client-thread writes to it without holding the client lock. So I think the 'right way' to solve this is probably to make sure that {push/pop}IgnoreInterrupts holds the Client lock and can only be called by the client-thread. 

            Assignee:
            george.wangensteen@mongodb.com George Wangensteen
            Reporter:
            gregory.noma@mongodb.com Gregory Noma
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved: