  1. Core Server
  2. SERVER-36400

Explicitly destroy the client on exiting the run body of each BackgroundJob

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.3, 4.1.2
    • Component/s: Storage
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Storage NYC 2018-08-13
    • Linked BF Score:
      3

      Description

      SERVER-34798 requires every client to be destroyed before ServiceContext is destroyed. However, WiredTigerCheckpointThread destroys its client asynchronously, which can race with the main thread because of this code in background.cpp:

          {
              // It is illegal to access any state owned by this BackgroundJob after leaving this
              // scope, with the exception of the call to 'delete this' below.
              stdx::unique_lock<stdx::mutex> l(_status->mutex);
              _status->state = Done;
              _status->done.notify_all();
          }
       
          if (selfDelete)
              delete this;
      }
      

      The state is set to "Done" while the thread is still running, so the thread_local client has not been destroyed yet. Setting the state to "Done" and notifying unblocks the main thread, which can then proceed all the way to the ServiceContext destructor. As a result, the WiredTigerCheckpointThread client can be destroyed by its own thread after the main thread has already destroyed ServiceContext.
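
The ordering described above can be illustrated with a minimal standard C++ sketch. It does not use the server's classes: `FakeClient`, `currentClient`, `record`, and `runJobWithoutGuard` are hypothetical names invented for illustration. It only shows that an object owned by a thread_local is destroyed at thread exit, that is, after the last statement of the thread body, which is where the "Done" notification happens.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical event log, used only to observe ordering in this sketch.
std::mutex logMutex;
std::vector<std::string> events;

void record(const std::string& e) {
    std::lock_guard<std::mutex> lk(logMutex);
    events.push_back(e);
}

// Hypothetical stand-in for a per-thread client object.
struct FakeClient {
    ~FakeClient() { record("client destroyed"); }
};

// Owned by the thread; destroyed automatically when the thread exits.
thread_local std::unique_ptr<FakeClient> currentClient;

std::vector<std::string> runJobWithoutGuard() {
    events.clear();
    std::thread t([] {
        currentClient = std::make_unique<FakeClient>();
        // Stand-in for setting the state to Done and notifying waiters.
        // Anything unblocked here may run before the client is destroyed.
        record("state = Done");
    });
    t.join();  // thread_local destructors have finished once join() returns
    return events;
}
```

Calling `runJobWithoutGuard()` records "state = Done" before "client destroyed". In the server, a waiter released by the notification does not wait for thread exit, so it can reach the ServiceContext destructor inside that window.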

      BF-10032 can be reproduced by adding a long sleep here.

      The fix should be similar to SERVER-35985: add an ON_BLOCK_EXIT in the run() function of WiredTigerCheckpointThread.
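
A rough sketch of the scope-guard idea, written in standard C++ rather than against the server code. `ScopeGuard` is a hypothetical stand-in for a utility like ON_BLOCK_EXIT, and `SketchClient`, `sketchClient`, `note`, `runBody`, and `runJobWithGuard` are invented names. The assumption is that the guard is declared inside the run body, so the client is destroyed when that body exits and before the caller marks the job as done.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical scope guard: runs the callback when the enclosing block exits.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> f) : _f(std::move(f)) {}
    ~ScopeGuard() {
        if (_f) _f();
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> _f;
};

// Hypothetical event log, used only to observe ordering in this sketch.
std::mutex guardedLogMutex;
std::vector<std::string> guardedEvents;

void note(const std::string& e) {
    std::lock_guard<std::mutex> lk(guardedLogMutex);
    guardedEvents.push_back(e);
}

// Hypothetical stand-in for a per-thread client object.
struct SketchClient {
    ~SketchClient() { note("client destroyed"); }
};

thread_local std::unique_ptr<SketchClient> sketchClient;

// Stand-in for a job's run(): the guard destroys the client on every exit path.
void runBody() {
    sketchClient = std::make_unique<SketchClient>();
    ScopeGuard destroyClient([] { sketchClient.reset(); });
    note("checkpoint work");
}

std::vector<std::string> runJobWithGuard() {
    guardedEvents.clear();
    std::thread t([] {
        runBody();
        note("state = Done");  // stand-in for the notification in background.cpp
    });
    t.join();
    return guardedEvents;
}
```

With the guard in place, `runJobWithGuard()` records "client destroyed" before "state = Done", so nothing released by the notification can observe a live client on this thread.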

      We should also check other BackgroundJobs that create clients in their run() functions.
