  1. Core Server
  2. SERVER-68438

Fix PrimaryOnlyService race condition with the PrimaryOnlyServiceClientObserver

Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Works as Designed
    • ALL
    • Step-up on a secondary during a tenant migration.
    • 131

    Description

      There is currently a race condition between the PrimaryOnlyService (POS) and the PrimaryOnlyServiceClientObserver.

      When a new primary steps up, the POS transitions from the `kRebuilding` state to the `kRunning` state.

      In this case, the instance starts running before the POS is able to transition from `kRebuilding` to `kRunning`. Because the OperationContext is created during the `run` of the PrimaryOnlyService, the PrimaryOnlyService ends up killing the OpCtx while it is still in that transition.

      The operation context is killed because, during that transition, the PrimaryOnlyServiceClientObserver, which registers the OperationContext, checks the current state and finds that it is still `kRebuilding`. The second condition, that `allowOpCtxWhileRebuilding` is set to true, no longer holds, because the AllowOpCtxWhenServiceRebuildingBlock has gone out of scope and reset the `allowOpCtxWhileRebuilding` flag to false.
      We end up in a state where we are still rebuilding, are no longer allowing the OperationContext while rebuilding, and are not yet in the `kRunning` state.
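
      The sequence described above can be illustrated with a small toy model. This is only a sketch and not the server code: `ServiceModel`, `AllowOpCtxBlock`, `wouldKillNewOpCtx`, `Outcome`, and `simulateStepUp` are hypothetical names invented for illustration, and the check is an assumed simplification of what the PrimaryOnlyServiceClientObserver does when an OperationContext is registered.

```cpp
#include <cassert>

// Toy stand-ins for the states named in the ticket (not the real server types).
enum class State { kRebuilding, kRunning };

// Hypothetical, simplified model of the service's relevant fields.
struct ServiceModel {
    State state = State::kRebuilding;
    bool allowOpCtxWhileRebuilding = false;
};

// RAII guard modeled on AllowOpCtxWhenServiceRebuildingBlock as the ticket
// describes it: the flag is set while the guard is alive and reset to false
// when the guard goes out of scope.
class AllowOpCtxBlock {
public:
    explicit AllowOpCtxBlock(ServiceModel& svc) : _svc(svc) {
        _svc.allowOpCtxWhileRebuilding = true;
    }
    ~AllowOpCtxBlock() {
        _svc.allowOpCtxWhileRebuilding = false;
    }

private:
    ServiceModel& _svc;
};

// Assumed form of the observer's check: a newly registered OperationContext
// is killed unless the service is running or the flag allows it.
bool wouldKillNewOpCtx(const ServiceModel& svc) {
    if (svc.state == State::kRunning) {
        return false;
    }
    return !svc.allowOpCtxWhileRebuilding;
}

struct Outcome {
    bool killedInsideBlock;
    bool killedInWindow;
    bool killedAfterRunning;
};

// Walks through the three phases of a step-up in this toy model.
Outcome simulateStepUp() {
    ServiceModel svc;
    Outcome out{};
    {
        AllowOpCtxBlock block(svc);
        // Rebuilding, but the guard is alive: the OpCtx survives.
        out.killedInsideBlock = wouldKillNewOpCtx(svc);
    }
    // Guard destroyed, state still kRebuilding: this is the race window.
    out.killedInWindow = wouldKillNewOpCtx(svc);

    svc.state = State::kRunning;
    // Once running, the OpCtx survives again.
    out.killedAfterRunning = wouldKillNewOpCtx(svc);
    return out;
}
```

      In this model, calling `simulateStepUp()` reports a kill only for the middle phase, which corresponds to an instance creating its OperationContext after the block has gone out of scope but before the state reaches `kRunning`.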

      Since the instance starts running before the POS state is able to transition from the `kRebuilding` state to the `kRunning` state, 

            People

              esha.maharishi@mongodb.com Esha Maharishi (Inactive)
              mathis.bessa@mongodb.com Mathis Bessa
              Votes: 0
              Watchers: 8
