  1. Core Server
  2. SERVER-68438

Fix PrimaryOnlyService race condition with the PrimaryOnlyServiceClientObserver

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • ALL
    • Step-up on a secondary during a tenant migration.
    • 131

      There is currently a race condition between the PrimaryOnlyService (POS) and the PrimaryOnlyServiceClientObserver.

      When a new primary steps up, the POS transitions from the `kRebuilding` state to the `kRunning` state.

      The race occurs when an instance starts running before the POS has transitioned from `kRebuilding` to `kRunning`. The OperationContext is created during the `run` of the PrimaryOnlyService, and the PrimaryOnlyService kills that OpCtx because it is still in the middle of the transition.

      The OperationContext is killed because, during that transition, the PrimaryOnlyServiceClientObserver, which registers the OperationContext, checks the current state and finds that it is still `kRebuilding`. The second condition, that `allowOpCtxWhileRebuilding` is set to true, no longer holds, because the AllowOpCtxWhenServiceRebuildingBlock has gone out of scope and reset the `allowOpCtxWhileRebuilding` flag to false.
      We end up in a state where the service is still rebuilding, no longer allows OperationContexts while rebuilding, and has not yet reached the `kRunning` state.
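
      The window described above can be illustrated with a small, single-threaded model. This is only a sketch under stated assumptions, not the server's code: `Service`, `AllowOpCtxBlock`, `wouldKillNewOpCtx`, and `opCtxKilledInRaceWindow` are hypothetical names, the flag is stored directly on the service for simplicity, and the observer's check is reduced to the two conditions named in this ticket.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the POS state and flag.
enum class State { kRebuilding, kRunning };

struct Service {
    State state = State::kRebuilding;
    bool allowOpCtxWhileRebuilding = false;
};

// RAII guard modelled on the described behaviour of
// AllowOpCtxWhenServiceRebuildingBlock: sets the flag on entry and
// resets it to false when the guard goes out of scope.
class AllowOpCtxBlock {
public:
    explicit AllowOpCtxBlock(Service& svc) : _svc(svc) {
        _svc.allowOpCtxWhileRebuilding = true;
    }
    ~AllowOpCtxBlock() {
        _svc.allowOpCtxWhileRebuilding = false;
    }
    AllowOpCtxBlock(const AllowOpCtxBlock&) = delete;
    AllowOpCtxBlock& operator=(const AllowOpCtxBlock&) = delete;

private:
    Service& _svc;
};

// Simplified version of the check the ticket attributes to the observer:
// a new OperationContext survives only if the service is running, or if
// it is rebuilding and the flag is still set.
bool wouldKillNewOpCtx(const Service& svc) {
    if (svc.state == State::kRunning) {
        return false;
    }
    if (svc.state == State::kRebuilding && svc.allowOpCtxWhileRebuilding) {
        return false;
    }
    return true;
}

// Replays the problematic ordering: the guard is destroyed first, an
// instance creates its OperationContext, and only then does the state
// move to kRunning. Returns whether that OperationContext is killed.
bool opCtxKilledInRaceWindow() {
    Service svc;
    {
        AllowOpCtxBlock allow(svc);
        // Inside the guard, an OperationContext would be allowed.
        assert(!wouldKillNewOpCtx(svc));
    }
    // Guard is gone, state is still kRebuilding: this is the window.
    const bool killed = wouldKillNewOpCtx(svc);
    svc.state = State::kRunning;
    return killed;
}
```

      In this model `opCtxKilledInRaceWindow()` returns true, and an OperationContext created after the state is set to `kRunning` is no longer killed.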

      Since the instance starts running before the POS state is able to transition from the `kRebuilding` state to the `kRunning` state, 

            Assignee:
            esha.maharishi@mongodb.com Esha Maharishi (Inactive)
            Reporter:
            mathis.bessa@mongodb.com Mathis Bessa (Inactive)
            Votes:
            0
            Watchers:
            9

              Created:
              Updated:
              Resolved: