TenantOplogBatcher should transition to complete when it fails to schedule the next batch and is shutting down


    • Fully Compatible
    • ALL
    • Repl 2021-02-22
    • 0
    • None
    • 3
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      In TenantOplogApplierTest, when the destructor of the TenantOplogApplier runs, it joins the underlying TenantOplogBatcher. The TenantOplogBatcher schedules _scheduleNextBatch with self = shared_from_this(), so its destruction may run after the test's tearDown(), where we also shut down the _executor. The job can therefore fail to schedule because the executor has already been shut down. When shutting down a TenantOplogBatcher, we assume that "if _batchRequested was true, we handle the _transitionToComplete when it becomes false". That holds when the next batch is scheduled successfully, but the assumption breaks if scheduling the next batch fails. So I think we should also call _transitionToComplete on job schedule failures if the TenantOplogBatcher is shutting down.

      I believe this can happen in production code as well: on the recipient, the TenantOplogBatcher could also try to schedule jobs after the TenantMigrationRecipientService's _scopedExecutor has been shut down (e.g. on stepDown). So we would still like to transition the TenantOplogBatcher to complete on job schedule failures if it is shutting down.
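
The proposed change can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the actual server code: ToyExecutor, ToyBatcher, requestBatch(), onBatchReady() and the two helper functions are hypothetical names invented for the example, and the model is single-threaded with no locking. It shows a batcher that defers completion while a batch is outstanding and, with the proposed change, also completes when scheduling the next batch fails during shutdown.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task executor: schedule() fails after shutdown().
class ToyExecutor {
public:
    bool schedule(std::function<void()> job) {
        if (_shutDown) return false;
        _jobs.push_back(std::move(job));
        return true;
    }
    void shutdown() { _shutDown = true; }
    void runPending() {
        std::vector<std::function<void()>> jobs;
        jobs.swap(_jobs);
        for (auto& job : jobs) job();
    }

private:
    bool _shutDown = false;
    std::vector<std::function<void()>> _jobs;
};

// Hypothetical, heavily simplified model of the batcher's shutdown handling.
class ToyBatcher : public std::enable_shared_from_this<ToyBatcher> {
public:
    enum class State { kRunning, kShuttingDown, kComplete };

    explicit ToyBatcher(ToyExecutor* executor) : _executor(executor) {}

    State state() const { return _state; }
    void requestBatch() { _batchRequested = true; }

    void shutdown() {
        if (_state != State::kRunning) return;
        _state = State::kShuttingDown;
        // With a batch outstanding, completion is deferred until the
        // request is cleared.
        if (!_batchRequested) _state = State::kComplete;
    }

    void scheduleNextBatch() {
        assert(_batchRequested);  // only called while a batch is outstanding
        auto self = shared_from_this();  // keeps the batcher alive for the job
        bool scheduled = _executor->schedule([self] { self->onBatchReady(); });
        if (!scheduled) {
            _batchRequested = false;
            // Proposed change: no job will run to clear the request, so
            // finish the deferred transition here.
            if (_state == State::kShuttingDown) _state = State::kComplete;
        }
    }

private:
    void onBatchReady() {
        _batchRequested = false;
        if (_state == State::kShuttingDown) _state = State::kComplete;
    }

    ToyExecutor* _executor;
    State _state = State::kRunning;
    bool _batchRequested = false;
};

// Executor is shut down first (as in tearDown() or stepDown), so scheduling fails.
ToyBatcher::State stateAfterFailedSchedule() {
    ToyExecutor executor;
    auto batcher = std::make_shared<ToyBatcher>(&executor);
    batcher->requestBatch();
    batcher->shutdown();   // completion deferred: a batch is outstanding
    executor.shutdown();
    batcher->scheduleNextBatch();
    return batcher->state();
}

// Scheduling succeeds and the job itself completes the shutdown.
ToyBatcher::State stateAfterSuccessfulSchedule() {
    ToyExecutor executor;
    auto batcher = std::make_shared<ToyBatcher>(&executor);
    batcher->requestBatch();
    batcher->shutdown();
    batcher->scheduleNextBatch();
    executor.runPending();
    return batcher->state();
}
```

In this model, removing the shutting-down check in the failure branch of scheduleNextBatch() leaves the batcher in kShuttingDown forever after a failed schedule, which corresponds to a join that never returns.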

              Assignee: Lingzhi Deng
              Reporter: Lingzhi Deng
              Votes: 0
              Watchers: 1
