  1. Core Server
  2. SERVER-63390

Abort merge on error from OpObserver or FileImporter

    • Type: Task
    • Resolution: Won't Do
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Serverless
    • 0

      Catch shard merge errors that arise outside of tenant_migration_recipient_service.cpp, e.g. a failure to copy or import a file in tenant_migration_recipient_op_observer.cpp or tenant_file_importer_service.cpp. These errors can occur on the recipient primary or on secondaries. The node that hits the error should inform the recipient primary so that it aborts the merge, probably via voteCommitMigrationProgress (which I'm renaming to recipientVoteImportedFiles).
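
As a rough illustration of the vote-based abort described above, the sketch below shows how a recipient primary might track per-node votes and abort on the first reported error. Everything here is hypothetical: ImportVoteTracker, MergeState, and recordVote are illustrative names, not part of the MongoDB codebase, and the real command's payload is not specified in this ticket.

```cpp
#include <optional>
#include <set>
#include <string>
#include <utility>

// Hypothetical primary-side bookkeeping for import votes. None of these names
// come from the MongoDB codebase.
enum class MergeState { kImporting, kAllImported, kAborted };

class ImportVoteTracker {
public:
    explicit ImportVoteTracker(std::set<std::string> members)
        : _pending(std::move(members)) {}

    // Records one node's vote. An empty `error` means the node copied and
    // imported every file; a non-empty `error` aborts the merge.
    MergeState recordVote(const std::string& node,
                          const std::optional<std::string>& error) {
        if (_state == MergeState::kAborted) {
            return _state;  // The first error wins; later votes are ignored.
        }
        if (error) {
            _state = MergeState::kAborted;
            _abortReason = node + ": " + *error;
            return _state;
        }
        _pending.erase(node);
        if (_pending.empty()) {
            _state = MergeState::kAllImported;
        }
        return _state;
    }

    MergeState state() const { return _state; }
    const std::string& abortReason() const { return _abortReason; }

private:
    std::set<std::string> _pending;  // Nodes that have not yet voted success.
    MergeState _state = MergeState::kImporting;
    std::string _abortReason;
};
```

In this sketch a single failing vote moves the merge to the aborted state right away, which is the prompt-abort behavior the ticket asks for, instead of leaving the primary waiting on a vote that will never arrive.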

      When the primary aborts the merge, all file importers must stop soon. Probably, the file importer that encountered the error should stop immediately (rather than waiting for the primary to abort the migration).
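
The two stopping rules above could be sketched as a single importer loop: stop immediately on a local error, and stop at the next file boundary once the primary has requested an abort. This is a minimal sketch under assumed names (importFiles, ImportResult, abortRequested, and the importOne callback are all hypothetical); the actual file importer may be structured quite differently.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <vector>

enum class ImportResult { kDone, kFailed, kCancelled };

// Hypothetical importer loop. `importOne` returns false when a file cannot be
// copied or imported; `abortRequested` is set once the primary aborts the merge.
ImportResult importFiles(const std::vector<std::string>& files,
                         const std::function<bool(const std::string&)>& importOne,
                         const std::atomic<bool>& abortRequested,
                         std::vector<std::string>& imported) {
    for (const auto& file : files) {
        if (abortRequested.load()) {
            return ImportResult::kCancelled;  // The primary aborted the merge.
        }
        if (!importOne(file)) {
            return ImportResult::kFailed;  // Stop immediately on a local error.
        }
        imported.push_back(file);
    }
    return ImportResult::kDone;
}
```

A caller that receives kFailed would then report the error to the primary, while kCancelled needs no report because the primary already knows about the abort.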

      Without this work, such failures will cause the migration to time out eventually instead of aborting promptly.

      Some JS tests (see jstests/replsets/tenant_migration_donor_interrupt_on_stepdown_and_shutdown.js) have been temporarily disabled until recipient secondaries can inform the recipient primary of errors (in this case, an error during cloning). In that test, the donor is shut down and cloning to the recipient secondaries partially fails. Because the secondaries cannot report the partial failure, the recipient primary moves on and transitions to the "learned filenames" state, so the secondaries start importing files even though cloning never completed.

            Assignee:
            backlog-server-serverless [DO NOT USE] Backlog - Server Serverless (Inactive)
            Reporter:
            jesse@mongodb.com A. Jesse Jiryu Davis
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: