  Core Server / SERVER-45409

Rollback-via-refetch should wait for aborted two-phase index build threads to exit

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 4.3.4
    • None
    • None
    • None
    • Fully Compatible
    • ALL
    • Execution Team 2020-01-27, Execution Team 2020-02-10
    • 33

    Description

      Before rollback, we abort and record all active index builds so that upon completion of rollback, we know which index builds may need to be restarted.

      The IndexBuildsCoordinator::onRollback() function signals these index builds to abort but does not wait for their threads to exit, so an aborted index build can still be running concurrently with rollback (not good).
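
A minimal sketch of the "abort, then wait for exit" ordering described above. This is not the server's code: `SketchCoordinator`, `startBuild`, `abortAllAndWait`, and `activeBuilds` are hypothetical names, and the worker loop merely stands in for a real index build thread. It assumes a single caller thread drives the coordinator.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an index build coordinator (illustration only).
class SketchCoordinator {
public:
    // Make sure no worker thread outlives the coordinator.
    ~SketchCoordinator() { abortAllAndWait(); }

    // Starts a worker that keeps "building" until an abort is requested.
    void startBuild(const std::string& indexName) {
        _names.push_back(indexName);
        _active.fetch_add(1);
        _workers.emplace_back([this] {
            while (!_abort.load()) {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
            _active.fetch_sub(1);  // the build thread is about to exit
        });
    }

    // Signals abort, then blocks until every build thread has exited.
    // Returns the names of the aborted builds so they can be restarted later.
    // The abort flag stays set afterwards; this sketch is single-use.
    std::vector<std::string> abortAllAndWait() {
        _abort.store(true);
        for (auto& worker : _workers) {
            worker.join();  // the step the description says is missing
        }
        _workers.clear();
        assert(_active.load() == 0);  // nothing can race with rollback now
        std::vector<std::string> aborted;
        aborted.swap(_names);
        return aborted;
    }

    int activeBuilds() const { return _active.load(); }

private:
    std::atomic<bool> _abort{false};
    std::atomic<int> _active{0};
    std::vector<std::thread> _workers;
    std::vector<std::string> _names;
};
```

With this ordering, the caller only proceeds to rollback once `activeBuilds()` is zero, so no aborted build thread can still be touching the collection while rollback runs.
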

      This can cause rollback to fail with errors like "There's already an index with name 'a_1' being built on the collection". This error is not an UnrecoverableRollbackError, so rollback-via-refetch will restart. On its next attempt, it no longer has information about the aborted index builds, and as a result, will not restart them. This leads to index inconsistencies.

      In general, any non-fatal error during rollback-via-refetch will result in two-phase index builds not being restarted, because the record of aborted builds does not survive into the next attempt. We should potentially store these aborted index builds at a higher level to be more resilient to this issue.
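
One way to picture "storing at a higher level" is to keep the set of aborted builds outside the per-attempt state, so a retry after a non-fatal error still knows what to restart. The sketch below is an assumption about the shape of such a fix, not the change that was made: `RetryableRollbackError`, `runRollbackWithRetries`, and both callbacks are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical non-fatal error: rollback may simply be attempted again.
struct RetryableRollbackError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Runs rollback attempts until one succeeds. The set of builds to restart
// lives across attempts, so a failed attempt does not lose the record.
std::set<std::string> runRollbackWithRetries(
    const std::function<std::vector<std::string>()>& abortActiveBuilds,
    const std::function<void()>& rollbackAttempt,
    int maxAttempts) {
    assert(maxAttempts > 0);
    std::set<std::string> toRestart;  // outlives any single attempt
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        // On a retry this usually returns nothing, because the builds were
        // already aborted by the first attempt.
        for (const auto& name : abortActiveBuilds()) {
            toRestart.insert(name);
        }
        try {
            rollbackAttempt();
            return toRestart;  // the caller restarts these builds
        } catch (const RetryableRollbackError&) {
            // Non-fatal: keep toRestart intact and try again.
        }
    }
    throw std::runtime_error("rollback did not succeed within maxAttempts");
}
```

If the set were instead created inside each attempt, the second attempt would start with an empty record, which is the failure mode the description reports.
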

          People

            louis.williams@mongodb.com Louis Williams