[SERVER-45409] Rollback-via-refetch should wait for aborted two-phase index build threads to exit Created: 07/Jan/20  Updated: 29/Oct/23  Resolved: 30/Jan/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.3.4

Type: Bug Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-45174 rollback should not abort single phas... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-01-27, Execution Team 2020-02-10
Participants:
Linked BF Score: 33

 Description   

Before rollback, we abort and record all active index builds so that upon completion of rollback, we know which index builds may need to be restarted.

The IndexBuildsCoordinator::onRollback() function does not wait for these index build threads to exit, so an aborted index build can still be running concurrently with rollback, which is unsafe.
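
A minimal sketch of the "signal abort, then wait for the thread to exit" ordering, assuming one worker thread per index build. ToyIndexBuild and ToyCoordinator are hypothetical names used only for illustration; this is not the server's IndexBuildsCoordinator and does not reflect the committed fix.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one index build running on its own thread.
struct ToyIndexBuild {
    std::string indexName;
    std::atomic<bool> abortRequested{false};
    std::atomic<bool> exited{false};
    std::thread worker;

    explicit ToyIndexBuild(std::string name) : indexName(std::move(name)) {
        // The worker polls the abort flag and marks itself exited on the way out.
        worker = std::thread([this] {
            while (!abortRequested.load()) {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
            exited.store(true);
        });
    }
};

// Hypothetical coordinator; only the ordering inside onRollback() matters here.
class ToyCoordinator {
public:
    void startBuild(const std::string& name) {
        builds_.push_back(std::make_unique<ToyIndexBuild>(name));
    }

    // Signals every active build to abort, records its name, and then joins
    // each worker so that no build thread is still running when this returns.
    std::vector<std::string> onRollback() {
        std::vector<std::string> aborted;
        for (auto& build : builds_) {
            build->abortRequested.store(true);
            aborted.push_back(build->indexName);
        }
        for (auto& build : builds_) {
            if (build->worker.joinable()) {
                build->worker.join();  // the wait that the description says is missing
            }
        }
        return aborted;
    }

    bool allExited() const {
        for (const auto& build : builds_) {
            if (!build->exited.load()) {
                return false;
            }
        }
        return true;
    }

    // Joins any remaining workers so destruction never leaves a joinable thread.
    ~ToyCoordinator() { onRollback(); }

private:
    std::vector<std::unique_ptr<ToyIndexBuild>> builds_;
};
```

With the join in place, a caller that starts builds "a_1" and "b_1" and then calls onRollback() gets both names back and can rely on allExited() being true before rollback proceeds. Without the join loop, the function could return while a worker is still between its abort check and its exit.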

This can cause rollback to fail with errors like "There's already an index with name 'a_1' being built on the collection". This error is not an UnrecoverableRollbackError, so rollback-via-refetch will restart. On its next attempt, it no longer has information about the aborted index builds, and as a result, will not restart them. This leads to index inconsistencies.

In general, any non-fatal error during rollback-via-refetch will result in two-phase index builds not being restarted. We should potentially store these aborted index builds at a higher level to be more resilient to this issue.
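
The "store at a higher level" idea can be sketched as follows, assuming the list of aborted builds is captured once, outside the retry loop, so that a retried attempt still knows what to restart. All names here (AttemptResult, RollbackOutcome, runRollbackWithRetries) are hypothetical and do not correspond to server code.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result of a single rollback attempt.
enum class AttemptResult { kOk, kRetryableError };

// Hypothetical summary of the whole rollback, across all attempts.
struct RollbackOutcome {
    int attempts = 0;
    std::vector<std::string> restarted;
};

// The aborted-build list lives outside the per-attempt state, so a
// non-fatal error on one attempt does not discard it.
RollbackOutcome runRollbackWithRetries(
    const std::vector<std::string>& abortedBuilds,
    const std::function<AttemptResult(int)>& attempt,
    int maxAttempts) {
    RollbackOutcome outcome;
    for (int i = 1; i <= maxAttempts; ++i) {
        outcome.attempts = i;
        if (attempt(i) == AttemptResult::kOk) {
            // Only a successful attempt restarts the builds recorded up front.
            outcome.restarted = abortedBuilds;
            break;
        }
    }
    return outcome;
}
```

If the first attempt fails with a retryable error and the second succeeds, the outcome reports two attempts and still restarts "a_1", because the list was never owned by the failed attempt.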



 Comments   
Comment by Githook User [ 30/Jan/20 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-45409 Rollback should wait for all index builds to complete. Fail rollback-via-refetch on the first attempt if index builds were aborted beforehand
Branch: master
https://github.com/mongodb/mongo/commit/31d27b87be16f2ebb5fd76cd8aef9ab65378def4

Comment by Louis Williams [ 13/Jan/20 ]

Determine whether recoverable rollback is also affected.

Update: recovery-to-stable (recoverable rollback) is not affected, because it relies on the durable catalog state to determine how to restart index builds.

Generated at Thu Feb 08 05:08:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.