[SERVER-45409] Rollback-via-refetch should wait for aborted two-phase index build threads to exit Created: 07/Jan/20 Updated: 29/Oct/23 Resolved: 30/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Louis Williams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Execution Team 2020-01-27, Execution Team 2020-02-10 |
| Participants: | |
| Linked BF Score: | 33 |
| Description |
|
Before rollback, we abort and record all active index builds so that, once rollback completes, we know which index builds may need to be restarted. However, IndexBuildsCoordinator::onRollback() does not wait for the aborted index build threads to exit, which makes it possible for index builds to run concurrently with rollback. This can cause rollback to fail with errors like "There's already an index with name 'a_1' being built on the collection".

This error is not an UnrecoverableRollbackError, so rollback-via-refetch will restart. On its next attempt, it no longer has information about the aborted index builds and, as a result, will not restart them. This leads to index inconsistencies.

In general, any non-fatal error during rollback-via-refetch will result in two-phase index builds not being restarted. We should consider storing these aborted index builds at a higher level to be more resilient to this issue. |
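The two ideas in the description (wait for aborted build threads to exit, and keep the record of aborted builds outside the retry loop) can be illustrated with a small toy model. This is only a sketch in Python, not the server's C++ implementation: `ToyIndexBuildsCoordinator`, `start_build`, `on_rollback`, and `rollback_with_retries` are hypothetical names invented for the example, and a plain `RuntimeError` stands in for a non-fatal rollback error.

```python
import threading
import time


class ToyIndexBuildsCoordinator:
    """Hypothetical stand-in for the coordinator; not the server's API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._builds = {}  # index name -> (abort event, worker thread)

    def start_build(self, index_name):
        abort = threading.Event()

        def run():
            # Simulate a long-running build that polls for an abort signal.
            while not abort.is_set():
                time.sleep(0.01)

        worker = threading.Thread(target=run, name=f"build-{index_name}")
        with self._lock:
            self._builds[index_name] = (abort, worker)
            worker.start()

    def on_rollback(self, timeout_secs=5.0):
        """Abort every active build, then block until each thread exits."""
        with self._lock:
            active = dict(self._builds)
            self._builds.clear()

        # Step 1: signal all builds to abort.
        for abort, _ in active.values():
            abort.set()

        # Step 2: wait for the threads to exit before rollback proceeds.
        for name, (_, worker) in active.items():
            worker.join(timeout_secs)
            if worker.is_alive():
                raise RuntimeError(f"index build {name!r} did not exit")

        # The caller keeps this list so the builds can be restarted later.
        return sorted(active)


def rollback_with_retries(coordinator, attempt, max_attempts=3):
    """Run `attempt` until it succeeds, keeping the aborted-build record."""
    # Recorded once, outside the retry loop (an assumption of this sketch),
    # so a non-fatal error on one attempt does not lose the record.
    aborted = coordinator.on_rollback()
    for _ in range(max_attempts):
        try:
            attempt()
            return aborted
        except RuntimeError:
            continue  # non-fatal in this toy model: retry
    raise RuntimeError("rollback failed after retries")
```

Because `on_rollback` joins each worker before returning, no build thread can still be running when `attempt` executes, and the list of aborted builds survives a failed first attempt.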
| Comments |
| Comment by Githook User [ 30/Jan/20 ] |
|
Author: Louis Williams (louiswilliams, louis.williams@mongodb.com)
Message:
| Comment by Louis Williams [ 13/Jan/20 ] |
|
Determine whether recoverable rollback is also affected.

Update: recovery-to-stable (recoverable rollback) is not affected, because it relies on the durable catalog state to determine how to restart index builds. |