[SERVER-51255] rollback persists resumable index info before index build thread is cleaned up Created: 30/Sep/20  Updated: 29/Oct/23  Resolved: 08/Oct/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug Priority: Major - P3
Reporter: Benety Goh Assignee: Benety Goh
Resolution: Fixed Votes: 0
Labels: pm-1344
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-46560 Make Abort index build logic determin... Closed
is related to SERVER-51238 index build has incorrect phase after... Closed
is related to SERVER-51008 adjust rollback to resume index build... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-10-19
Participants:
Linked BF Score: 37

 Description   

When an index build is interrupted for shutdown, the index build thread is responsible for persisting the resumable information to disk. This is done in the IndexBuildsCoordinator::_cleanUpTwoPhaseAfterFailure() function.

For rollback, the resumable index build information is written to disk by the thread (typically BackgroundSync) that stops the index build before the rollback process starts. The state of the index build thread is unclear at this point. In most cases, the index build is in a valid state for us to extract the resumable index build information, but this is not guaranteed while the index build thread may still be running.

This inconsistency between rollback and shutdown in how the resumable index build information is written out means that we have to either:

  • synchronize the shutdown of the index build thread with the rollback abort thread; or
  • have the index build thread write out the resumable index build state under both rollback and shutdown scenarios.
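The first option can be sketched as a toy model in Python rather than the server's C++. All names here (IndexBuild, abort_for_rollback, keys_scanned, the storage dict) are hypothetical stand-ins and not MongoDB's actual types. The only point illustrated is the ordering: the aborting thread joins the builder thread first, so the state it persists can no longer change underneath it.

```python
import threading


class IndexBuild:
    """Toy stand-in for an index build; not MongoDB's real data structures."""

    def __init__(self):
        self.interrupted = threading.Event()
        self.started = threading.Event()
        self.keys_scanned = 0  # mutated only by the builder thread while it runs
        self.thread = threading.Thread(target=self._run)

    def _run(self):
        # The builder keeps making progress until it observes the interrupt.
        while not self.interrupted.is_set():
            self.keys_scanned += 1
            self.started.set()


def abort_for_rollback(build, storage):
    """Runs on the aborting thread (BackgroundSync in the ticket)."""
    build.interrupted.set()
    # Join first: after this returns, the builder can no longer mutate its state.
    build.thread.join()
    # Only now is it safe to snapshot and persist the resumable state.
    storage["resume_info"] = {"keys_scanned": build.keys_scanned}


def run_demo():
    build = IndexBuild()
    storage = {}  # hypothetical stand-in for on-disk resume information
    build.thread.start()
    build.started.wait()  # let the builder make at least one step of progress
    abort_for_rollback(build, storage)
    return build, storage
```

Persisting before the join would snapshot a counter that the builder may still be advancing, which is the hazard named in the ticket title.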


 Comments   
Comment by Githook User [ 06/Oct/20 ]

Author: Benety Goh <benety@mongodb.com> (benety)

Message: SERVER-51255 rollback persists resumable index build info after joining builder thread
Branch: master
https://github.com/mongodb/mongo/commit/ae16f30da8c3acc89ead3ff6a753b2ad3985121d

Comment by Githook User [ 05/Oct/20 ]

Author: Benety Goh <benety@mongodb.com> (benety)

Message: SERVER-51255 clean up log message in MultiIndexBlock::_writeStateToDisk()
Branch: master
https://github.com/mongodb/mongo/commit/5ec3484ee93c60e34cd26b336943d1e514b2b835

Comment by Githook User [ 05/Oct/20 ]

Author: Benety Goh <benety@mongodb.com> (benety)

Message: SERVER-51255 log _runIndexBuildInner() result when joining thread after commit/abort
Branch: master
https://github.com/mongodb/mongo/commit/fb62d034c5cdf7df6f760be0f93cc057c812fd75

Comment by Benety Goh [ 01/Oct/20 ]

Currently, index builds are interrupted with the same IndexBuildAborted error code for both non-resumable (abortIndexBuild oplog entry) and resumable (rollback) scenarios. We could consider using an error code distinct from IndexBuildAborted (maybe IndexBuildAbortedForRollback?) to facilitate the decision making in IndexBuildsCoordinator::_cleanUpTwoPhaseAfterFailure().
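A minimal sketch of how a distinct error code could drive that decision, written as a Python toy model. The enum, the RESUMABLE_CODES set, and clean_up_after_failure are hypothetical and are not the server's API. It assumes, following the description, that rollback and shutdown persist resumable state while an abort from an abortIndexBuild oplog entry does not.

```python
from enum import Enum, auto


class AbortCode(Enum):
    INDEX_BUILD_ABORTED = auto()               # e.g. abortIndexBuild oplog entry
    INDEX_BUILD_ABORTED_FOR_ROLLBACK = auto()  # hypothetical code proposed in the comment
    INTERRUPTED_AT_SHUTDOWN = auto()           # assumed code for the shutdown path


# Assumption: only these interruption reasons leave a build that can be resumed.
RESUMABLE_CODES = {
    AbortCode.INDEX_BUILD_ABORTED_FOR_ROLLBACK,
    AbortCode.INTERRUPTED_AT_SHUTDOWN,
}


def clean_up_after_failure(code, build_state, storage):
    """Runs on the builder thread; returns True if resume info was persisted."""
    if code in RESUMABLE_CODES:
        # Copy so later changes to build_state do not alter the persisted record.
        storage["resume_info"] = dict(build_state)
        return True
    # A plain abort discards any previously written resume information.
    storage.pop("resume_info", None)
    return False
```

With the decision keyed on the error code, the builder thread itself can write the state under both rollback and shutdown, which corresponds to the second option in the description.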
