[SERVER-50395] Investigate whether we can try to resume an index build twice during startup recovery Created: 20/Aug/20 Updated: 27/Oct/23 Resolved: 03/Sep/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Samyukta Lanka | Assignee: | Benety Goh |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Participants: | |||||||||||||||||
| Description |
|
We do an untimestamped write when writing the resume info to disk on a clean shutdown. If the startIndexBuild oplog entry has an optime after the stable timestamp at shutdown, it might be possible to read the resume info and try to resume the index build during startup, even though we will see the same startIndexBuild oplog entry again when replaying the oplog (and start the index build there). We should investigate whether this is possible. |
| Comments |
| Comment by Benety Goh [ 25/Aug/20 ] |
|
This may fall under the same scenario as restarting an unfinished FCV 4.4+ index build, described in the Startup Recovery section of the Architecture Guide. While reconciling the catalog and before replaying the oplog, unfinished index builds would be restarted in the background. If there is a startIndexBuild oplog entry to be replayed during the oplog application phase, the filtering logic within the IndexBuildsCoordinator would filter out any index builds that are already in progress (those started during the catalog reconciliation phase). The scenario described in this ticket is therefore possible, and it is not specific to resumable index builds. However, it should be harmless because of how we process an index build request when the build's requested indexes are already in progress. |
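
The two-phase behaviour described in this comment can be sketched with a toy model. This is only an illustration in Python, not the server's C++ code: `ToyIndexBuildsCoordinator`, `start_index_build`, and `simulate_startup` are hypothetical names, and the real IndexBuildsCoordinator filtering logic is more involved. The sketch assumes that a request for an index that is already being built is simply filtered out.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndexBuildsCoordinator:
    """Toy stand-in for illustration; not the real IndexBuildsCoordinator API."""

    # Maps index name -> startup phase that started the build.
    in_progress: dict = field(default_factory=dict)

    def start_index_build(self, index_names, started_by):
        """Start only the requested indexes that are not already in progress.

        Returns the index names actually started by this call.
        """
        to_start = [name for name in index_names if name not in self.in_progress]
        for name in to_start:
            self.in_progress[name] = started_by
        return to_start


def simulate_startup(unfinished_in_catalog, oplog_entries_to_replay):
    """Model startup recovery: catalog reconciliation first, then oplog replay."""
    coordinator = ToyIndexBuildsCoordinator()

    # Phase 1: catalog reconciliation restarts (or resumes) unfinished builds.
    restarted = coordinator.start_index_build(
        unfinished_in_catalog, "catalog_reconciliation"
    )

    # Phase 2: oplog replay may contain a startIndexBuild entry for the same
    # indexes when its optime is after the stable timestamp at shutdown.
    started_by_replay = []
    for requested_indexes in oplog_entries_to_replay:
        started_by_replay.extend(
            coordinator.start_index_build(requested_indexes, "oplog_replay")
        )

    return restarted, started_by_replay, coordinator.in_progress
```

In this model, calling `simulate_startup(["a_1"], [["a_1"]])` starts the build once during catalog reconciliation, and the replayed startIndexBuild entry starts nothing, which mirrors why the double attempt is expected to be harmless.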