[SERVER-41965] Change repair to only rebuild indexes on repaired collections Created: 27/Jun/19 Updated: 29/Oct/23 Resolved: 30/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.4 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Daniel Ernst |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | execution_intern_2019 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Execution Team 2019-12-16, Execution Team 2019-12-30, Execution Team 2020-01-13, Execution Team 2020-01-27, Execution Team 2020-02-10 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
At present, --repair unconditionally rebuilds all indexes. There should be a parameter that disables this behavior and rebuilds indexes only on collections that have been salvaged (and so may have modified data) or on collections whose indexes fail validation. Since index builds are very slow, this would help very large installations that only need to repair a single collection out of many. |
| Comments |
| Comment by Githook User [ 30/Jan/20 ] |
|
Author: Daniel Ernst (daniel.ernst@mongodb.com) Message: |
| Comment by Eric Milkie [ 28/Oct/19 ] |
|
Need to confirm how repair works today and whether we can make this more automatic, while still avoiding rebuilding indexes unnecessarily. |
| Comment by Dianna Hohensee (Inactive) [ 13/Aug/19 ] |
|
To contribute another factor in this discussion in case it becomes relevant: simultaneous index builds will need behavior such that we default to not rebuilding in-progress index builds, unless corruption is found, when we know we're a replica set member. |
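The rule in the comment above can be written as a small predicate. This is only an illustrative sketch, not server code: the function name and all three parameters are hypothetical, and the behavior for a node that is not a replica set member is not stated in this ticket, so it is left as a caller-supplied assumption.

```python
def should_rebuild_in_progress_build(
    is_replica_set_member: bool,
    corruption_found: bool,
    standalone_default: bool = True,  # assumption: not specified in the ticket
) -> bool:
    """Decide whether repair rebuilds an index build that was still in progress."""
    if is_replica_set_member:
        # Default to not rebuilding; rebuild only when corruption was found.
        return corruption_found
    # The ticket does not say what a non-member should do; defer to the caller.
    return standalone_default
```

Keeping the non-member case as a parameter avoids baking in behavior that the discussion does not settle.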
| Comment by Louis Williams [ 13/Aug/19 ] |
|
milkie, I agree. I think we should have --repair perform "validate" for each collection, and then only rebuild corrupt indexes or indexes on salvaged collections. I think it would still be valuable to provide a parameter to only run validate+index rebuilding on salvaged collections (i.e. don't unconditionally run validate). Since validate can be expensive (though not so much as index building), it may be helpful to have it skip verified collections when all you want to do is recreate a deleted .wt collection file. |
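The selection logic proposed in the comment above might look roughly like the following. This is a minimal sketch under stated assumptions, not the implementation that was committed: `Collection`, `plan_index_rebuilds`, and the `validate_all` flag are hypothetical names, and the per-index validation results are passed in as plain booleans rather than produced by a real "validate" run.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Collection:
    name: str
    salvaged: bool  # True if repair had to salvage the collection's data
    # Hypothetical stand-in for validate output: index name -> passed validation
    index_valid: Dict[str, bool] = field(default_factory=dict)


def plan_index_rebuilds(
    collections: List[Collection], validate_all: bool = True
) -> Dict[str, List[str]]:
    """Return {collection name: [index names to rebuild]}.

    validate_all=True  : validate every collection, rebuild only failing indexes.
    validate_all=False : skip validation for collections that were not salvaged.
    """
    plan: Dict[str, List[str]] = {}
    for coll in collections:
        if coll.salvaged:
            # Salvaged data may have changed, so every index is rebuilt.
            plan[coll.name] = sorted(coll.index_valid)
        elif validate_all:
            # Not salvaged: rebuild only the indexes that fail validation.
            failing = sorted(n for n, ok in coll.index_valid.items() if not ok)
            if failing:
                plan[coll.name] = failing
    return plan
```

With `validate_all=False`, collections that were not salvaged are never touched, which matches the case of recreating a single deleted .wt collection file without paying for validation everywhere else.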
| Comment by Eric Milkie [ 09/Aug/19 ] |
|
I guess I don't understand the use case for this feature then. Can you direct repair to only repair one collection, or does it always scan the data records for all collections? If the latter, I think we would have to change it so that instead of rebuilding all indexes, it would instead validate all indexes (and then only rebuild indexes that were salvaged or were for a collection that was salvaged). |
| Comment by Louis Williams [ 08/Aug/19 ] |
|
milkie, if we choose to not rebuild indexes by default, we would need to consider running the "validate" command instead, because --repair would no longer be able to guarantee index consistency. I think an option to disable this behavior by default would be the safest approach. |
| Comment by Eric Milkie [ 05/Aug/19 ] |
|
Is there a reason why we wouldn't make this new behavior the default? With the elimination of mmap I'm not familiar with reasons why we need the ability to rebuild all indexes at repair time. |