[SERVER-55558] Better Support for Rolling Index Builds Created: 26/Mar/21 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Index Maintenance, Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Rod Adams | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Storage Execution
|
| Participants: |
| Description |
|
More and more, when customers need to build or rebuild indexes on a production system, we advise them to do a rolling build: take each secondary down, start it as a standalone, do the work, bring it back into the replset, let replication lag catch up, then repeat on the next server.
My idea would be to make this process easier by building a mongod command that covers the span of doing the build on a single node. The rough process would be:
Motivation: the current process of restarting the node in standalone mode is extremely cumbersome, especially if one is deploying in Kubernetes or similar environments, where one-off configuration changes are an extra burden. Additionally, some environments lock down their configs for security purposes. The easier we make it for people to do things in the correct fashion, the more likely they are to do it correctly.
A full rolling build would still require some external agent to trigger each node in sequence and then wait for repl lag to catch up between nodes, but this would mean a DBA could carry out the process with three commands and some staring at graphs. |
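To illustrate the external agent described above, here is a minimal sketch of the sequencing loop. It is not a proposed implementation: `build_on_node` and `replication_lag_seconds` are hypothetical callables supplied by the operator (the first standing in for whatever per-node command this ticket would add, the second for however lag is measured), and the thresholds are arbitrary assumptions.

```python
import time
from typing import Callable, Iterable, List


def rolling_index_build(
    secondaries: Iterable[str],
    build_on_node: Callable[[str], None],             # hypothetical: runs the per-node build
    replication_lag_seconds: Callable[[str], float],  # hypothetical: reports current lag
    max_lag_seconds: float = 10.0,                    # assumed "caught up" threshold
    poll_interval_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Build on one node at a time, waiting for lag to drain before moving on."""
    completed: List[str] = []
    for node in secondaries:
        # Do the index work on this node only; the other nodes keep serving traffic.
        build_on_node(node)

        # Wait until the node has caught up before touching the next one.
        waited = 0.0
        while replication_lag_seconds(node) > max_lag_seconds:
            if waited >= timeout_seconds:
                raise TimeoutError(
                    f"{node} did not catch up within {timeout_seconds} seconds"
                )
            sleep(poll_interval_seconds)
            waited += poll_interval_seconds

        completed.append(node)
    return completed
```

The point of the sketch is only that the agent's job reduces to "trigger, wait for lag, move on"; everything specific to a node stays behind the two callables.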
| Comments |
| Comment by Rod Adams [ 29/Mar/21 ] |
|
louis.williams, I'd say there are two main reasons which still come into play, both of which revolve around performance: 1) Building indexes is expensive. This can radically change the performance profile that you serve with. Building in a rolling fashion allows you to isolate this expense, so the remaining nodes can service traffic with a performance profile much in line with normal operations. 2) There are times when you need to rebuild an index, which means dropping it, then building it again. During the build window, operations which need that index are shifted to a less performant execution plan. I could see some design options to overcome issue #2, but #1 would be much more difficult.
|
| Comment by Louis Williams [ 26/Mar/21 ] |
|
rod.adams, are you familiar with the 4.4 work that allowed indexes to be built simultaneously on all replica-set nodes? Some docs here? We were hoping this would replace many usages of rolling index builds and solve many of the operational problems with index builds, in general. Do you still require rolling index builds to avoid the impact on concurrent workloads? Or some other reason? |