[SERVER-43097] Improve log messaging when index builds get foregrounded because of startup recovery Created: 30/Aug/19 Updated: 29/Oct/23 Resolved: 01/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Diagnostics, Storage |
| Affects Version/s: | 4.0.12, 4.2.0 |
| Fix Version/s: | 4.0.20, 4.2.8, 4.4.0-rc9, 4.7.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Daniel Gottlieb (Inactive) | Assignee: | Gregory Noma |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Backport Requested: |
v4.4, v4.2, v4.0
|
||||||||||||
| Sprint: | Execution Team 2020-06-01, Execution Team 2020-06-15 | ||||||||||||
| Participants: | |||||||||||||
| Case: | (copied to CRM) | ||||||||||||
| Description |
|
When startup recovery (storage and replication) encounter an unfinished background index build, they rebuild the index in the foreground. Startup blocks until all indexes in this state complete building. The logging is clear that the indexes are being built as part of recovery, but it's not clear that the background: true flag on the index definition is explicitly being ignored. |
| Comments |
| Comment by Githook User [ 15/Jun/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}Message: |
| Comment by Githook User [ 05/Jun/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}Message: |
| Comment by Githook User [ 05/Jun/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}Message: (cherry picked from commit d29bb7d63843faad9f63974d89253586dff6cae5) |
| Comment by Githook User [ 01/Jun/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}Message: |
| Comment by Daniel Gottlieb (Inactive) [ 05/Sep/19 ] |
|
I want to provide some context on why unfinished background indexes will be rebuilt in the foreground when MongoD starts up beginning in 4.0. To start, this is the lifetime of a background index build in a replica set:
In a running, steady-state node, we have a way for choosing a valid time. At startup, this is trickier. To complicate things: The document being written to (as in, the unit of modification in the storage engine) actually captures the entire state of a collection. This write to signify an index completing can conflict with any other change to a collection's metadata such as any collations, schema validation rules, capped collections, other indexes, etc. Choosing an incorrect time can result in a rollback not returning a node to the correct state (e.g the rolling back node may disagree with other replica set members on what schema validation rules are present). The easiest way to make sure a proper time is chosen at startup is to foreground the index and use the timestamp of the oplog entry itself. We could probably find a solution that doesn't foreground the index build, but secondaries have long been limited in that additional background builds will block until the prior one completes (in many cases, leaving the server in the same position). These limitations are ones we're actively working on removing. |
| Comment by Eric Milkie [ 05/Sep/19 ] |
|
Note that this doesn't really affect 4.2.0 because there is no distinction between background and foreground index builds in 4.2; everything is one type of index build (the "background" flag is always ignored for every index build, despite its presence), and the system attempts all index builds without locking collections if it can. |
| Comment by Daniel Gottlieb (Inactive) [ 30/Aug/19 ] |
|
nuno.costa I'm a little hesitant to remove the background: true flag from the index definition being logged. What do you think about the following two changes:
|