[SERVER-61837] [v4.4] Ensure waiting for majority write concern after index creation in the destination shard of a migration on empty collections Created: 01/Dec/21 Updated: 29/Oct/23 Resolved: 19/Jan/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.13 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dianna Hohensee (Inactive) | Assignee: | Marcos José Grillo Ramirez |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Sharding EMEA 2021-12-13, Sharding EMEA 2021-12-27, Sharding EMEA 2022-01-10, Sharding EMEA 2022-01-24 |
| Participants: | |
| Linked BF Score: | 42 |
| Description |
|
As part of PM-812, range deletions were made resumable after a stepdown, and as part of that project a waitForWriteConcern was added after writing the range deleter document locally. Besides its initially intended purpose (waiting for the write of the range deleter document to be majority committed), this code had the additional consequence of waiting for the index creation to be replicated to a majority of nodes. However, this only works when running on FCV 4.4. If there was an FCV change to 4.2, the following scenario might occur:
1. A migration starts with a destination shard that previously didn't have any chunks for the collection
We need to ensure the index is created and successfully replicated on the destination shard regardless of the FCV. |
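The reasoning in the description can be illustrated with a small toy model. This is only a sketch and not the server's code: `ToyReplicaSet`, `prepare_destination`, and the integer "optimes" are hypothetical names invented for illustration. It assumes, following the description, that the wait tied to the range deleter document only happens under FCV 4.4, and it shows why waiting on a later write incidentally covers the earlier index creation, while an explicit wait after index creation covers it regardless of FCV.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplicaSet:
    """Hypothetical 3-node model: nodes[0] is the primary."""
    applied: list = field(default_factory=lambda: [0, 0, 0])  # last applied optime per node
    oplog: list = field(default_factory=list)

    def write_on_primary(self, entry: str) -> int:
        # Append to the oplog; only the primary has applied it so far.
        self.oplog.append(entry)
        self.applied[0] = len(self.oplog)
        return self.applied[0]

    def is_majority_committed(self, optime: int) -> bool:
        acked = sum(1 for a in self.applied if a >= optime)
        return acked > len(self.applied) // 2

    def wait_for_majority(self, optime: int) -> None:
        # Toy stand-in for blocking: advance secondaries until a majority
        # has applied everything up to `optime` (oplog order is preserved).
        for i in range(1, len(self.applied)):
            if self.is_majority_committed(optime):
                break
            self.applied[i] = max(self.applied[i], optime)


def new_replica_set() -> ToyReplicaSet:
    return ToyReplicaSet()


def prepare_destination(rs: ToyReplicaSet, fcv: str, wait_after_index: bool) -> bool:
    """Return True if the index creation ends up majority committed."""
    index_optime = rs.write_on_primary("create index")
    if wait_after_index:
        # Explicit wait right after index creation, independent of FCV.
        rs.wait_for_majority(index_optime)
    if fcv == "4.4":
        # Assumption taken from the description: this wait exists only on FCV 4.4.
        doc_optime = rs.write_on_primary("range deleter document")
        rs.wait_for_majority(doc_optime)
    return rs.is_majority_committed(index_optime)


if __name__ == "__main__":
    for fcv in ("4.4", "4.2"):
        for explicit in (False, True):
            ok = prepare_destination(new_replica_set(), fcv, explicit)
            print(f"fcv={fcv} explicit_wait={explicit} index_majority_committed={ok}")
```

In this model the only combination that leaves the index on the primary alone is FCV 4.2 without the explicit wait, which matches the gap the description asks to close.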
| Comments |
| Comment by Githook User [ 19/Jan/22 ] |
|
Author: {'name': 'Marcos José Grillo Ramirez', 'email': 'marcos.grillo@mongodb.com', 'username': 'm4nti5'} Message: |
| Comment by Marcos José Grillo Ramirez [ 07/Jan/22 ] |
|
Adapted title and description to the latest findings. |
| Comment by Marcos José Grillo Ramirez [ 04/Jan/22 ] |
|
After some investigation, there seems to be a problem with the wait for write concern in the migration destination manager: it should have ensured that the index creation was replicated to a majority of nodes, which would have prevented the BF from happening. More investigation is needed to determine why that wait could have been ignored. |