[SERVER-57345] Retryable write pre-fetch missing documents Created: 01/Jun/21 Updated: 29/Oct/23 Resolved: 08/Jun/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Wenbin Zhu | Assignee: | Wenbin Zhu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Repl 2021-06-14 | ||||||||
| Participants: | |||||||||
| Description |
|
During the investigation of
Here is the test script to repro this issue:
|
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] | |||||||||||||||||
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! | |||||||||||||||||
| Comment by Wenbin Zhu [ 08/Jun/21 ] | |||||||||||||||||
|
Fixed as part of | |||||||||||||||||
| Comment by Wenbin Zhu [ 08/Jun/21 ] | |||||||||||||||||
|
The fix will be included in | |||||||||||||||||
| Comment by Wenbin Zhu [ 08/Jun/21 ] | |||||||||||||||||
|
lingzhi.deng pavithra.vetriselvan After some investigation, I think the root cause is indeed the incorrect order of the filtering stage and the sorting stage in one particular edge case. This happens when the startFetchingTimestamp is equal to the timestamp of the last retryable write. The sorting stage is not using $sort, it's uses $reduce to perform something similar to a bucket sort. It uses $$this.depthForTenantMigration as the index in the "history" array returned from the previous $graphLookup stage. This causes the problem because in this case, after the filtering stage, the entry with the largest timestamp (which is equal to startFetchingTimestamp) is removed, so if we originally have 5 entries, after filtering, the depthForTenantMigration of the entries becomes [1, 4] (0 is filtered out), however the actual index of the array is [0, 3]. So this index mismatch results in sort stages uses the incorrect index for its bucket sort. As an example, say we originally have 5 entries: [E_0_4, E_1_3, E_2_2, E_3_1, E_4_0] where E_0_4 means the first entry with 0 as its timestamp, and 4 as its depthForTenantMigration. Let's take a look at how the sort stage works:
The sort stages sorts the entries in ascending order of depthForTenantMigration. Each iteration in $reduce places the current entry into its corresponding index in the sorted array. It does this by stitching three subarrays: the sorted left subarray (line 6), the current entry (line 7) and the sorted right subarray (line 8-14). Due to the filter stage, E_4_0 got filtered out. So we have [E_0_4, E_1_3, E_2_2, E_3_1] and the $size of "history"array becomes 4. In line 4 it uses $range to create the array [0, 1, 2, 3] as the initial value, then if E_0_4 is the first entry we process in $reduce (not necessary since $graphLookup does not guarantee order), since depthForTenantMigration is 4, the left subarray is [0, 1, 2, 3], and the right subarray is [0] (the $subtract and $add are supposed to result in a value n that is <= 0, so then slice can return a subarray with the last n values in the original array, but here we get a positive number 1, so the first element in the array is incorrectly returned). So the result array after this iteration is [0, 1, 2, 3, E_0_4, 0]. This is already incorrect, but not data loss yet. Next if we process E_1_3, the depthForTenantMigration is 3, so the left subarray is [0, 1, 2], the right subarray is an empty array (3 + 1 - 4 = 0). So the result array after this iteration is [0, 1, 2, E_1_3], and E_0_4 is lost. Since $graphLookup does not guarantee the order of the returned array, the lost entries can be random, which corresponding to the behavior in the ticket description. If we reverse the two stages, then the sorting stage can process the full array and this issue will not happen. | |||||||||||||||||
| Comment by Wenbin Zhu [ 03/Jun/21 ] | |||||||||||||||||
|
During the discussion of this bug, pavithra.vetriselvan suggested trying to reverse the order of the filtering stage and the next reduce/sorting stage, which seems to fix the bug, though we are not yet certain of the exact explanation of this, but intuitively the filtering stage should be after reduce/sorting stage. Will look further into this and confirm, and will also include the fix in | |||||||||||||||||
| Comment by Wenbin Zhu [ 01/Jun/21 ] | |||||||||||||||||
|
lingzhi.deng pavithra.vetriselvan We might need to investigate this one. |