[SERVER-39847] Migrating session info can trigger fassert when destination shard has transaction history truncated by oplog Created: 26/Feb/19 Updated: 29/Oct/23 Resolved: 28/Feb/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.12, 4.0.7, 4.1.9 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Randolph Tan |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.0, v3.6 |
| Sprint: | Sharding 2019-03-11 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Note: this description is based on the v4.0 code. The code organization in current master has changed a little, but the story is the same.
1. Shard0 does a retryable write on stmt: 0. |
| Comments |
| Comment by Githook User [ 07/Mar/19 ] |
|
Author: {'name': 'Randolph Tan', 'username': 'renctan', 'email': 'randolph@10gen.com'}
Message: (cherry picked from commit 1466c2b24eef41805dfac73e2fb43256d6d8fae7) |
| Comment by Githook User [ 28/Feb/19 ] |
|
Author: {'name': 'Randolph Tan', 'username': 'renctan', 'email': 'randolph@10gen.com'}
Message: (cherry picked from commit 1466c2b24eef41805dfac73e2fb43256d6d8fae7) |
| Comment by Githook User [ 27/Feb/19 ] |
|
Author: {'name': 'Randolph Tan', 'username': 'renctan', 'email': 'randolph@10gen.com'}Message: |
| Comment by Randolph Tan [ 26/Feb/19 ] |
|
I have confirmed that this is a problem in 4.0 and master. Code in v3.6 looks very similar to v4.0, so I imagine it has the problem as well. |
| Comment by Kaloian Manassiev [ 26/Feb/19 ] |
|
I prefer the second option for the following considerations:
This is a problem in both 3.6 and 4.0, right? |
| Comment by Randolph Tan [ 26/Feb/19 ] |
|
One way to fix this is to be "smart" and differentiate between "I know I have written this before but forgot about it" and "I don't know about this write, but I am unsure because my history is incomplete". The other is to simply ignore the incoming transaction history whenever the retryableWrite state has incompleteHistory set, which treats the incoming write history as "lost" as well. |
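
The second option (ignore incoming history when the local history is incomplete) can be illustrated with a small toy model. This is only a sketch in Python, not the server's C++ code: `SessionHistory`, `apply_migrated_stmt`, and the returned status strings are hypothetical names invented for illustration, and the real session migration logic tracks considerably more state.

```python
from dataclasses import dataclass, field


@dataclass
class SessionHistory:
    """Toy model (hypothetical) of one session's retryable-write state
    on the destination shard."""
    executed_stmt_ids: set = field(default_factory=set)
    # True once older oplog entries for this session have been truncated,
    # so the shard can no longer tell which statements it has executed.
    has_incomplete_history: bool = False


def apply_migrated_stmt(history: SessionHistory, stmt_id: int) -> str:
    """Apply one incoming migrated statement record to the local history.

    Instead of failing hard when the local history is incomplete, the
    incoming record is dropped, i.e. treated as "lost" as well.
    """
    if history.has_incomplete_history:
        return "ignored"
    if stmt_id in history.executed_stmt_ids:
        return "already_known"
    history.executed_stmt_ids.add(stmt_id)
    return "recorded"


# Destination shard whose oplog truncated this session's history.
truncated = SessionHistory(has_incomplete_history=True)
status_truncated = apply_migrated_stmt(truncated, 0)

# Destination shard with a complete (here: empty) history.
complete = SessionHistory()
status_first = apply_migrated_stmt(complete, 0)
status_repeat = apply_migrated_stmt(complete, 0)
```

In this model the truncated shard never records the migrated statement, so a later retry of that statement would still be answered from the "incomplete history" state rather than from the migrated record. That is the trade-off the second option accepts in exchange for not having to distinguish the two cases.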