[SERVER-39847] Migrating session info can trigger fassert when destination shard has transaction history truncated by oplog Created: 26/Feb/19  Updated: 29/Oct/23  Resolved: 28/Feb/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.6.12, 4.0.7, 4.1.9

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Randolph Tan
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File small_oplog.js    
Issue Links:
Duplicate
is duplicated by SERVER-40324 sharded cluster backtraces with error... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0, v3.6
Sprint: Sharding 2019-03-11

 Description   

Note: this description is based on the v4.0 code. The code organization in current master has changed slightly, but the sequence of events is the same.

1. Shard0 performs a retryable write with stmt: 0.
2. Shard0 migrates the chunk to Shard1.
3. The history of stmt: 0 is transferred to Shard1.
4. Time passes, and the oplog entry holding the history of stmt: 0 on Shard1 rolls over.
5. Shard0 migrates a chunk to Shard1 again. (Note: to trigger this bug, Shard0 must not have rolled over its own oplog yet!)
6. Shard1 checks whether the stmt has already been executed.
7. Shard1 finds the stmt in its cached map, but when it tries to retrieve the actual oplog entry, it discovers that the oplog has already been truncated and throws IncompleteTransactionHistory.
8. The exception is caught, but the migration code lets the write through as an attempt to "repair/recover" the lost history.
9. However, after Shard1 inserts the oplog entry and the commit callback runs, the callback finds that the stmt is already in the map and triggers the fassert.
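
The sequence above can be modelled with a small Python sketch. This is a toy simulation and not the server code: the class SessionHistory, its methods, and the FatalAssertion exception are hypothetical names chosen for illustration, and the real implementation in v4.0 is organized differently.

```python
class IncompleteTransactionHistory(Exception):
    """Raised when a cached statement's oplog entry has been truncated."""


class FatalAssertion(Exception):
    """Stand-in for the server's fassert, which would abort the process."""


class SessionHistory:
    """Toy model of the recipient shard's retryable-write state (hypothetical)."""

    def __init__(self):
        self.oplog = {}               # stmt_id -> oplog entry still in the oplog
        self.committed_stmts = set()  # cached map of statements already executed

    def truncate_oplog(self):
        # Step 4: the oplog rolls over, but the cached map is untouched.
        self.oplog.clear()

    def check_statement_executed(self, stmt_id):
        # Steps 6-7: the stmt is cached, but its oplog entry may be gone.
        if stmt_id not in self.committed_stmts:
            return None
        if stmt_id not in self.oplog:
            raise IncompleteTransactionHistory(stmt_id)
        return self.oplog[stmt_id]

    def on_commit(self, stmt_id):
        # Step 9: the commit callback asserts the stmt is new to the map.
        if stmt_id in self.committed_stmts:
            raise FatalAssertion(f"stmt {stmt_id} is already in the map")
        self.committed_stmts.add(stmt_id)

    def migrate_in(self, stmt_id, entry):
        try:
            if self.check_statement_executed(stmt_id) is not None:
                return  # history already present, nothing to do
        except IncompleteTransactionHistory:
            pass  # Step 8: let the write through to "repair" the history
        self.oplog[stmt_id] = entry
        self.on_commit(stmt_id)


def reproduce():
    shard1 = SessionHistory()
    shard1.migrate_in(0, "insert {x: 1}")      # steps 2-3
    shard1.truncate_oplog()                    # step 4
    try:
        shard1.migrate_in(0, "insert {x: 1}")  # steps 5-9
    except FatalAssertion:
        return "fassert"
    return "ok"
```

In this model, reproduce() ends in the FatalAssertion branch because the cached map outlives the truncated oplog entry.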



 Comments   
Comment by Githook User [ 07/Mar/19 ]

Author: Randolph Tan (renctan, randolph@10gen.com)

Message: SERVER-39847 Migrating session info can trigger fassert when destination shard has transaction history truncated by oplog

(cherry picked from commit 1466c2b24eef41805dfac73e2fb43256d6d8fae7)
(cherry picked from commit 8187116fe23a02f60bc2ed6dcdfa32d91b6e2c43)
Branch: v3.6
https://github.com/mongodb/mongo/commit/557dedf9b693ecfc6f8a170a125a59c3faefd0b8

Comment by Githook User [ 28/Feb/19 ]

Author: Randolph Tan (renctan, randolph@10gen.com)

Message: SERVER-39847 Migrating session info can trigger fassert when destination shard has transaction history truncated by oplog

(cherry picked from commit 1466c2b24eef41805dfac73e2fb43256d6d8fae7)
Branch: v4.0
https://github.com/mongodb/mongo/commit/8187116fe23a02f60bc2ed6dcdfa32d91b6e2c43

Comment by Githook User [ 27/Feb/19 ]

Author: Randolph Tan (renctan, randolph@10gen.com)

Message: SERVER-39847 Migrating session info can trigger fassert when destination shard has transaction history truncated by oplog
Branch: master
https://github.com/mongodb/mongo/commit/1466c2b24eef41805dfac73e2fb43256d6d8fae7

Comment by Randolph Tan [ 26/Feb/19 ]

I have confirmed that this is a problem in 4.0 and master. Code in v3.6 looks very similar to v4.0, so I imagine it has the problem as well.

Comment by Kaloian Manassiev [ 26/Feb/19 ]

I prefer the second option for the following reasons:

  1. It will result in much simpler code that we don't have to worry about backporting
  2. The recipient shard's history in most cases will still be "incomplete" even if we start differentiating these two cases and append the history from the donor
  3. Incomplete histories arise as a result of oplog truncation combined with session expiration. With a reasonably sized oplog, there should still be plenty of time for a caller that received an error to retry the operation

This is a problem in both 3.6 and 4.0, right?

Comment by Randolph Tan [ 26/Feb/19 ]

One way to fix this is to be "smart" and differentiate between "I know I have written this before but forgot about it" and "I don't know about this write, but I can't be sure because my history is incomplete". The other is to simply ignore the incoming transaction history whenever the retryableWrite state has incompleteHistory (basically, we treat the incoming write history as "lost" as well).
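
A minimal Python sketch of the second idea, ignoring incoming history once the local history is known to be incomplete, might look as follows. It is a toy model under stated assumptions, not the committed patch: SessionHistory, has_incomplete_history, and migrate_in are hypothetical names, and the real server state is more involved.

```python
class IncompleteTransactionHistory(Exception):
    """Raised when the write history can no longer be checked."""


class SessionHistory:
    """Toy model of the recipient shard's retryable-write state (hypothetical)."""

    def __init__(self):
        self.oplog = {}               # stmt_id -> oplog entry still in the oplog
        self.committed_stmts = set()  # cached map of statements already executed
        self.has_incomplete_history = False

    def truncate_oplog(self):
        # The oplog rolls over; the cached map is left as it was.
        self.oplog.clear()

    def check_statement_executed(self, stmt_id):
        if self.has_incomplete_history:
            raise IncompleteTransactionHistory(stmt_id)
        if stmt_id in self.committed_stmts and stmt_id not in self.oplog:
            # Remember that this session's history has a hole in it.
            self.has_incomplete_history = True
            raise IncompleteTransactionHistory(stmt_id)
        return self.oplog.get(stmt_id)

    def migrate_in(self, stmt_id, entry):
        """Return True if the incoming entry was applied, False if ignored."""
        try:
            already = self.check_statement_executed(stmt_id)
        except IncompleteTransactionHistory:
            # Treat the incoming history as "lost" too and skip the insert,
            # so the commit callback never sees a duplicate stmt.
            return False
        if already is not None:
            return False
        self.oplog[stmt_id] = entry
        self.committed_stmts.add(stmt_id)
        return True
```

With this approach a second migration of stmt 0 after truncation is dropped instead of reaching the commit callback, and a client retrying that statement still receives IncompleteTransactionHistory.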
