Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Replication
Operating System:
ALL
Backport Requested:

v8.1
Sprint:
ClusterScalability Mar31-Apr14, ClusterScalability Apr28-May09
Linked BF Score:
0
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Inside a retryable internal transaction, only the statements with stmtIds are retryable. In other words, for a retryable internal transaction used to executed writes for a particular statement in a retryable write command, only first write statement is retryable. The contract is that the owner of the transaction must not execute the subsequent write statements if the response for the first statement contains "retriedStmtIds" which indicates that the statement had already been executed.

Consider a sharded cluster with shard0 and shard1.

Let's say a client runs a write command in a session with retryWrites: true against mongos0. The retryable write translates to with two write statements w0 against shard0 and w1 against shard1 that need to be executed atomically (in a retryable internal transaction).
- w0 is retryable and has "stmtId": 0.
- w1 is not retryable and has "stmtId": -1.
While the retryable internal transaction is committing (using two-phase commit), the client loses the connection to mongos0. So it retries the write command against mongos1.
The retry causes mongos1 to start a retryable internal transaction for the retry. Let's say w1 arrives on shard0 right after shard0 has committed the transaction but shard1 has not. The session was also successfully checked out on shard0:prim since it is no active operation. The transaction had already been committed on shard0 so the retry didn't get a RetryableTransactionInProgress error and get blocked here. Due to the history for the transaction started in step (1), the checkStatementExecuted() check for w0 indicated the the statement had already been executed so it responds with "retriedStmtIds" : [ 0 ]. This causes mongos1 to conclude that this was a retry so it must not execute w1 and instead can return early immediately.
If the client performs a read immediately after this without providing afterClusterTime >= commitTimestamp (i.e. before shard1 commits the transaction), it would not see w2. This breaks reads-your-own-writes.

In short, for a retryable internal transaction that involves writing to multiple shards and has short-circuiting, read-your-own-writes is only guaranteed in sessions with "causalConsistency" enabled.

is related to

SERVER-104545 Audit sharding uses of internal transaction API to see if we perform short circuting

Backlog

related to

SERVER-99782 Read-your-own-writes guarantee for a cross-shard transaction if the client recovers the decision from the recovery shard that is not also a coordinator shard

Backlog

Assignee:: Unassigned
Reporter:: Cheahuychou Mao
Participants:: Cheahuychou Mao
Votes:: 0 Vote for this issue
Watchers:: 7 Start watching this issue

Created:: Jan 23 2025 10:10:33 PM UTC
Updated:: Apr 30 2025 08:32:55 PM UTC
Confidence Status Last Update:: 17/Mar/25 5:38 PM

Details

Description

Attachments

Issue Links

Activity

People

Dates