[SERVER-58647] What is the timestamp used in snapshot distributed transaction? All durable or atClusterTime? Created: 17/Jul/21  Updated: 27/Oct/23  Resolved: 28/Jul/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.6
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Ouyang Tsuna Assignee: Dmitry Agranat
Resolution: Community Answered Votes: 0
Labels: Transactions, snapshot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

The transaction with snapshot read concern provides ACID properties for users.

As far as I know, it is based on timestamped transactions in WiredTiger.

What puzzles me is: which timestamp is used in a distributed snapshot transaction?

When I referred to the documentation, it says:

 

From Shard Internals in GitHub

Snapshot read concern will choose a snapshot from which the transaction will read. If it is specified with an `atClusterTime` argument, then that will be used as the transaction's read timestamp. If `atClusterTime` is not specified, then the read timestamp of the transaction will be the `all_durable` timestamp when the transaction is started, which ensures a snapshot with no oplog holes.

If MongoDB is deployed as a replica set, the snapshot transaction reads data up to the all_durable timestamp, which ensures a snapshot with no oplog holes.
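
To make the quoted rule concrete, here is a minimal toy sketch of how a read timestamp could be chosen. It is not the server's code: the function names, the integer timestamps, and the modelling of all_durable as "the largest committed timestamp below every write that is still in flight" are assumptions made purely for illustration.

```python
from typing import Optional, Set


def all_durable(committed: Set[int], in_flight: Set[int]) -> int:
    """Toy model: largest committed timestamp with no in-flight write at or below it.

    Returns 0 when nothing qualifies. Names and semantics are illustrative only.
    """
    limit = min(in_flight) if in_flight else float("inf")
    visible = [ts for ts in committed if ts < limit]
    return max(visible, default=0)


def choose_read_timestamp(
    at_cluster_time: Optional[int],
    committed: Set[int],
    in_flight: Set[int],
) -> int:
    """Use atClusterTime when it is given, otherwise fall back to all_durable."""
    if at_cluster_time is not None:
        return at_cluster_time
    return all_durable(committed, in_flight)
```

In this model, with committed writes at timestamps 1, 2 and 4 and a write at timestamp 3 still in flight, the default read timestamp is 2, because reading at 4 would expose a hole at 3.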

However, in a sharded deployment, each shard has its own all_durable timestamp, while the meaning of "snapshot" requires every shard to read at the same timestamp. I think that timestamp is the atClusterTime chosen by mongos, taken from the clusterTime known to mongos. Is that right?

If so, won't this lead to holes in the oplog?

 

 



 Comments   
Comment by Dmitry Agranat [ 28/Jul/21 ]

tsunaouyang@gmail.com, as Daniel has answered your question, I will go ahead and close this ticket. If you have further questions, we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums.

Regards,
Dima

Comment by Daniel Gottlieb (Inactive) [ 19/Jul/21 ]

I expect reads using atClusterTime to hit this clause which ensures there are no holes (that function actually guarantees something stronger and there is perhaps room for optimization).

For primaries that committed a transaction (primaries being where we typically concern ourselves with holes), there's a chain of causality that I believe makes waiting unnecessary. Specifically, I believe that committing a transaction always uses the majority write concern. A primary returning a majority committed timestamp T implies it has no holes <= T.
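
As a rough illustration of the "no holes <= T" condition discussed above, the sketch below checks, per shard, whether a read at a shared cluster time T would be hole-free. It is a toy model under stated assumptions, not the server's implementation: the function names, the shard names, and the integer timestamps are all hypothetical, and a real shard would wait for the condition to hold rather than report a boolean.

```python
from typing import Dict, Set


def hole_free_at(read_ts: int, in_flight: Set[int]) -> bool:
    """True when no write with a timestamp <= read_ts is still uncommitted."""
    return all(ts > read_ts for ts in in_flight)


def shards_ready(
    read_ts: int, in_flight_by_shard: Dict[str, Set[int]]
) -> Dict[str, bool]:
    """Evaluate the same read timestamp against each shard's in-flight writes."""
    return {
        shard: hole_free_at(read_ts, pending)
        for shard, pending in in_flight_by_shard.items()
    }
```

For example, with T = 5, a shard whose only pending write is at timestamp 7 can serve the read immediately, while a shard with a pending write at timestamp 4 would first have to let that write resolve.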
