[SERVER-38204] Remove difference between readConcern:"majority" and readConcern:"snapshot" Created: 19/Nov/18 Updated: 27/Oct/23 Resolved: 30/Nov/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backport Requested: | v4.0 |
| Sprint: | Repl 2018-12-03, Repl 2018-12-17 | ||||
| Participants: | |||||
| Description |
|
Right now there is a subtle difference between readConcern: "majority" and readConcern: "snapshot" when using transactions. A transaction run at readConcern: "majority" may be given a snapshot that reflects oplog holes; that is, it may see writes that are ordered after writes it does not see. This behavior seems less than useful and it complicates the code, so we should remove it. Doing so would involve removing the SpeculativeTransactionOpTime enum in transaction_participant.cpp and all behavior associated with it (we should always use the AllCommitted behavior). We can also remove RecoveryUnit::ReadSource::kLastAppliedSnapshot from recovery_unit, along with all associated behavior. |
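
To make the "oplog hole" case concrete, here is a minimal Python sketch. It is not server code: timestamps are modeled as plain integers, and the helpers `all_committed`, `last_applied`, and `visible_writes` are hypothetical names chosen for illustration. It assumes a read at a given timestamp sees every committed write at or below that timestamp.

```python
def last_applied(committed):
    """Toy model: the newest committed timestamp, regardless of older in-flight writes."""
    return max(committed, default=0)


def all_committed(committed, in_flight):
    """Toy model: the largest timestamp T such that no in-flight write has a timestamp <= T."""
    if not in_flight:
        return last_applied(committed)
    return min(in_flight) - 1


def visible_writes(committed, read_ts):
    """Committed writes a reader sees when reading at read_ts."""
    return sorted(ts for ts in committed if ts <= read_ts)


# Writes 1, 2 and 4 have committed; write 3 was assigned its timestamp but is still in flight.
committed = {1, 2, 4}
in_flight = {3}

# Reading at lastApplied sees write 4 but not write 3: a hole in the snapshot.
hole_view = visible_writes(committed, last_applied(committed))

# Reading at the all-committed point stops before the hole.
safe_view = visible_writes(committed, all_committed(committed, in_flight))

print(hole_view)  # [1, 2, 4]
print(safe_view)  # [1, 2]
```

In this model the proposal amounts to always choosing the second read timestamp for transactions, at the cost of not seeing write 4 until write 3 resolves.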
| Comments |
| Comment by Matthew Russotto [ 30/Nov/18 ] |
|
That seems right, unfortunately. We can't guarantee when "w:1" writes will get ahead of the all-committed time. |
| Comment by Tess Avitabile (Inactive) [ 30/Nov/18 ] |
|
matthew.russotto, I believe that majority and local both read from the lastApplied rather than the allCommitted. It is important that transactions with local readConcern read from the lastApplied so that back-to-back w:1 transactions with local readConcern can read the previous transaction. Since we require that transactions with local readConcern can read from the lastApplied, I don't think we can gain any simplification by making majority behave the same as snapshot. Let me know if I misunderstood. |
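
The back-to-back case can be sketched with a toy model. This is an illustration only, not the server's implementation: `ToyOplog` and its methods are hypothetical, timestamps are integers, and the mapping of read concerns to read sources simply encodes the belief stated in this comment (local and majority read from lastApplied, snapshot from allCommitted).

```python
class ToyOplog:
    """Toy model: each write reserves an integer timestamp, then commits later."""

    def __init__(self):
        self.committed = {}     # timestamp -> value
        self.in_flight = set()  # timestamps reserved but not yet committed
        self._next_ts = 1

    def begin_write(self):
        ts = self._next_ts
        self._next_ts += 1
        self.in_flight.add(ts)
        return ts

    def commit_write(self, ts, value):
        self.in_flight.remove(ts)
        self.committed[ts] = value

    def last_applied(self):
        return max(self.committed, default=0)

    def all_committed(self):
        # Largest timestamp with no in-flight write at or below it.
        if not self.in_flight:
            return self.last_applied()
        return min(self.in_flight) - 1

    def read_timestamp(self, read_concern):
        # Assumed mapping, taken from the comment above rather than from server code.
        if read_concern in ("local", "majority"):
            return self.last_applied()
        if read_concern == "snapshot":
            return self.all_committed()
        raise ValueError(f"unknown read concern: {read_concern}")

    def read(self, read_concern):
        ts = self.read_timestamp(read_concern)
        return [v for t, v in sorted(self.committed.items()) if t <= ts]


oplog = ToyOplog()
slow = oplog.begin_write()        # unrelated write, still in flight
txn1 = oplog.begin_write()        # first w:1 transaction
oplog.commit_write(txn1, "txn1")

# A second transaction started immediately afterwards:
local_view = oplog.read("local")        # sees txn1
snapshot_view = oplog.read("snapshot")  # held back behind the in-flight write

oplog.commit_write(slow, "slow")
later_snapshot_view = oplog.read("snapshot")

print(local_view, snapshot_view, later_snapshot_view)
```

Under these assumptions, only the lastApplied read lets the second transaction see the first one right away; the allCommitted read catches up once the unrelated write commits.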
| Comment by Eric Milkie [ 19/Nov/18 ] |
|
Indeed I was mistaken, thanks Andy. |
| Comment by Andy Schwerin [ 19/Nov/18 ] |
|
Transactions execute speculatively, so the snapshot used is not necessarily majority committed when the transaction begins, milkie. |
| Comment by Eric Milkie [ 19/Nov/18 ] |
|
I'm not sure that's true. I can't think of a situation where the timestamp used for majority read concern would be ahead of the oplog visibility timestamp. |
| Comment by Andy Schwerin [ 19/Nov/18 ] |
|
We did this intentionally to speed up back-to-back transactions, I think. I'm not saying we should keep it, but we did it because it should allow new transactions to start slightly sooner at "majority" read concern, since they don't have to wait for certain unrelated transactions to commit. |