[SERVER-67538] Multi-doc transactions should fail if on an old, incompatible snapshot Created: 27/Jun/22  Updated: 29/Oct/23  Resolved: 31/Aug/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 6.0.0, 5.0.10, 6.1.0-rc1
Fix Version/s: 5.0.14, 6.0.2, 6.1.0-rc1, 6.2.0-rc0

Type: Bug Priority: Critical - P2
Reporter: Josef Ahmad Assignee: Josef Ahmad
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
is related to SERVER-68573 Remove special handling in index code... Closed
is related to SERVER-47866 Secondary readers do not need to reac... Closed
is related to SERVER-68455 Clean up and publish GDB helpers for ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.1
Steps To Reproduce:

Run this noPassthrough jstest.

/**
 * @tags: [
 *   requires_replication,
 * ]
 */
(function() {
'use strict';
 
const replTest = new ReplSetTest({nodes: 2});
replTest.startSet();
replTest.initiate();
 
const primary = replTest.getPrimary();
const db = primary.getDB('test');
 
assert.commandWorked(db['c'].insertOne({_id: 5, num: 0}));
 
const s0 = db.getMongo().startSession();
s0.startTransaction();
assert.commandWorked(s0.getDatabase('test')['c'].deleteOne({_id: 5}));
s0.commitTransaction();
 
const clusterTime = s0.getClusterTime().clusterTime;
 
assert.commandWorked(db['c'].createIndex({'value': 1}));
 
// Start a transaction whose snapshot predates the completion of the index build, and which reserves
// an oplog entry (i.e. writes) after the index build commits.
try {
    const s1 = db.getMongo().startSession();
    s1.startTransaction({readConcern: {level: "snapshot", atClusterTime: clusterTime}});
    s1.getDatabase('test').c.insertOne({_id: 5, num: 1});
 
    // Transaction should have failed.
    assert(0);
} catch (e) {
    assert(e.hasOwnProperty("errorLabels"), tojson(e));
    assert.contains("TransientTransactionError", e.errorLabels, tojson(e));
    assert.eq(e["code"], ErrorCodes.SnapshotUnavailable, tojson(e));
}
 
replTest.stopSet();
})();

Sprint: Execution Team 2022-07-25, Execution Team 2022-08-08, Execution Team 2022-08-22, Execution Team 2022-09-05
Participants:
Linked BF Score: 100

 Description   

In the following scenario:

  1. Multi-doc transaction starts, reading from a snapshot @ timestamp (9,1).
  2. (10,1): index build on collection A completes.
  3. (11,1): transaction writes to collection A, which involves updating the index that was built at (10,1).

 
At step 3 the transaction should fail with the SnapshotUnavailable error code, the TransientTransactionError label, and the message "Unable to read from a snapshot due to pending collection catalog changes; please retry the operation", but it does not.
This bug opens a race condition that can result in data inconsistency, because the index key won't get updated.

Adding _indexCatalogEntry->isReady(opCtx) in _indexKeysOrWriteToSideTable resolves the issue: step 3 fails with SnapshotUnavailable, instead of silently progressing with an invalid, stale snapshot.
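
For illustration, the intended behaviour in the scenario above can be sketched as a toy model in plain JavaScript. This is not the server's C++ code: `compareTs`, `assertIndexVisible`, `writeIndexKey` and the `minVisibleTs` field are hypothetical names, and the sketch assumes the check amounts to comparing the transaction's read timestamp against the timestamp at which the index build completed.

```javascript
// Toy model only. Timestamps are [seconds, increment] pairs, e.g. [9, 1] for (9,1).
function compareTs(a, b) {
    return a[0] !== b[0] ? a[0] - b[0] : a[1] - b[1];
}

// Hypothetical error type mirroring the code name and label from this ticket.
class SnapshotUnavailable extends Error {
    constructor(message) {
        super(message);
        this.name = "SnapshotUnavailable";
        this.errorLabels = ["TransientTransactionError"];
    }
}

// Assumption: an index is usable by a transaction only if the transaction's
// snapshot is not older than the timestamp at which the index became ready.
function assertIndexVisible(index, readTs) {
    if (compareTs(readTs, index.minVisibleTs) < 0) {
        throw new SnapshotUnavailable(
            "Unable to read from a snapshot due to pending collection catalog changes; " +
            "please retry the operation");
    }
}

// Hypothetical write path: check visibility first, then update the index.
function writeIndexKey(index, readTs, key) {
    assertIndexVisible(index, readTs);
    index.keys.push(key);
}

// Step 2 of the scenario: the index build completes at (10,1).
const index = {name: "value_1", minVisibleTs: [10, 1], keys: []};

// Steps 1 and 3: a transaction reading at (9,1) tries to update the index.
let outcome;
try {
    writeIndexKey(index, [9, 1], {value: 1});
    outcome = "written";
} catch (e) {
    outcome = e.name;
}
console.log(outcome);  // SnapshotUnavailable
```

In this model a transaction reading at (10,1) or later writes the key normally; only a snapshot older than the index's ready timestamp is rejected.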



 Comments   
Comment by Githook User [ 28/Sep/22 ]

Author:

{'name': 'Josef Ahmad', 'email': 'josef.ahmad@mongodb.com', 'username': 'josefahmad'}

Message: SERVER-67538 Make multi-doc txns return WCE on index catalog changes

Background: SERVER-47866 stopped bumping the collection's minimum
visibility timestamp on catalog changes related to an index; only the
index's minimum visibility snapshot continues to be updated. One side
effect of this change is that a multi-document transaction can read
at a snapshot where the index is not yet ready and commit at a
timestamp when the index is ready, which is not intended behaviour and
can open the opportunity for a race to happen.

This patch introduces a check for the indices' minimum visible timestamp.
Attempting to write to an index entry while reading at an incompatible
timestamp returns a write conflict exception. Locking rules guarantee that
we see a consistent in-memory view of the indices' minimum visible
snapshot.

(cherry picked from commit a4bd3ce3607d2c3020d7efa3501240ae4b1a1b03)
(cherry picked from commit 4e80712214658e3c70cecef42680618068448e7f)
Branch: v5.0
https://github.com/mongodb/mongo/commit/cbc2fa05af5f8ffe0f2c8f3e37bdd9ac34b3ac85

Comment by Githook User [ 14/Sep/22 ]

Author:

{'name': 'Josef Ahmad', 'email': 'josef.ahmad@mongodb.com', 'username': 'josefahmad'}

Message: SERVER-67538 Make multi-doc txns return WCE on index catalog changes

Background: SERVER-47866 stopped bumping the collection's minimum
visibility timestamp on catalog changes related to an index; only the
index's minimum visibility snapshot continues to be updated. One side
effect of this change is that a multi-document transaction can read
at a snapshot where the index is not yet ready and commit at a
timestamp when the index is ready, which is not intended behaviour and
can open the opportunity for a race to happen.

This patch introduces a check for the indices' minimum visible timestamp.
Attempting to write to an index entry while reading at an incompatible
timestamp returns a write conflict exception. Locking rules guarantee that
we see a consistent in-memory view of the indices' minimum visible
snapshot.

(cherry picked from commit a4bd3ce3607d2c3020d7efa3501240ae4b1a1b03)
Branch: v6.0
https://github.com/mongodb/mongo/commit/899fb0a1adba6b61094cda293a5ed4fa985c5f64

Comment by Githook User [ 07/Sep/22 ]

Author:

{'name': 'Josef Ahmad', 'email': 'josef.ahmad@mongodb.com', 'username': 'josefahmad'}

Message: SERVER-67538 Make multi-doc txns return WCE on index catalog changes

Background: SERVER-47866 stopped bumping the collection's minimum
visibility timestamp on catalog changes related to an index; only the
index's minimum visibility snapshot continues to be updated. One side
effect of this change is that a multi-document transaction can read
at a snapshot where the index is not yet ready and commit at a
timestamp when the index is ready, which is not intended behaviour and
can open the opportunity for a race to happen.

This patch introduces a check for the indices' minimum visible timestamp.
Attempting to write to an index entry while reading at an incompatible
timestamp returns a write conflict exception. Locking rules guarantee that
we see a consistent in-memory view of the indices' minimum visible
snapshot.

(cherry picked from commit a4bd3ce3607d2c3020d7efa3501240ae4b1a1b03)
Branch: v6.1
https://github.com/mongodb/mongo/commit/fd61bf1c3764ee079d3dcde44a2eb24352c22ae5

Comment by Githook User [ 31/Aug/22 ]

Author:

{'name': 'Josef Ahmad', 'email': 'josef.ahmad@mongodb.com', 'username': 'josefahmad'}

Message: SERVER-67538 Make multi-doc txns return WCE on index catalog changes

Background: SERVER-47866 stopped bumping the collection's minimum
visibility timestamp on catalog changes related to an index; only the
index's minimum visibility snapshot continues to be updated. One side
effect of this change is that a multi-document transaction can read
at a snapshot where the index is not yet ready and commit at a
timestamp when the index is ready, which is not intended behaviour and
can open the opportunity for a race to happen.

This patch introduces a check for the indices' minimum visible timestamp.
Attempting to write to an index entry while reading at an incompatible
timestamp returns a write conflict exception. Locking rules guarantee that
we see a consistent in-memory view of the indices' minimum visible
snapshot.
Branch: master
https://github.com/mongodb/mongo/commit/a4bd3ce3607d2c3020d7efa3501240ae4b1a1b03
