[SERVER-39753] getMore commands on aggregate cursors with "majority" read concern may return uncommitted data on secondaries Created: 22/Feb/19  Updated: 22/Mar/19  Resolved: 22/Mar/19

Status: Closed
Project: Core Server
Component/s: Querying, Replication
Affects Version/s: 4.0.6, 4.1.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: William Schultz (Inactive) Assignee: Bernard Gorman
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-39356 Change stream updateLookup queries wi... Closed
Operating System: ALL
Steps To Reproduce:

/**
 * Test if 'getMore' on an aggregate cursor returns uncommitted data.
 */
(function() {
    "use strict";
 
    load("jstests/libs/write_concern_util.js");  // for [stop|restart]ServerReplication.
 
    const name = "test";
    const replTest = new ReplSetTest({name: name, nodes: 4});
    replTest.startSet();
    replTest.initiate();
 
    const dbName = name;
    const collName = "test";
 
    const primary = replTest.getPrimary();
    let secondaries = replTest.getSecondaries();
    assert.eq(secondaries.length, 3);
    const primaryDb = primary.getDB(dbName);
    const secondaryDb = secondaries[0].getDB(dbName);
    const primaryColl = primaryDb[collName];
 
    // Do some initial writes on primary and commit them.
    assert.commandWorked(
        primaryColl.insert([{_id: 1}, {_id: 2}, {_id: 3}], {writeConcern: {w: 4}}));
    replTest.awaitLastOpCommitted();
 
    // Pause replication on two secondaries.
    jsTestLog("Stopping replication on 2 secondaries.");
    stopServerReplication(secondaries[1]);
    stopServerReplication(secondaries[2]);
 
    jsTestLog("Doing some writes on primary.");
    // Do a write and replicate to 1 secondary. This write is not majority committed yet.
    assert.commandWorked(
        primaryColl.insert([{_id: 4}, {_id: 5}, {_id: 6}], {writeConcern: {w: 2}}));
 
    jsTestLog("Doing reads on secondary.");
 
    // A majority 'find' query. This should only return doc ids {1,2,3}.
    let findDocs = secondaryDb[collName].find().readConcern("majority").toArray();
    assert.sameMembers(findDocs, [{_id: 1}, {_id: 2}, {_id: 3}]);
 
    // Do a majority 'getMore' on aggregate cursor.This should only return doc ids {1,2,3}, since
    // docs {4,5,6} are not majority committed yet.
    let res = secondaryDb.runCommand({
        aggregate: collName,
        pipeline: [],
        readConcern: {level: "majority"},
        cursor: {batchSize: 0}
    });
    let cursorId = res.cursor.id;
    res = secondaryDb.runCommand({getMore: cursorId, collection: collName});
    assert.sameMembers(res.cursor.nextBatch, [{_id: 1}, {_id: 2}, {_id: 3}]);
 
    // Restart replication to let test complete.
    restartServerReplication(secondaries[1]);
    restartServerReplication(secondaries[2]);
 
    replTest.stopSet();
})();

Sprint: Query 2019-04-08
Participants:

 Description   

When running a getMore command against a cursor that was originally created by an aggregate command with read concern "majority", we may return data that is not majority committed if the command is run against a replica set secondary. The read concern for a getMore command is not stored directly on its OperationContext. It instead inherits its read concern from the cursor object, and we will set the timestamp read source to kMajorityCommitted for getMore commands here. This happens before we generate the getMore batch, however. For aggregate commands, locks will only be taken once we enter the batch generation phase, and, for reads, these locks will be taken inside the AutoGetCollectionForRead constructor. The AutoGetCollectionForRead constructor determines if a command should read at the lastApplied timestamp based on the read concern on the OperationContext. For a getMore command, there shouldn't be a read concern on the OperationContext since it is inherited from the cursor. This means that we will decide to read at the lastApplied timestamp on secondaries, overriding the majority committed read source we previously set.

This has implications for getMore commands run on standard aggregate query cursors and also for change streams, since they create cursors via the aggregate command i.e. it is possible for a change stream run against a secondary to return data that are not majority committed. I believe this issue goes back to 4.0, since that was when the logic around reading from the lastApplied timestamp on secondaries was added. The attached repro shows the same issue when run against a 4.0 server.



 Comments   
Comment by Bernard Gorman [ 22/Mar/19 ]

Confirmed, this issue has gone away as of SERVER-39356 commit c83b8d8aab53f7545851a76425b2f2cd7c598cbd:

[executor] 2019-03-22T16:42:48.611+0100 Summary: All 1 test(s) passed in 69.20 seconds.
[resmoke] 2019-03-22T16:42:48.611+0100 ================================================================================
[resmoke] 2019-03-22T16:42:48.612+0100 Summary of no_passthrough suite: All 1 test(s) passed in 69.20 seconds.
1 test(s) ran in 69.20 seconds (1 succeeded, 0 were skipped, 0 failed, 0 errored)
    js_tests: All 1 test(s) passed in 69.20 seconds.
[resmoke] 2019-03-22T16:42:48.612+0100 Exiting with code: 0

The attached repro fails as expected when run against the commit immediately preceding the above:

[js_test:SERVER-39753] 2019-03-22T16:39:47.478+0100 2019-03-22T16:39:47.478+0000 E  QUERY    [js] uncaught exception: Error: [
[js_test:SERVER-39753] 2019-03-22T16:39:47.478+0100 	{
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 		"_id" : 1
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 	},
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 	{
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 		"_id" : 2
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 	},
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 	{
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 		"_id" : 3
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 	},
[js_test:SERVER-39753] 2019-03-22T16:39:47.479+0100 	{
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 		"_id" : 5
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 	},
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 	{
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 		"_id" : 4
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 	},
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 	{
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 		"_id" : 6
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 	}
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 ] != [ { "_id" : 1 }, { "_id" : 2 }, { "_id" : 3 } ] :
[js_test:SERVER-39753] 2019-03-22T16:39:47.480+0100 doassert@src/mongo/shell/assert.js:20:14
[js_test:SERVER-39753] 2019-03-22T16:39:47.481+0100 failAssertion@src/mongo/shell/assert.js:234:13
[js_test:SERVER-39753] 2019-03-22T16:39:47.481+0100 assert.sameMembers@src/mongo/shell/assert.js:238:13
[js_test:SERVER-39753] 2019-03-22T16:39:47.481+0100 @jstests/noPassthrough/SERVER-39753.js:55:5
[js_test:SERVER-39753] 2019-03-22T16:39:47.481+0100 @jstests/noPassthrough/SERVER-39753.js:4:2

Marking this as a duplicate of SERVER-39356.

Comment by Charlie Swanson [ 20/Mar/19 ]

bernard.gorman can you take a look and see if this is still a problem?

Comment by William Schultz (Inactive) [ 20/Mar/19 ]

Note that this may automatically be fixed by the changes from SERVER-39356. See this new logic.

Generated at Thu Feb 08 04:53:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.