[SERVER-66777] FLE 2 internal transactions are being interrupted by logical session reaper Created: 26/May/22  Updated: 29/Oct/23  Resolved: 03/Jun/22

Status: Closed
Project: Core Server
Component/s: Field Level Encryption
Affects Version/s: None
Fix Version/s: 6.0.0-rc9, 6.1.0-rc0

Type: Bug Priority: Critical - P2
Reporter: Mark Benvenuto Assignee: Cheahuychou Mao
Resolution: Fixed Votes: 0
Labels: equality-ga, ga-required
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Initiative
Related
related to SERVER-78187 SERVER-78187 Make killing a child ses... Closed
is related to SERVER-66852 Remove the Session objects for expire... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.0
Steps To Reproduce:

1. Start single node replica set, i.e. ./mongo --replSet test
2. ./mongo --eval "rs.initiate();"
3. Run script. It takes ~40 minutes to hit the error.

db.getSiblingDB("test").dropDatabase();
 
db.keystore.drop();
 
const localKMS = {
    key: BinData(
        0,
        "/tu9jUCBqZdwCelwE/EAm/4WqdxrSMi04B8e9uAV+m30rI1J2nhKZZtQjdvsSCwuI4erR6IEcEK+5eGUAODv43NDNIR9QheT2edWFewUfHKsl9cnzTc86meIzOmYl6dr")
};
 
 
const clientSideFLEOptions = {
    kmsProviders: {
        local: localKMS,
    },
    keyVaultNamespace: "test.keystore",
    schemaMap: {},
};
 
const shell = Mongo("localhost:27017", clientSideFLEOptions);
const edb = shell.getDB("test");
 
const keyVault = shell.getKeyVault();
 
keyVault.createKey(
    "local", "arn:aws:kms:us-east-1:579766882180:key/89fcc2c4-08b0-4bd9-9f25-e3687b580d0");
 
assert.commandWorked(db.createEncryptedCollection("basic", {
    encryptedFields: {
        "fields": [
            {
                "path": "ssn",
                "keyId": keyVault.createKey(
                    "local", "arn:aws:kms:us-east-1:579766882180:key/89fcc2c4-08b0-4bd9-9f25-e3687b580d0"),
                "bsonType": "string",
                "queries": { "queryType": "equality" }
            },
            {
                "path": "mobile",
                "keyId": keyVault.createKey(
                    "local", "arn:aws:kms:us-east-1:579766882180:key/89fcc2c4-08b0-4bd9-9f25-e3687b580d0"),
                "bsonType": "string",
                "queries": { "queryType": "equality" }
            },
            {
                "path": "gender",
                "keyId": keyVault.createKey(
                    "local", "arn:aws:kms:us-east-1:579766882180:key/89fcc2c4-08b0-4bd9-9f25-e3687b580d0"),
                "bsonType": "string",
                "queries": { "queryType": "equality" }
            }
        ]
    }
}));
 
print("Starting Load: " + new Date());
 
for (let i = 0; i < 10000000; i++) {
    if (i % 100 == 0) { print(i + " ") }
 
    let mobile = Math.random().toString().slice(2, 11);
    mobile = mobile.slice(0, 3) + "-" + mobile.slice(3, 6) + "-" + mobile.slice(6);
 
    const ssn = Math.random().toString().slice(2, 11);
    const g = Math.floor(Math.random() * 10);
    const gender = g < 2 ? "they" : g < 6 ? "she" : "he";
 
    try {
        edb.basic.insertOne({
            name: "John Doe",
            mobile: mobile,
            ssn: ssn,
            address: "157 Electric Ave.",
            gender: gender,
        });
    } catch (e) {
        let d = new Date();
        print("\nException on load: " + d + "||" + e);
    }
}
 
print("Done loading: " + new Date())

4. After 40 minutes, operations will be interrupted and visible in the shell as follows

Exception on load: <DATE> ||WriteError({
    "index": 0,
    "code" : 11601,
    "errmsg" : "operation was interrupted",
    ... 
})

Sprint: Sharding NYC 2022-05-30, Sharding NYC 2022-06-13
Participants:

 Description   

While performance testing FLE 2, we have observed some internal transactions are being interrupted during the load phase. In our simple tests, we use a single connection from the mongo shell and load one document at a time. After approximately 30 minutes, the logical session cache reaper interrupts FLE 2 internal transactions approximately every 5 minutes.

The same single connection is used through the load phase. The same lsid is used.

This issue can also be repro with the node.js driver.



 Comments   
Comment by Githook User [ 02/Jun/22 ]

Author:

{'name': 'Cheahuychou Mao', 'email': 'mao.cheahuychou@gmail.com', 'username': 'cheahuychou'}

Message: SERVER-66777 Ensure that internal transactions do not get interrupted by logical session reaper

(cherry picked from commit 01938cf7239dc4eb6a2fa79b31743cd815d4d92c)
Branch: v6.0
https://github.com/mongodb/mongo/commit/5e72b325cd851941661565d0f43c2a4397319098

Comment by Githook User [ 02/Jun/22 ]

Author:

{'name': 'Cheahuychou Mao', 'email': 'mao.cheahuychou@gmail.com', 'username': 'cheahuychou'}

Message: SERVER-66777 Ensure that internal transactions do not get interrupted by logical session reaper
Branch: master
https://github.com/mongodb/mongo/commit/01938cf7239dc4eb6a2fa79b31743cd815d4d92c

Comment by Judah Schvimer [ 27/May/22 ]

I'm not sure if this is needed or helpful, but are there knobs we could turn to repro this a lot faster?

Generated at Thu Feb 08 06:06:24 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.