[SERVER-40535] Possibility to get a non-existent key if using ReadConcern level:local when reading signing keys in ReplicaSet Created: 08/Apr/19  Updated: 29/Oct/23  Resolved: 20/Jun/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.12, 4.0.8
Fix Version/s: 4.0.11, 4.2.0-rc3, 4.3.1

Type: Bug Priority: Major - P3
Reporter: Misha Tyulenev Assignee: Misha Tyulenev
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File server-40535.diff     File test.js    
Issue Links:
Backports
Problem/Incident
causes SERVER-52955 KeysCollectionClientDirect should che... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2, v4.0, v3.6
Sprint: Sharding 2019-05-06, Repl 2019-06-03, Sharding 2019-06-17, Sharding 2019-07-01
Participants:
Case:

 Description   

In one possible scenario, the admin.system.keys collection can diverge, so the customer receives a signing key that does not exist, which causes errors in query processing.
The proposed fix is to use ReadConcern level:majority when reading keys.
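
As a rough illustration of the proposed fix, the sketch below builds a `find` command document for the keys collection with an explicit read concern level. The helper name `buildKeysFindCommand` and the `purpose: "HMAC"` filter are assumptions made for this example; the actual change lives in the server's C++ code and is not reproduced here.

```javascript
// Sketch only: builds a `find` command document for admin.system.keys.
// `buildKeysFindCommand` is a hypothetical helper, and the filter on
// `purpose: "HMAC"` is an assumption for illustration.
function buildKeysFindCommand(readConcernLevel) {
    const allowed = ["local", "majority"];
    if (!allowed.includes(readConcernLevel)) {
        throw new Error("unsupported read concern level: " + readConcernLevel);
    }
    return {
        find: "system.keys",
        filter: {purpose: "HMAC"},
        // "majority" only returns documents acknowledged by a majority of
        // members, so a key that is later rolled back is not observed.
        readConcern: {level: readConcernLevel},
    };
}

const majorityCmd = buildKeysFindCommand("majority");
```

In the mongo shell, a document like this could be passed to `db.getSiblingDB("admin").runCommand(majorityCmd)`, assuming the connected user is allowed to read the collection.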



 Comments   
Comment by Danny Hatcher (Inactive) [ 22/Jul/19 ]

We have decided not to backport this ticket to 3.6. Due to the way key generation is written in 3.6, it would be a significantly larger code change to backport to that version than it was to backport to 4.0. Additionally, the fix described in this ticket only resolves scenarios in which Read Concern "Majority" is enabled.

As the driver automatically corrects the problem after receiving the error, we encourage users to retry the operation.
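
A minimal sketch of such a retry, written in synchronous mongo-shell style. The helper `retryOnError` and the predicate `isKeyNotFound` are hypothetical names, and matching on `codeName === "KeyNotFound"` is an assumption about how the error surfaces to the caller; adjust the predicate to whatever error your driver actually reports.

```javascript
// Sketch only: retry an operation a bounded number of times when it fails
// with an error the caller considers retryable.
function retryOnError(operation, isRetryable, maxAttempts) {
    if (!(maxAttempts >= 1)) {
        throw new RangeError("maxAttempts must be at least 1");
    }
    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return operation(attempt);
        } catch (e) {
            if (!isRetryable(e)) {
                throw e;  // unrelated failure: do not mask it
            }
            lastError = e;
        }
    }
    throw lastError;  // every attempt failed with a retryable error
}

// Assumption: the failure is identifiable by this codeName.
function isKeyNotFound(e) {
    return Boolean(e) && e.codeName === "KeyNotFound";
}
```

A caller would wrap the failing read or write, for example `retryOnError(() => coll.findOne({_id: 1}), isKeyNotFound, 3)`, where `coll` stands for any collection handle.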

Comment by Githook User [ 15/Jul/19 ]

Author:

{'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com', 'username': 'mikety'}

Message: SERVER-40535 follow-up fix of the failing test

(cherry picked from commit 7d88bdb226e8a3dc9b5eb4b57edcca111619c5f9)
Branch: v4.0
https://github.com/mongodb/mongo/commit/30d68a1b356c81b60b868d2917fc9d82640ecf02

Comment by Githook User [ 15/Jul/19 ]

Author:

{'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com', 'username': 'mikety'}

Message: SERVER-40535 read signing keys with readConcern level majority

(cherry picked from commit 1d158cabb504fa9dba3ed0f0688cdf14cb7b0cba)
Branch: v4.0
https://github.com/mongodb/mongo/commit/50b5cbacfcde381000308c75df2971fd324009d4

Comment by Githook User [ 03/Jul/19 ]

Author:

{'name': 'Misha Tyulenev', 'username': 'mikety', 'email': 'misha@mongodb.com'}

Message: SERVER-40535 follow-up fix of the failing test

(cherry picked from commit 7d88bdb226e8a3dc9b5eb4b57edcca111619c5f9)
Branch: v4.2
https://github.com/mongodb/mongo/commit/dacd09a03b87a0dda83a5aee398f00eb295159aa

Comment by Githook User [ 03/Jul/19 ]

Author:

{'name': 'Misha Tyulenev', 'username': 'mikety', 'email': 'misha@mongodb.com'}

Message: SERVER-40535 follow-up fix of the failing test
Branch: master
https://github.com/mongodb/mongo/commit/7d88bdb226e8a3dc9b5eb4b57edcca111619c5f9

Comment by Misha Tyulenev [ 28/Jun/19 ]

mark.brinsmead, when majority reads are disabled, the system can still have this bug.

Comment by Githook User [ 21/Jun/19 ]

Author:

{'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com', 'username': 'mikety'}

Message: SERVER-40535 read signing keys with readConcern level majority

(cherry picked from commit 1d158cabb504fa9dba3ed0f0688cdf14cb7b0cba)
Branch: v4.2
https://github.com/mongodb/mongo/commit/2d84897bb063be790a2610191f184b8f0805f595

Comment by Githook User [ 20/Jun/19 ]

Author:

{'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com', 'username': 'mikety'}

Message: SERVER-40535 read signing keys with readConcern level majority
Branch: master
https://github.com/mongodb/mongo/commit/1d158cabb504fa9dba3ed0f0688cdf14cb7b0cba

Comment by Judah Schvimer [ 04/Jun/19 ]

testingSnapshotBehaviorInIsolation prevents the stableTimestamp from advancing which prevents majority reads from being available. This is correct expected behavior. I'd recommend adding a failpoint to work around this behavior or talking to the storage team about how to maintain the coverage of that test with your change.

Comment by Misha Tyulenev [ 04/Jun/19 ]

I built the test case based on the failures in the patch run, and this setting is part of that test.

Comment by Judah Schvimer [ 04/Jun/19 ]

Why are you turning on testingSnapshotBehaviorInIsolation?

Comment by Misha Tyulenev [ 04/Jun/19 ]

judah.schvimer, I attached the git patch and test.js. I ran it in the no_passthrough suite. server-40535.diff test.js

Comment by Judah Schvimer [ 03/Jun/19 ]

misha.tyulenev, I tried reproducing this issue with the script below, but it did not reproduce. Can you please provide a repro script? My attempt:

const rst = new ReplSetTest({nodes: 1, nodeOptions: {enableMajorityReadConcern: ''}});
rst.startSet();
rst.initiate();
rst.stopSet();

Comment by Misha Tyulenev [ 29/May/19 ]

judah.schvimer, to fix the issue in this ticket I need to change the read concern to RC majority here. Once I make this change, the

ReplicaSet({nodes:1, nodeOptions: { enableMajorityReadConcern: ''}});

fails to start and initiate because it hangs waiting for RC majority.

Comment by Misha Tyulenev [ 24/May/19 ]

Over to the repl team to investigate why readConcern majority reads are not possible once the transition to primary has completed on a one-node RS.

Comment by Misha Tyulenev [ 29/Apr/19 ]

ankur.raina I will be working on this fix and plan to push the changes to 3.6 within two weeks.

Comment by Misha Tyulenev [ 18/Apr/19 ]

renctan you are correct - the keyGenerator needs readConcern local or it will get stuck when trying to check for keys.

Comment by Randolph Tan [ 08/Apr/19 ]

I took a look at the code again, and it made me realize that we want read concern majority for KeysCollectionCache but local for the KeyGenerator. Currently, they both share the same opCtx, so we should be careful not to contaminate the opCtx while setting the read concern.
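
One way to picture the concern about a shared opCtx is a scoped override that always restores the previous read concern. The sketch below is JavaScript rather than the server's C++, `opCtx` is a plain object standing in for the real operation context, and `withReadConcern` is a hypothetical helper; it only illustrates the idea, not the server's implementation.

```javascript
// Sketch only: temporarily set a read concern on a shared context and
// restore the previous value afterwards, even if the callback throws.
function withReadConcern(opCtx, level, fn) {
    const previous = opCtx.readConcern;
    opCtx.readConcern = level;
    try {
        return fn(opCtx);
    } finally {
        opCtx.readConcern = previous;  // avoid leaking the override to other users
    }
}

// Stand-in for the context shared by the cache and the key generator.
const sharedOpCtx = {readConcern: "local"};

// The cache-style read sees "majority" only inside the callback.
const levelSeenByCache = withReadConcern(sharedOpCtx, "majority", (ctx) => ctx.readConcern);
```

After the call returns, `sharedOpCtx.readConcern` is back to its original value, so a key-generator-style caller using the same context afterwards is not affected.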

Generated at Thu Feb 08 04:55:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.