[SERVER-62650] RecordStore RecordId initialization can deadlock transactions with cache eviction
Created: 14/Jan/22  Updated: 29/Oct/23  Resolved: 04/Feb/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.2.0, 5.1.2, 5.0.6
Fix Version/s: 5.3.0, 5.2.1, 5.0.7

Type: Bug
Priority: Major - P3
Reporter: Louis Williams
Assignee: Louis Williams
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-58409 Startup RecordId initialization is fl... Closed
Related
related to SERVER-60839 Introduce a TemporarilyUnavailable er... Closed
related to SERVER-61116 Audit and add assertions against usin... Backlog
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v5.2, v5.0, v4.4
Sprint: Execution Team 2022-02-07, Execution Team 2022-02-21
Participants:
Linked BF Score: 155

 Description   

There is a bug in our RecordId initialization that is described more generally by SERVER-61116. As a consequence, a very large multi-document transaction that consumes most of the cache can deadlock. In practice, this can only happen to the first transaction that writes to a given collection, because that first write is what triggers the lazy initialization.

We create a new WT_SESSION to call largest_key() to lazily initialize the highest RecordId for a collection (as of SERVER-58409). The calling thread may already hold another session that is pinning a large amount of data in the cache. If that transaction is pinning enough data, the largest_key() call can block waiting for cache space, but the session pinning that data cannot be rolled back, because it is owned by the same thread that is now blocked.
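
To make the dependency cycle concrete, here is a minimal toy model in Python. It does not use WiredTiger; ToyCache, blocks_forever, and the thread and session names are all hypothetical, and the only assumption is the one stated above: cache space can be reclaimed only by rolling back a session, and a session can be rolled back only if its owning thread is not the one that is blocked.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Hypothetical stand-in for the storage engine cache (not a WiredTiger API)."""
    capacity: int
    pinned: dict = field(default_factory=dict)  # session name -> bytes pinned

    def free_bytes(self) -> int:
        return self.capacity - sum(self.pinned.values())


def blocks_forever(cache: ToyCache, needed: int, waiting_thread: str, owner: dict) -> bool:
    """Return True if a request for `needed` bytes can never be satisfied.

    In this model, space is reclaimed only by rolling back sessions whose
    owning thread is not the thread that is waiting for cache space.
    """
    if cache.free_bytes() >= needed:
        return False
    reclaimable = sum(
        nbytes for session, nbytes in cache.pinned.items()
        if owner[session] != waiting_thread
    )
    return cache.free_bytes() + reclaimable < needed


cache = ToyCache(capacity=100, pinned={"txn_session": 95})
owner = {"txn_session": "thread-1"}

# The thread that owns the large transaction opens a second session and waits.
print(blocks_forever(cache, 10, "thread-1", owner))  # True
# A different thread waiting on the same cache could still be unblocked.
print(blocks_forever(cache, 10, "thread-2", owner))  # False
```

The first call models this ticket: the waiter and the owner of the pinned data are the same thread, so nothing can free the space.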

We should use an "operation_timeout_ms" here, as we did in SERVER-61097. This will cause the operation to receive a WT_ROLLBACK after a period of time, which we should propagate to the parent operation so that it can roll back and retry.
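
The proposed control flow can be sketched as follows. This is a simulation under stated assumptions, not the server code: RollbackError stands in for WT_ROLLBACK, init_highest_record_id and run_with_retry are hypothetical names, the clock is simulated instead of sleeping, and the retry limit is an illustrative choice that the ticket does not specify.

```python
class RollbackError(Exception):
    """Hypothetical stand-in for the WT_ROLLBACK return code."""


PLACEHOLDER_RECORD_ID = 0  # illustrative value only


def init_highest_record_id(cache_has_room, timeout_ms, poll_ms=10):
    """Wait at most timeout_ms for cache room instead of blocking forever."""
    waited = 0  # simulated clock, so the sketch runs instantly
    while not cache_has_room():
        if waited >= timeout_ms:
            raise RollbackError("timed out waiting for cache space")
        waited += poll_ms
    return PLACEHOLDER_RECORD_ID


def run_with_retry(operation, rollback_parent, max_attempts=3):
    """Run the parent operation; on rollback, release its state and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RollbackError:
            # Rolling back the parent transaction unpins its cached data.
            rollback_parent()
            if attempt == max_attempts:
                raise


state = {"pinned": True, "rollbacks": 0}


def rollback_parent():
    state["pinned"] = False
    state["rollbacks"] += 1


result = run_with_retry(
    lambda: init_highest_record_id(lambda: not state["pinned"], timeout_ms=1000),
    rollback_parent,
)
print(result, state["rollbacks"])  # 0 1
```

The sketch assumes the retried operation no longer pins the cache. A real retry would repeat the same writes, so a caller without a retry limit could loop indefinitely.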



 Comments   
Comment by Githook User [ 23/Mar/22 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-62650 Limit cache wait time when initializing RecordIds

(cherry picked from commit 14b9051a791865503f3b101a62c0903f5c15a4a8)
(cherry picked from commit 041dfbb36dddce27c8ef96cecb8ec259ca8f5054)
Branch: v5.0
https://github.com/mongodb/mongo/commit/e66d22908deffd85d31672456b934224b16811ee

Comment by Githook User [ 08/Feb/22 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-62650 Limit cache wait time when initializing RecordIds

(cherry picked from commit 14b9051a791865503f3b101a62c0903f5c15a4a8)
Branch: v5.2
https://github.com/mongodb/mongo/commit/041dfbb36dddce27c8ef96cecb8ec259ca8f5054

Comment by Githook User [ 04/Feb/22 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-62650 Limit cache wait time when initializing RecordIds
Branch: master
https://github.com/mongodb/mongo/commit/14b9051a791865503f3b101a62c0903f5c15a4a8

Comment by Githook User [ 02/Feb/22 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: Revert "SERVER-62650 Limit cache wait time when initializing RecordIds"

This reverts commit 6d509615d2d6ef7af38e1b982b6272a54e9b591c.
Branch: master
https://github.com/mongodb/mongo/commit/c45f8885f8c2aa87a8498add9228969600600de0

Comment by Githook User [ 01/Feb/22 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-62650 Limit cache wait time when initializing RecordIds
Branch: master
https://github.com/mongodb/mongo/commit/6d509615d2d6ef7af38e1b982b6272a54e9b591c

Comment by Louis Williams [ 24/Jan/22 ]

Actually, the more I think about it, SERVER-60839 won't help. The test will still occasionally fail in this passthrough suite that fuzzes the WT eviction settings. So I think the work in this ticket will be to allow this passthrough to fail with a WriteConflict when this does happen. I'm going to leave a TODO for SERVER-60839 to replace the usage of WriteConflict with the new error code.

Comment by Louis Williams [ 24/Jan/22 ]

Since this is a problem in a single-transaction environment with a small dirty cache, if we choose to throw a WriteConflictException in this scenario, the results are going to look very similar to SERVER-60839. If the caller doesn't handle the WCE, the test will fail, and if they do handle the error, they will just retry forever.

Instead of a write conflict, we should return the same error that we return in SERVER-60839. I'm marking this ticket as dependent on that.

One note: it's not clear that we can safely backport SERVER-60839 to 5.0 (or even 4.4) without potentially breaking application behavior, but if we can't, we'll have to revisit how to fix this in older branches.
