[SERVER-66145] Identify and fix locations that write while holding read tickets Created: 03/May/22  Updated: 29/Oct/23  Resolved: 16/Dec/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Bug Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-67695 Remove the write-blocking mode of dbC... Closed
Related
related to SERVER-60621 Investigate if we can ban upgrading t... Closed
is related to SERVER-66719 dbCheck FCV lock upgrade causes deadl... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2022-12-12, Execution Team 2022-11-28, Execution Team 2022-12-26
Participants:
Linked BF Score: 169

 Description   

Due to our support of recursive locking, it is possible for an operation to hold a global IS, acquiring a read ticket, and then obtain an IX lock at a lower level without taking a write ticket.

This test passes:

// d_concurrency_test.cpp
 
TEST_F(DConcurrencyTestFixture, WrongTicket) {
    auto clientOpctxPairs = makeKClientsWithLockers(1);
    auto opctx1 = clientOpctxPairs[0].second.get();
 
    Lock::GlobalLock read(opctx1, MODE_IS);
    ASSERT_TRUE(opctx1->lockState()->hasReadTicket());
    ASSERT_FALSE(opctx1->lockState()->hasWriteTicket());
    ASSERT_FALSE(opctx1->lockState()->isWriteLocked());
 
    Lock::GlobalLock write(opctx1, MODE_IX);
 
    ASSERT_TRUE(opctx1->lockState()->hasReadTicket());
    ASSERT_FALSE(opctx1->lockState()->hasWriteTicket());
    ASSERT_TRUE(opctx1->lockState()->isWriteLocked());
}

We can discover violations of this with the following assertion:

diff --git a/src/mongo/db/concurrency/lock_state.cpp b/src/mongo/db/concurrency/lock_state.cpp
index a968d2eee6e..2d22ebf6790 100644
--- a/src/mongo/db/concurrency/lock_state.cpp
+++ b/src/mongo/db/concurrency/lock_state.cpp
@@ -409,6 +409,11 @@ void LockerImpl::lockGlobal(OperationContext* opCtx, LockMode mode, Date_t deadl
                     _acquireTicket(opCtx, mode, deadline));
         }
         _modeForTicket = mode;
+    } else if (!isModeCovered(mode, _modeForTicket)) {
+        LOGV2_FATAL(99999,
+                    "Ticket held does not cover requested mode for global lock",
+                    "held"_attr = _modeForTicket,
+                    "requested"_attr = mode);
     }

This is mostly an accounting problem since we don't use tickets to block reads or writes. We only use tickets for throttling reads and writes, and most applications do not hit this limit.



 Comments   
Comment by Githook User [ 16/Dec/22 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-66145 Prevent operations from writing with read tickets

Since we do not support ticket upgrades, this effectively bans global lock upgrades
Branch: master
https://github.com/mongodb/mongo/commit/11b5cafdb964e68eb892e413ee893bdb20a7574e

Comment by Louis Williams [ 30/Jun/22 ]

The only remaining place where we do lock upgrades appears to be dbCheck. There is a mode of dbCheck that uses an S lock and acquires an IX lock to write the oplog entry. I believe we should first remove support for that mode of dbCheck. I opened SERVER-67695 for that work.

Comment by Louis Williams [ 03/May/22 ]

If we banned lock upgrades, as is suggested by SERVER-60621, I believe that would effectively solve this problem.

Generated at Thu Feb 08 06:04:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.