[SERVER-44033] Only take the global write lock for applyOps when there are preConditions, multiple DDL commands or nested applyOps Created: 15/Oct/19  Updated: 10/Dec/21  Resolved: 05/Nov/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Lingzhi Deng Assignee: Gregory Wlodarek
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on SERVER-44389 Investigate whether the 'preCondition... Closed
depends on SERVER-44390 Investigate whether we can run one DD... Closed
Issue split
split from SERVER-43242 Deadlock involving commands acquiring... Closed
Backport Requested:
v4.2
Sprint: Execution Team 2019-11-04, Execution Team 2019-11-18
Participants:
Case:

 Description   

Remove the global X lock acquisition for applyOps. Acquiring the global lock in X mode can be blocked by prepared transactions, and the enqueued global X lock request can in turn block oplog queries, which need the global IS lock. If those oplog queries and data replication are needed to satisfy the prepared transaction's write concern, then neither the prepared transaction nor replication can make progress, and a deadlock occurs.
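
The circular wait described above can be sketched as a small wait-for graph. This is only an illustration: the node names (`applyOps`, `prepared_txn`, `oplog_query`) and the `find_cycle` helper are hypothetical labels invented for this sketch, not MongoDB identifiers or server code.

```python
# Hypothetical wait-for graph: an edge A -> B means "A cannot proceed until B does".
# Node names are illustrative labels only, not server identifiers.
WAITS_FOR = {
    # applyOps wants the global X lock, which a prepared transaction blocks.
    "applyOps": ["prepared_txn"],
    # The oplog query wants global IS but queues behind the pending X request.
    "oplog_query": ["applyOps"],
    # The prepared transaction's write concern needs the oplog query (replication).
    "prepared_txn": ["oplog_query"],
}


def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    visiting = []  # current depth-first path
    done = set()   # nodes fully explored without finding a cycle

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Found a back edge: slice out the cycle and close it.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(WAITS_FOR))
```

Removing any one edge breaks the cycle, which is why both options in this ticket (not taking the X lock, or ensuring the command is not blocked by prepared transactions) would avoid the deadlock in this simplified model.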

Alternatively, if removing the global X lock is not an option, deprecate the command or make sure it cannot be blocked by prepared transactions.



 Comments   
Comment by Gregory Wlodarek [ 05/Nov/19 ]

After discussing with both milkie and geert.bosch, we have concluded that removing the global write lock for applyOps is too risky given the tradeoffs.

Because this is an internal command, users should not be running it during normal operation of their database. Our tools that use this command today do not have any active transactions while running applyOps, so we don't expect them to deadlock.

Comment by Gregory Wlodarek [ 04/Nov/19 ]

lingzhi.deng we've decided to reduce the scope of this ticket to what is currently viable while we continue to investigate the remaining work needed to fully remove the exclusive global lock for applyOps.

Here's our plan:

Continue to take an exclusive global lock if:

  • There are multiple DDL commands per 'applyOps'.
  • There are nested 'applyOps'.
  • There is any 'preCondition' specified.

In this ticket, we'll avoid taking an exclusive global lock if:

  • We're only running CRUD operations.
  • There is only one DDL command in the 'applyOps' command.
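
The rule in the two lists above can be sketched as a small predicate. This is a rough illustration, not the server implementation: the function name `needs_global_x_lock` is hypothetical, and it assumes each operation is an oplog-style dict in which `"op": "c"` marks a command (treated here as DDL) and a nested applyOps appears as a command whose `"o"` document has an `"applyOps"` key.

```python
def needs_global_x_lock(ops, pre_condition=None):
    """Decide whether an applyOps call would still take the global X lock.

    Hypothetical sketch of the plan in this comment. Assumes oplog-style
    entries where "op" == "c" marks a command (counted as DDL here).
    """
    # Any 'preCondition' keeps the exclusive global lock.
    if pre_condition:
        return True

    ddl_count = 0
    for op in ops:
        if op.get("op") != "c":
            continue  # CRUD operation (insert/update/delete): no X lock needed
        if "applyOps" in op.get("o", {}):
            return True  # nested applyOps keeps the exclusive global lock
        ddl_count += 1

    # A single DDL command is allowed; more than one keeps the global X lock.
    return ddl_count > 1


if __name__ == "__main__":
    crud_only = [{"op": "i", "ns": "test.coll", "o": {"_id": 1}}]
    two_ddl = [
        {"op": "c", "ns": "test.$cmd", "o": {"create": "a"}},
        {"op": "c", "ns": "test.$cmd", "o": {"create": "b"}},
    ]
    print(needs_global_x_lock(crud_only))  # CRUD only
    print(needs_global_x_lock(two_ddl))    # multiple DDL commands
```

Checking `preCondition` first keeps the conservative cases together, so anything the sketch does not recognize as CRUD or a single DDL command falls back to taking the lock.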

I've linked in the tickets which involve further investigation.
