[SERVER-30644] Deadlock involving ViewCatalog mutex Created: 14/Aug/17  Updated: 27/Oct/23  Resolved: 26/Sep/19

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tess Avitabile (Inactive) Assignee: Xiangyu Yao (Inactive)
Resolution: Gone away Votes: 0
Labels: query-44-grooming, read-only-views
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-25890 Prevent user-initiated writes to the ... Closed
is related to SERVER-40630 Blacklist view_catalog FSM tests from... Closed
Operating System: ALL
Steps To Reproduce:

// Hangs roughly 1 in 10 times.
(function() {
    assert.writeOK(db.coll.insert({}));
    assert.commandWorked(db.createView("view", "coll", []));
    db.system.js.drop();
 
    var join1 = startParallelShell(
            "for(var i = 0; i < 1000; i++) { assert.writeOK(db.system.views.remove({$isolated: true, $where: function () {return true;}})); print(\"Statement 1\");}");
 
    var join2 = startParallelShell(
            "for(var i = 0; i < 1000; i++) { assert.writeOK(db.coll.remove({$where: function () {return true;}})); print(\"Statement 2\");}");
 
    join1();
    join2();
})();

Sprint: Execution Team 2019-10-07
Participants:

 Description   

The repro deadlocks in the following way:

  • Statement 1 takes a lock on db.system.views in MODE_X.
  • Statement 2 parses the $where, which causes a find to be run on db.system.js. Since there is no collection called db.system.js, we check if it is a view, which takes a mutex on the ViewCatalog.
  • Statement 1 parses the $where, which causes a find to be run on db.system.js. It waits for the mutex on the ViewCatalog.
  • Statement 2 attempts to iterate the DurableViewCatalog. It waits for a MODE_IS lock on db.system.views.
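
The four steps above form a cycle in the wait-for graph. The following is a minimal sketch of that cycle in plain JavaScript, not server code: the `operations` table, the resource labels, and `findDeadlock` are all hypothetical names introduced for illustration. It assumes each resource has a single holder, which is enough here because MODE_X conflicts with MODE_IS.

```javascript
// Hypothetical model of the state reached after the four steps in the
// description. Resource labels are illustrative, not server identifiers.
const VIEWS_LOCK = "lock on db.system.views";
const CATALOG_MUTEX = "ViewCatalog mutex";

const operations = {
  // Statement 1 holds db.system.views in MODE_X and waits for the mutex.
  statement1: { holds: [VIEWS_LOCK], waitsFor: CATALOG_MUTEX },
  // Statement 2 holds the mutex and waits for MODE_IS on db.system.views.
  statement2: { holds: [CATALOG_MUTEX], waitsFor: VIEWS_LOCK },
};

// Returns the list of operations that form a wait-for cycle, or null.
function findDeadlock(ops) {
  // Map each resource to the operation that currently holds it.
  const owner = new Map();
  for (const [name, op] of Object.entries(ops)) {
    for (const resource of op.holds) owner.set(resource, name);
  }
  // From each operation, follow "waits for a resource held by" edges until
  // the chain ends or revisits an operation already on the path.
  for (const start of Object.keys(ops)) {
    const path = [start];
    let current = start;
    while (true) {
      const wanted = ops[current].waitsFor;
      const next = wanted === null ? undefined : owner.get(wanted);
      if (next === undefined) break; // nobody holds it, so no blocking edge
      if (path.includes(next)) return path.slice(path.indexOf(next));
      path.push(next);
      current = next;
    }
  }
  return null;
}

console.log(findDeadlock(operations)); // [ 'statement1', 'statement2' ]
```

Neither operation can make progress, because each one waits for a resource that only the other can release.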

The deadlock was introduced in this commit, which changed DBClientCursor to use read commands by default, instead of OP_QUERY. When using OP_QUERY, if a collection does not exist, we do not check whether it is a view.



 Comments   
Comment by Xiangyu Yao (Inactive) [ 26/Sep/19 ]

This race condition should be gone after SERVER-40992.

"Statement 2 parses the $where, which causes a find to be run on db.system.js. Since there is no collection called db.system.js, we check if it is a view, which takes a mutex on the ViewCatalog."

SERVER-40992 made it so that this operation would take a MODE_IS lock on db.system.views before taking the mutex lock.
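
The effect of that change can be sketched as a lock-ordering check. This is a simplified illustration in plain JavaScript, not server code: `findOrderInversion` and the two acquisition tables are hypothetical, and the sketch assumes Statement 1's acquisition order is unchanged by SERVER-40992. It only detects pairwise inversions between two resources, which is all this ticket involves.

```javascript
// Hypothetical acquisition orders; labels are illustrative only.
const VIEWS_LOCK = "lock on db.system.views";
const CATALOG_MUTEX = "ViewCatalog mutex";

// Before SERVER-40992: the two statements take the resources in opposite order.
const beforeFix = {
  statement1: [VIEWS_LOCK, CATALOG_MUTEX],
  statement2: [CATALOG_MUTEX, VIEWS_LOCK],
};

// After SERVER-40992 (per the comment above): the view check takes a MODE_IS
// lock on db.system.views before taking the mutex.
const afterFix = {
  statement1: [VIEWS_LOCK, CATALOG_MUTEX],
  statement2: [VIEWS_LOCK, CATALOG_MUTEX],
};

// Returns a pair [first, second] that some operation acquires in the
// opposite order to an earlier operation, or null if orders are consistent.
function findOrderInversion(sequences) {
  const seen = new Set(); // "a>b" means some operation takes a before b
  for (const order of Object.values(sequences)) {
    for (let i = 0; i < order.length; i++) {
      for (let j = i + 1; j < order.length; j++) {
        if (seen.has(`${order[j]}>${order[i]}`)) return [order[i], order[j]];
        seen.add(`${order[i]}>${order[j]}`);
      }
    }
  }
  return null;
}

console.log(findOrderInversion(beforeFix)); // [ 'ViewCatalog mutex', 'lock on db.system.views' ]
console.log(findOrderInversion(afterFix));  // null
```

With a single consistent order, Statement 2 blocks on db.system.views before it ever holds the mutex, so it cannot hold one resource while waiting for the other in the reverse direction.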

Comment by Kevin Duong [ 18/Aug/17 ]

james.wahlin – this was done and added to the future sprint.

Comment by James Wahlin [ 18/Aug/17 ]

ian.whalen - can you add this to the next sprint when created?
