[SERVER-27831] Deadlock when listing collections on "local" database with replication enabled for KVCatalog-based storage engines without document locking
Created: 27/Jan/17 | Updated: 06/Dec/17 | Resolved: 29/Mar/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 3.4.7, 3.5.6 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Daniel Gottlieb (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | bkp, disabled-test, todo_in_code |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v3.4 |
| Steps To Reproduce: | Apply the following patch and run the create_database.js FSM workload against the ephemeralForTest storage engine. |
| Sprint: | Storage 2017-03-27, Storage 2017-04-17 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
Storage engines which use the KVCatalog, but do not support document-level concurrency (such as the ephemeralForTest storage engine) may experience deadlock because of the incompatible acquisition order of the resourceIdCatalogMetadata and the "local" database locks. This bug does not apply to the MMAPv1 or WiredTiger storage engines.
The resourceIdCatalogMetadata lock is acquired when KVCatalog::newCollection() is called as part of Database::createCollection(). However, the lock is not released on return from that call: because it was acquired in MODE_X inside a WriteUnitOfWork, shouldDelayUnlock() returns true and the release is deferred.
As a result, the client creating a collection holds the resourceIdCatalogMetadata lock while attempting to acquire a lock on the "local" database in order to write the corresponding oplog entry. At the same time, the client listing collections on the "local" database holds a lock on the "local" database while attempting to acquire the resourceIdCatalogMetadata lock to get the individual collection options. Each client waits for a lock the other holds.
git version: ae04822985f2478c7da1e6821f5fc91b484b9555 |
| Comments |
| Comment by Githook User [ 05/Jul/17 ] |
|
Author: Daniel Gottlieb (dgottlieb) <daniel.gottlieb@mongodb.com>
Message: RecordStores that don't implement document level locking are typically
This patch forces the thread-safety requirement into the RecordStore.
(cherry picked from commit 71a149b45c8bb019cbc8179f4a411be66bda2062) |
| Comment by Githook User [ 29/Mar/17 ] |
|
Author: Daniel Gottlieb (dgottlieb) <daniel.gottlieb@10gen.com>
Message: RecordStores that don't implement document level locking are typically
This patch forces the thread-safety requirement into the RecordStore. |
| Comment by Eric Milkie [ 28/Feb/17 ] |
|
It's also possible to deadlock createCollection with itself, when creating collections in the local database, for the same reason that listing collections in the local database can trigger a deadlock: both operations hold the local database lock while trying to lock the resourceIdCatalogMetadata lock. |
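
The lock-ordering conflict from the Description can be illustrated with a small model. This is only a sketch in Python, not MongoDB code: the `OPERATIONS` table, the helper names `lock_order_edges` and `find_inversions`, and the reduction of each operation to an ordered list of two locks are all assumptions made for illustration. It checks for pairs of locks that different operations acquire in opposite orders, which is the condition the ticket describes.

```python
# Hypothetical model: each operation is reduced to the ordered list of locks
# it acquires. Lock names mirror the ticket; everything else is illustrative.
CATALOG = "resourceIdCatalogMetadata"
LOCAL_DB = "local"

OPERATIONS = {
    # createCollection still holds the catalog lock (delayed unlock inside the
    # WriteUnitOfWork) when it locks "local" to write the oplog entry.
    "createCollection": [CATALOG, LOCAL_DB],
    # listCollections on "local" holds the database lock, then takes the
    # catalog lock to read each collection's options.
    "listCollections(local)": [LOCAL_DB, CATALOG],
}


def lock_order_edges(operations):
    """Map (held, wanted) lock pairs to the operations acquiring in that order."""
    edges = {}
    for name, locks in operations.items():
        for i, held in enumerate(locks):
            for wanted in locks[i + 1:]:
                edges.setdefault((held, wanted), []).append(name)
    return edges


def find_inversions(operations):
    """Return lock pairs that some operations acquire in opposite orders."""
    edges = lock_order_edges(operations)
    inversions = []
    for (a, b), ops in edges.items():
        # Report each unordered pair once, keyed by the sorted lock names.
        if a < b and (b, a) in edges:
            inversions.append(((a, b), ops, edges[(b, a)]))
    return inversions


if __name__ == "__main__":
    for pair, forward, backward in find_inversions(OPERATIONS):
        print(pair, "taken in order by", forward, "and in reverse by", backward)
```

Under the same assumptions, adding an entry such as `"createCollection(local)": [LOCAL_DB, CATALOG]` models the case in Eric Milkie's comment: it lands on the same side of the inversion as listing collections on "local".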