[SERVER-77018] Deadlock between dbStats and 2 index builds Created: 10/May/23  Updated: 29/Oct/23  Resolved: 17/May/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.0-rc0, 6.3.1
Fix Version/s: 7.1.0-rc0, 6.3.2, 6.0.7, 5.0.19, 7.0.0-rc2

Type: Bug Priority: Critical - P2
Reporter: Fausto Leyva (Inactive) Assignee: Yujin Kang Park
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File server-77018.repro    
Issue Links:
Backports
Depends
Documented
is documented by DOCS-16132 Investigate changes in SERVER-77018: ... Closed
Related
is related to WT-11085 Make cursors consistently return EBUS... Closed
is related to SERVER-76991 Create a "kitchen sink" suite Open
Assigned Teams:
Storage Execution
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Requested:
v7.0, v6.3, v6.0, v5.0
Sprint: Execution Team 2023-05-29
Participants:

 Description   

If an ongoing index build (IndexBuild_1 below) yields its locks after initiating a bulk insert (which is initialized here), it still holds the write lock on the index table at the WiredTiger level. If a dbStats command then comes in, it takes a collection-level MODE_IS lock and attempts to acquire a read lock on the ident the index build is currently writing to, but cannot, since IndexBuild_1 holds the exclusive lock on that ident. (In collection_impl.cpp we iterate through the unfinished indexes, which is how dbStats can see the in-progress index table.)
  

The problem arises when another operation comes in and prevents IndexBuild_1 from re-acquiring its lock, like another index build that enqueues a collection MODE_X lock. These events can produce a deadlock in the system represented by:

dbStats
  • holds [Global, DB, Coll] locks - MODE_IS
  • waiting on read lock of table:index-X

IndexBuild_0
  • holds [Global, DB] locks - MODE_IX
  • waiting for MODE_X coll lock
  • blocks IndexBuild_1

IndexBuild_1
  • acquired [Global, DB, Coll] locks - MODE_IX, then yields MDB level locks
  • holds write lock on table:index-X
  • waiting to reacquire locks

 
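To make the cycle explicit, here is a minimal Python sketch of the three operations. It is a toy model, not server code: the `FairLock` class, the `WAITS_FOR` map and `find_cycle` are hypothetical names, and the sketch assumes a FIFO-fair lock queue in which a new request waits if it conflicts with either a granted mode or a request that is already queued. Under that assumption, IndexBuild_1's MODE_IX request is stuck behind IndexBuild_0's queued MODE_X even though MODE_IX is compatible with the MODE_IS that dbStats holds.

```python
# Standard multi-granularity lock compatibility matrix (symmetric).
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}


class FairLock:
    """Toy FIFO-fair lock (assumption): a request waits if it conflicts with
    any granted mode OR with any request already waiting in the queue."""

    def __init__(self):
        self.granted = []  # (owner, mode) pairs currently holding the lock
        self.waiting = []  # (owner, mode) pairs queued in arrival order

    def request(self, owner, mode):
        conflicts_granted = any(mode not in COMPATIBLE[m] for _, m in self.granted)
        conflicts_waiting = any(mode not in COMPATIBLE[m] for _, m in self.waiting)
        if conflicts_granted or conflicts_waiting:
            self.waiting.append((owner, mode))
            return "waiting"
        self.granted.append((owner, mode))
        return "granted"


# Collection lock requests, in the order described in the ticket.
coll = FairLock()
results = {
    "dbStats": coll.request("dbStats", "IS"),
    "IndexBuild_0": coll.request("IndexBuild_0", "X"),
    "IndexBuild_1": coll.request("IndexBuild_1", "IX"),  # reacquire after yield
}

# Wait-for edges: each operation maps to the operation blocking it.
WAITS_FOR = {
    "dbStats": "IndexBuild_1",       # read lock on table:index-X
    "IndexBuild_1": "IndexBuild_0",  # queued behind the pending MODE_X
    "IndexBuild_0": "dbStats",       # MODE_X conflicts with granted MODE_IS
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = [start]
    node = waits_for.get(start)
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None
    return path[path.index(node):] + [node]


cycle = find_cycle(WAITS_FOR, "dbStats")
print(results)
print(" -> ".join(cycle))
```

Removing any one edge breaks the cycle. The fix in this ticket removes the dbStats edge, since dbStats no longer opens a cursor on the in-progress index table.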

Original explanation by Suganthi here



 Comments   
Comment by Githook User [ 13/Jun/23 ]

Author: Yu Jin Kang Park (username: ykangpark, email: yujin.kang@mongodb.com)

Message: SERVER-77018 Remove in-progress index builds from getIndexFreeStorageBytes
Branch: v5.0
https://github.com/mongodb/mongo/commit/cf9a9cd35d378aece4c5abf93fd19d82bfda568a

Comment by Githook User [ 13/Jun/23 ]

Author: Yu Jin Kang Park (username: ykangpark, email: yujin.kang@mongodb.com)

Message: SERVER-77018 Remove in-progress index builds from getIndexFreeStorageBytes
Branch: v6.0
https://github.com/mongodb/mongo/commit/da172777c1d5e54646dc0337fe6c60dbdba9af89

Comment by Githook User [ 24/May/23 ]

Author: Yu Jin Kang Park (username: ykangpark, email: yujin.kang@mongodb.com)

Message: SERVER-77018 Remove in-progress index builds from getIndexFreeStorageBytes
Branch: v6.3
https://github.com/mongodb/mongo/commit/daeee70266403433782d5377dc2a89c62d230107

Comment by Githook User [ 19/May/23 ]

Author: Yu Jin Kang Park (username: ykangpark, email: yujin.kang@mongodb.com)

Message: SERVER-77018 Remove in-progress index builds from getIndexFreeStorageBytes
Branch: v7.0
https://github.com/mongodb/mongo/commit/ade4ec067529fa461ab5189c161626b30311a75b

Comment by Yujin Kang Park [ 17/May/23 ]

Requesting backports back to v5.0. I have verified that the bug is possible up to that version.

v4.4 and older versions don't have the problematic freeStorage option, and will not deadlock.

Comment by Yujin Kang Park [ 17/May/23 ]

Fixed by removing in-progress builds from 'indexFreeStorageSize' in dbStats. Hopefully, long-term WT-11085 will improve the behaviour when both a bulk cursor and read cursor are opened concurrently, and the read cursor will consistently return EBUSY instead of sometimes blocking waiting for the lock.
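The shape of the fix can be sketched in Python as follows. This is only an illustration of the idea, not the actual C++ change: `IndexEntry`, its fields and `index_free_storage_bytes` are hypothetical names, and `free_bytes` stands in for whatever statistic the storage engine would return.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    ident: str       # storage-engine table name (hypothetical values below)
    ready: bool      # False while the index build is still in progress
    free_bytes: int  # hypothetical stand-in for the storage-engine statistic


def index_free_storage_bytes(entries):
    """Sum free bytes over completed indexes only, so the table of an
    in-progress build is never opened for statistics."""
    return sum(e.free_bytes for e in entries if e.ready)


entries = [
    IndexEntry("index-1", ready=True, free_bytes=4096),
    IndexEntry("index-X", ready=False, free_bytes=8192),  # in-progress build
]
total = index_free_storage_bytes(entries)
print(total)
```

The trade-off is that the reported value omits any free space in tables that are still being built.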

Comment by Githook User [ 16/May/23 ]

Author:

{'name': 'Yu Jin Kang Park', 'email': 'yujin.kang@mongodb.com', 'username': 'ykangpark'}

Message: SERVER-77018 Remove in-progress index builds from getIndexFreeStorageBytes
Branch: master
https://github.com/mongodb/mongo/commit/f1b417202d283df43bd3f9833726097297d34f63

Comment by Louis Williams [ 16/May/23 ]

Just a note: this bug requires the caller to pass the freeStorage: true option to dbStats (the default is 'false'). This is probably an issue that only affects Serverless, because they use this option.
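For reference, the two forms of the command differ only in that option. The Python sketch below just builds the command documents; `build_db_stats_command` is a hypothetical helper, and the PyMongo call in the trailing comment assumes a running server and a database named "test".

```python
def build_db_stats_command(free_storage=False):
    """Build a dbStats command document; freeStorage is added only on request."""
    command = {"dbStats": 1}
    if free_storage:
        command["freeStorage"] = True
    return command


default_cmd = build_db_stats_command()                    # not affected
affected_cmd = build_db_stats_command(free_storage=True)  # can hit the deadlock

print(default_cmd)
print(affected_cmd)

# With PyMongo and a running server (assumption), this could be sent as:
#   from pymongo import MongoClient
#   MongoClient()["test"].command(affected_cmd)
```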

Comment by Yujin Kang Park [ 15/May/23 ]

Uploading reproducer server-77018.repro, which forces the condition described above by disabling the special flag (WT_BTREE_BULK) check mentioned in the previous comment by Suganthi.

Comment by Suganthi Mani [ 15/May/23 ]

Reposting my slack comment here about WT intricacies on open cursor.

Upon inspecting the WT code, I found that if the index build coordinator thread tries to open the bulk cursor at the same time as dbStats tries to open the stat cursor on the index file, there is a chance that we bypass this if block and enter a blocking wait state.

More details: I found that the dhandle for a given URI (i.e., table) is cached in the WT session cache/connection and is shared across multiple sessions. This means the index build coordinator thread and the dbStats thread will point to the same dhandle if they are trying to open a cursor on the same index file.

Generally, when a cursor is opened, it performs the following steps:

  1. Read the session cache/connection to see if the dhandle is already present. Otherwise, create a new dhandle and cache it.
  2. Do some fast-path checks, such as checking whether a special flag (WT_BTREE_BULK) is set in the dhandle, in which case EBUSY is returned.
  3. Acquire the read or write lock, depending on the operation (this is a blocking call). For example, a bulk cursor gets the write lock and dbStats gets the read lock.
  4. Open the dhandle and set the appropriate flags. For example, a bulk cursor sets a special flag called WT_BTREE_BULK.

We saw “Device or resource busy” in our repro because we ran dbStats after the index build thread had opened the bulk cursor (i.e., finished step 4). That is why the step 2 check triggered for the dbStats thread and returned “Device or resource busy”.
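The two orderings can be contrasted with a small Python sketch. It is a toy model of the steps above rather than WiredTiger code: `DHandle` and the helper functions are hypothetical, the handle lock is simplified to a plain mutex, and the stat cursor's lock attempt is made non-blocking so that the sketch reports "would block" instead of hanging.

```python
import threading


class DHandle:
    """Toy stand-in for a shared data handle (not the real WT structure)."""

    def __init__(self):
        self.bulk_flag = False        # models WT_BTREE_BULK
        self.lock = threading.Lock()  # models the handle lock (simplified)


def fast_path_check(dh):
    """Step 2: fail fast with EBUSY if the bulk flag is already set."""
    return "EBUSY" if dh.bulk_flag else "ok"


def bulk_open(dh):
    """Steps 3 and 4 for the bulk cursor: take the lock, then set the flag."""
    dh.lock.acquire()
    dh.bulk_flag = True


def stat_try_lock(dh):
    """Step 3 for the stat cursor, non-blocking so the sketch cannot hang."""
    if dh.lock.acquire(blocking=False):
        dh.lock.release()
        return "acquired"
    return "would block"


def stat_after_bulk_open():
    """The repro ordering: dbStats arrives after the bulk cursor finished step 4."""
    dh = DHandle()
    bulk_open(dh)
    return fast_path_check(dh)


def stat_races_bulk_open():
    """The racy ordering: the stat cursor passes step 2 before the flag is set."""
    dh = DHandle()
    check = fast_path_check(dh)  # flag not set yet, so the check passes
    bulk_open(dh)                # bulk cursor now holds the lock
    return check, stat_try_lock(dh)


print(stat_after_bulk_open())
print(stat_races_bulk_open())
```

In the second ordering the real stat cursor would block in step 3 while still holding the collection MODE_IS lock, which is the dbStats side of the deadlock.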

Comment by Eric Milkie [ 11/May/23 ]

My guess is that this affects 6.0 as well. This is important since we are running the free and shared tiers on 6.0 right now.

Comment by Fausto Leyva (Inactive) [ 11/May/23 ]

In both HELP tickets, we encountered this deadlock while on version 6.3.

I think it's safe to assume this is possible to hit on 7.0 since the main prerequisite for this deadlock is an index build yielding while holding onto the dhandle (of the index table it is writing to) in exclusive mode.  

Comment by Josef Ahmad [ 11/May/23 ]

What versions are affected?

Generated at Thu Feb 08 06:34:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.