[SERVER-41464] setInitialSyncFlag and clearInitialSyncFlag dbtests should not call waitUntilDurable() with lock held Created: 03/Jun/19  Updated: 29/Oct/23  Resolved: 25/Oct/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.3.1

Type: Task Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Gregory Wlodarek
Resolution: Fixed Votes: 0
Labels: former-quick-wins
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-39591 RecoveryUnit::waitUntilDurable() shou... Closed
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2019-11-04
Participants:

 Description   

setInitialSyncFlag and clearInitialSyncFlag both write a document and then call waitUntilDurable(). We hold a lock while calling these functions because a document write requires a lock. However, callers of waitUntilDurable() should not hold locks, since the call may perform significant I/O.

In SERVER-39591 we wish to add an invariant to waitUntilDurable() that fails when the caller holds a lock. This task blocks that work.
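A minimal sketch of the intended pattern, assuming hypothetical stand-ins throughout: the namespace, the lock-tracking counter, and every function below are illustrative and are not MongoDB's actual types or signatures. The write happens inside a scoped lock, the lock is released, and only then is durability awaited. The stand-in for waitUntilDurable() throws where the real code would presumably fire an invariant.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-ins; none of these are MongoDB's real types or functions.
namespace sketch {

std::mutex collectionMutex;      // stands in for the collection lock
thread_local int locksHeld = 0;  // number of locks held by the current thread
bool initialSyncFlag = false;    // stands in for the flag stored in the document
int durableWaits = 0;            // counts completed durability waits

// RAII guard that records that the current thread holds the lock.
class ScopedCollectionLock {
public:
    ScopedCollectionLock() : _guard(collectionMutex) { ++locksHeld; }
    ~ScopedCollectionLock() { --locksHeld; }

private:
    std::lock_guard<std::mutex> _guard;
};

// Stand-in for waitUntilDurable(): rejects callers that still hold a lock.
void waitUntilDurable() {
    if (locksHeld != 0) {
        throw std::logic_error("waitUntilDurable() called with a lock held");
    }
    ++durableWaits;  // a real implementation would do the I/O here
}

// Writes the flag under the lock, then waits for durability without it.
void writeFlag(bool value) {
    {
        ScopedCollectionLock lock;  // the document write requires the lock
        initialSyncFlag = value;
    }  // lock released here, before any I/O wait
    waitUntilDurable();
}

void setInitialSyncFlag() { writeFlag(true); }
void clearInitialSyncFlag() { writeFlag(false); }

}  // namespace sketch
```

In this sketch, a caller that wraps setInitialSyncFlag() in its own ScopedCollectionLock would trip the check, which is the situation this ticket removes from the dbtests.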



 Comments   
Comment by Githook User [ 25/Oct/19 ]

Author: Gregory Wlodarek (GWlodarek) <gregory.wlodarek@mongodb.com>

Message: SERVER-41464 setInitialSyncFlag() and clearInitialSyncFlag() should not be called while holding locks in StorageTimestampTests
Branch: master
https://github.com/mongodb/mongo/commit/ea85dc042b53aff72b2566ba9bdd0bf6c83f561b

Comment by Siyuan Zhou [ 10/Jun/19 ]

The only violation is in dbtests, where the collection lock is acquired for the whole duration of the tests. Since the collection is only used by the check in assertMinValidDocumentAtTimestamp, it's better to acquire the lock inside assertMinValidDocumentAtTimestamp. We can schedule this in Quick Wins. If the storage team thinks this is urgent and wants to take it over, feel free to do so, since it's pretty straightforward.
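The suggestion above can be sketched as follows, assuming a hypothetical fixture: the class, its members, and the integer timestamps are illustrative and are not taken from the actual dbtests. The point is only that the checking helper takes the lock itself, so the test body holds nothing between calls.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <mutex>

namespace fixture_sketch {

// Hypothetical test fixture; names are illustrative, not the real test code.
class MinValidFixture {
public:
    // Records the flag value as of `ts`, locking only for the write itself.
    void setFlagAt(int ts, bool value) {
        assert(ts >= 0);
        std::lock_guard<std::mutex> lock(_collectionMutex);
        _flagByTimestamp[ts] = value;
    }

    // Returns the flag visible at `ts` (the latest write at or before it),
    // or false if nothing was written by then. The helper acquires the lock
    // itself instead of relying on a lock held for the whole test.
    bool flagAt(int ts) {
        std::lock_guard<std::mutex> lock(_collectionMutex);
        auto it = _flagByTimestamp.upper_bound(ts);
        if (it == _flagByTimestamp.begin()) {
            return false;
        }
        return std::prev(it)->second;
    }

private:
    std::mutex _collectionMutex;           // stands in for the collection lock
    std::map<int, bool> _flagByTimestamp;  // stands in for timestamped writes
};

}  // namespace fixture_sketch
```

Because the lock lives inside the helper, a test can call a flag-setting function and then the check back to back without holding a lock across any durability wait.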

Comment by Siyuan Zhou [ 10/Jun/19 ]

Both setInitialSyncFlag() and clearInitialSyncFlag() acquire and release locks before calling waitUntilDurable(). As we discussed offline, the stack trace from the patch build where the newly added invariant failed would help show where the lock is held unexpectedly, possibly at a higher level.
