[SERVER-35367] Hold locks in fewer callers of waitForAllEarlierOplogWritesToBeVisible() Created: 01/Jun/18 Updated: 29/Oct/23 Resolved: 15/Aug/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 4.0.0-rc1, 4.1.1 |
| Fix Version/s: | 4.0.2, 4.1.2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Spencer Brody (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | SWNA, nyc |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.0, v3.6 |
| Sprint: | Repl 2018-07-30, Repl 2018-08-13, Repl 2018-08-27 |
| Participants: | |
| Case: | (copied to CRM) |
| Linked BF Score: | 36 |
| Description |
|
ReplicationCoordinatorExternalStateImpl::waitForAllEarlierOplogWritesToBeVisible() holds a collection lock on the oplog while doing a blocking wait. This can cause the hang described below:
1. An insert into a replicated collection is performed using insertDocuments(). An optime is generated, but not committed. If another write occurs after this at a later optime, the timestamped write that is not yet committed creates a "hole".
2. A reader using readConcern "atClusterTime" or "afterClusterTime" begins a read. This uses ReplicationCoordinatorExternalStateImpl::waitForAllEarlierOplogWritesToBeVisible() to wait for all uncommitted operations to become committed and visible.
3. A dropCollection command is received on the "local" database and enqueues a DBLock("local", MODE_X).
4. The first insert completes in the storage engine and attempts to write the oplog entry at the generated optime. To do so, it attempts to acquire a DBLock("local", MODE_IX).
The MODE_IX request is blocked behind the pending MODE_X request, which is itself blocked by the reader's lock on the oplog, while the reader is waiting for the insert's oplog entry to become visible, so none of the three operations can proceed.
This method should be redesigned so that a collection lock is not required to be held while waiting for the last oplog entry to become visible. |
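The lock ordering in the steps above can be sketched with a toy model. This is not the server's lock manager: ToyLock, compatible() and writerEventuallyWrites() are hypothetical names invented for illustration. The sketch assumes a strict FIFO queue in which a new request waits behind any pending request, and it assumes the reader's collection lock on the oplog implies an intent lock (modelled as IS) on the "local" database.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Lock modes of a typical hierarchical lock manager.
enum class Mode { IS, IX, S, X };

// Standard multi-granularity compatibility matrix.
bool compatible(Mode a, Mode b) {
    static const bool table[4][4] = {
        // IS     IX     S      X
        {true,  true,  true,  false},   // IS
        {true,  true,  false, false},   // IX
        {true,  false, true,  false},   // S
        {false, false, false, false},   // X
    };
    return table[static_cast<int>(a)][static_cast<int>(b)];
}

struct Request {
    std::string owner;
    Mode mode;
};

// Hypothetical model of a single lock resource with a FIFO wait queue.
class ToyLock {
public:
    // Grants immediately only if nothing is queued and the mode is
    // compatible with every current holder; otherwise the request queues.
    bool acquire(const std::string& owner, Mode mode) {
        if (_pending.empty() && compatibleWithGranted(mode)) {
            _granted.push_back({owner, mode});
            return true;
        }
        _pending.push_back({owner, mode});
        return false;
    }

    // Drops the owner's granted lock, then grants queued requests in order
    // until the head of the queue conflicts with a current holder.
    void release(const std::string& owner) {
        for (auto it = _granted.begin(); it != _granted.end(); ++it) {
            if (it->owner == owner) {
                _granted.erase(it);
                break;
            }
        }
        while (!_pending.empty() && compatibleWithGranted(_pending.front().mode)) {
            _granted.push_back(_pending.front());
            _pending.pop_front();
        }
    }

    bool holds(const std::string& owner) const {
        for (const auto& r : _granted) {
            if (r.owner == owner) return true;
        }
        return false;
    }

private:
    bool compatibleWithGranted(Mode mode) const {
        for (const auto& r : _granted) {
            if (!compatible(r.mode, mode)) return false;
        }
        return true;
    }

    std::vector<Request> _granted;
    std::deque<Request> _pending;
};

// Replays steps 2-4. Returns true if the writer ever obtains its IX lock.
bool writerEventuallyWrites(bool readerHoldsLockWhileWaiting) {
    ToyLock localDb;
    if (readerHoldsLockWhileWaiting) {
        localDb.acquire("reader", Mode::IS);     // step 2: reader waits while locked
    }
    localDb.acquire("dropCollection", Mode::X);  // step 3
    localDb.acquire("writer", Mode::IX);         // step 4
    // dropCollection depends on nothing but the lock, so it finishes if granted.
    if (localDb.holds("dropCollection")) {
        localDb.release("dropCollection");
    }
    // The reader releases only after the writer commits, so it never releases here.
    return localDb.holds("writer");
}
```

In this model, writerEventuallyWrites(true) returns false: the writer stays queued behind dropCollection, which stays queued behind the reader, which is waiting on the writer. With writerEventuallyWrites(false), where the reader waits without holding a lock, dropCollection is granted, finishes, and the writer then acquires its lock.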
| Comments |
| Comment by Githook User [ 16/Aug/18 ] |
|
Author: {'username': 'stbrody', 'email': 'spencer@mongodb.com', 'name': 'Spencer T Brody'} Message: (cherry picked from commit a1b225bcf0e9791b14649df385b3f3f9710a98ab) |
| Comment by Spencer Brody (Inactive) [ 15/Aug/18 ] |
|
Updating the title of this ticket to reflect the work that was actually done. While some callers of waitForAllEarlierOplogWritesToBeVisible no longer hold locks on the oplog, there are still a few that do. We'd like to get rid of all such places, but that is a larger change than we are ready to make at the moment. In the meantime, we hope to address the deadlock described in this ticket via |
| Comment by Spencer Brody (Inactive) [ 07/Aug/18 ] |
|
|
| Comment by Spencer Brody (Inactive) [ 03/Aug/18 ] |
|
My previous fix is incomplete: it doesn't address oplog queries used for replication, which call into waitForAllEarlierOplogWritesToBeVisible while holding locks on the oplog via the normal query execution machinery. |
| Comment by Githook User [ 31/Jul/18 ] |
|
Author: {'name': 'Spencer T Brody', 'email': 'spencer@mongodb.com', 'username': 'stbrody'} Message: |
| Comment by Eric Milkie [ 03/Jul/18 ] |
|
I checked and all the code that participates in this bug is present in 3.6 so I assume it does exist there as well. |
| Comment by Tess Avitabile (Inactive) [ 03/Jul/18 ] |
|
milkie, do you know if this bug exists on 3.6? |