[SERVER-36534] Don't acquire locks on oplog when writing oplog entries Created: 08/Aug/18  Updated: 29/Oct/23  Resolved: 24/Aug/18

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: None
Fix Version/s: 4.0.4, 4.1.3

Type: Task Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Eric Milkie
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Related
related to SERVER-35367 Hold locks in fewer callers of waitFo... Closed
related to SERVER-36883 support non-doc-locking storage engin... Closed
related to SERVER-40498 Writing transaction oplog entries mus... Closed
related to SERVER-36514 Hold lock on oplog as soon as optime ... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.0, v3.6
Sprint: Storage NYC 2018-08-27
Participants:
Case:
Linked BF Score: 36

 Description   

Since the oplog can never be dropped, there's no need to hold an IX lock on the oplog when writing into it.
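Illustratively, the change amounts to the following (a minimal sketch only; OpCtx, CollectionLockIX, and kOplogNs are hypothetical stand-ins, not the actual server code):

#include <string>

struct OpCtx;  // hypothetical stand-in for the server's operation context

// Hypothetical RAII collection lock taken in intent-exclusive (IX) mode.
class CollectionLockIX {
public:
    CollectionLockIX(OpCtx* opCtx, const std::string& ns) { /* acquire IX lock on ns */ }
    ~CollectionLockIX() { /* release the lock */ }
};

static const std::string kOplogNs = "local.oplog.rs";

// Before: oplog writes took an IX collection lock, like any other collection
// write, to guard against a concurrent drop of the collection.
void writeOplogEntry_before(OpCtx* opCtx, const std::string& entry) {
    CollectionLockIX lock(opCtx, kOplogNs);
    // ... insert 'entry' into the oplog ...
}

// After: the oplog can never be dropped, so no collection-level lock is
// needed; the storage-engine insert is performed directly.
void writeOplogEntry_after(OpCtx* opCtx, const std::string& entry) {
    // ... insert 'entry' into the oplog without acquiring a collection lock ...
}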



 Comments   
Comment by Githook User [ 30/Oct/18 ]

Author:

{'name': 'Eric Milkie', 'email': 'milkie@10gen.com', 'username': 'milkie'}

Message: SERVER-36534 don't acquire locks on oplog when writing oplog entries

(cherry picked from commit 5c1a3ec728a71bca81629f99be782ac305a6ad4b)
Branch: v4.0
https://github.com/mongodb/mongo/commit/55aae79567fd79883c8946b89132a3bbb861fe4c

Comment by Githook User [ 24/Aug/18 ]

Author:

{'name': 'Eric Milkie', 'email': 'milkie@10gen.com', 'username': 'milkie'}

Message: SERVER-36534 don't acquire locks on oplog when writing oplog entries
Branch: master
https://github.com/mongodb/mongo/commit/5c1a3ec728a71bca81629f99be782ac305a6ad4b

Comment by Michael Cahill (Inactive) [ 17/Aug/18 ]

I spent some time trying to reproduce this and ran into the problem that calling WT's verify method on the oplog always fails with EBUSY. That's because having a session open prevents the stable timestamp from catching up with the current timestamp. After closing all sessions and waiting for the logical session reaper to run, the stable timestamp does catch up. The verify still doesn't succeed, though; it hits a different EBUSY. I'll chase that some more next week.
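For reference, the verify call in question looks roughly like this (a minimal sketch against the public WiredTiger API; the database home, table URI, and error handling are illustrative, not the actual repro):

#include <cerrno>
#include <cstdio>
#include <wiredtiger.h>

int verify_oplog_table(const char *home, const char *uri) {
    WT_CONNECTION *conn = nullptr;
    WT_SESSION *session = nullptr;
    int ret;

    if ((ret = wiredtiger_open(home, nullptr, nullptr, &conn)) != 0)
        return ret;
    if ((ret = conn->open_session(conn, nullptr, nullptr, &session)) != 0) {
        conn->close(conn, nullptr);
        return ret;
    }

    // verify() needs exclusive access to the underlying data handle; while
    // other sessions/cursors still hold the handle it returns EBUSY, which is
    // the symptom described above.
    ret = session->verify(session, uri, nullptr);
    if (ret == EBUSY)
        std::fprintf(stderr, "verify of %s returned EBUSY (handle in use)\n", uri);

    conn->close(conn, nullptr);
    return ret;
}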

Once we can reproduce the symptom more easily, here is how I'd expect to catch it quickly:

diff --git a/src/third_party/wiredtiger/src/cursor/cur_file.c b/src/third_party/wiredtiger/src/cursor/cur_file.c
index 1c3fcc2949..14a47996ce 100644
--- a/src/third_party/wiredtiger/src/cursor/cur_file.c
+++ b/src/third_party/wiredtiger/src/cursor/cur_file.c
@@ -585,6 +585,8 @@ __curfile_reopen(WT_CURSOR *cursor, bool check_only)
 			WT_ASSERT(session,
 			    dhandle->type == WT_DHANDLE_TYPE_BTREE);
 			cbt->btree = dhandle->handle;
+			WT_ASSERT(session, !F_ISSET(dhandle, WT_DHANDLE_EXCLUSIVE));
+			WT_ASSERT(session, !F_ISSET(cbt->btree, WT_BTREE_SPECIAL_FLAGS));
 			cursor->internal_uri = cbt->btree->dhandle->name;
 			cursor->key_format = cbt->btree->key_format;
 			cursor->value_format = cbt->btree->value_format;

Comment by Eric Milkie [ 15/Aug/18 ]

In testing this, I'm hitting a problem with either cursor caching (in WiredTiger) or verify. I'm continuing to pursue these issues and am hopeful that once they are resolved, we'll be able to push this code change and resolve the flavor of deadlock described in SERVER-35367.

Comment by Spencer Brody (Inactive) [ 08/Aug/18 ]

Per conversation with milkie, I am passing this off to the storage team. The core code change is small, but investigating the build failures that fall out from it requires expertise in the storage subsystem.

Comment by Spencer Brody (Inactive) [ 08/Aug/18 ]

This is another potential fix for the deadlock described in SERVER-36514 and SERVER-35367.
