[SERVER-31899] Improve getTerm() performance Created: 09/Nov/17  Updated: 30/Oct/23  Resolved: 13/Nov/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.6.0-rc4

Type: Improvement
Priority: Major - P3
Reporter: Eric Milkie
Assignee: Eric Milkie
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-31694 17% throughput regression in insert w... Closed
Backwards Compatibility: Fully Compatible
Sprint: Storage 2017-11-13
Participants:
Linked BF Score: 0

 Description   

An insert-heavy workload shows that the biggest bottleneck is the contention on the replication coordinator mutex between calls to getTerm() and calls to setMyLastAppliedOpTimeForward().

In 3.4, these same mutex acquisitions were there, so I'm not sure if this is a new issue. However, I do get a big boost in performance if I comment out the mutex acquisition for getTerm(), so I would like to pursue making this more efficient.



 Comments   
Comment by Githook User [ 13/Nov/17 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-31899 make getTerm() use an Atomic rather than lock the replcoord mutex
Branch: master
https://github.com/mongodb/mongo/commit/9fbb130337619500dfe153bf3fc856d1852cc0ce

Comment by Eric Milkie [ 10/Nov/17 ]

More careful control of when the term gets updated is a bigger project (and might be too difficult to tackle), as acquiring a Global X lock can be disruptive in non-obvious ways. I think I will go the Atomic route first, to see how that performs.

Comment by Eric Milkie [ 10/Nov/17 ]

Ah yes, we could make the _term variable a "GM" synchronization-rule. That could work.

Comment by Andy Schwerin [ 09/Nov/17 ]

Hmmm. The calls to getTerm we're interested in are for optime generation, right? So they only happen when the global lock is held in MODE_IX or MODE_X? Perhaps more careful control of when the term gets updated could allow us to call getTerm when the global lock is held in such a mode without acquiring the repl mutex.

Comment by Eric Milkie [ 09/Nov/17 ]

One idea to try is to use an Atomic for storing the term that is fetched with getTerm(). This value is written at the same time as the _term member variable (or replaces it).
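A minimal sketch of that idea, written with standard C++ rather than the server's own Atomic and mutex types. The class name ReplCoordSketch, the member _termShadow, and the simplified integer optime are hypothetical and used only for illustration; this is not the code from the eventual commit. Writers still update the term under the mutex and mirror it into an atomic, so getTerm() can read without contending with setMyLastAppliedOpTimeForward().

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for the replication coordinator (illustration only).
class ReplCoordSketch {
public:
    // Lock-free read: does not touch _mutex, so it cannot contend with
    // callers of setMyLastAppliedOpTimeForward().
    long long getTerm() const {
        return _termShadow.load(std::memory_order_acquire);
    }

    // Writers still serialize on the mutex and update both copies together.
    // Returns true if the term advanced.
    bool updateTerm(long long newTerm) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (newTerm <= _term) {
            return false;
        }
        _term = newTerm;
        _termShadow.store(newTerm, std::memory_order_release);
        return true;
    }

    // Simplified: the optime is modeled as a plain integer that only moves forward.
    void setMyLastAppliedOpTimeForward(long long opTime) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (opTime > _lastApplied) {
            _lastApplied = opTime;
        }
    }

    long long getMyLastApplied() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _lastApplied;
    }

private:
    mutable std::mutex _mutex;
    long long _term = 0;                    // guarded by _mutex
    long long _lastApplied = 0;             // guarded by _mutex
    std::atomic<long long> _termShadow{0};  // written under _mutex, read lock-free
};
```

The trade-off in this sketch is that a reader may observe the term slightly before or after other mutex-guarded state changes, which is acceptable only if callers need the term value alone and not a consistent snapshot with other fields.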
