[SERVER-45810] Explore removing the WiredTigerOplogManager via refactor Created: 28/Jan/20  Updated: 03/Apr/20  Resolved: 01/Apr/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Dianna Hohensee (Inactive)
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-45025 Remove WiredTigerOplogManager thread Closed
is related to SERVER-47258 Refactor the WiredTigerOplogManager Closed
Sprint: Execution Team 2020-04-06
Participants:

 Description   

If we do not remove the WiredTigerOplogManager, then we should file a ticket to refactor the naming of things – e.g. _oplogJournalThreadLoop -> _updateOplogReadTimestampLoop and triggerJournalFlush -> triggerOplogReadTimestampUpdate (removing the references to "Journal" once the replicate-before-journaling project is complete and the waitUntilDurable call is removed).



 Comments   
Comment by Dianna Hohensee (Inactive) [ 01/Apr/20 ]

Removing the OplogManager thread could cause performance repercussions.

The OplogManager thread updates the oplogReadTimestamp whenever it is signaled by another thread – for example, in _txnClose after we've committed an unordered write on a primary. The OplogManager therefore batches callers; the callers do not wait for the update to occur, they only signal the OplogManager thread that it should perform an update.
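For illustration, here is a minimal, hypothetical sketch of that signal-and-batch pattern. The class and method names and the stubbed all-durable query are placeholders, not the real WiredTigerOplogManager code:

    // Hypothetical sketch of the signal-and-batch pattern described above; the class,
    // method names, and the stubbed all-durable query are placeholders, not the real
    // WiredTigerOplogManager code.
    #include <atomic>
    #include <condition_variable>
    #include <cstdint>
    #include <mutex>
    #include <thread>

    class OplogVisibilityThreadSketch {
    public:
        void start() {
            _thread = std::thread([this] { _run(); });
        }

        void shutdown() {
            {
                std::lock_guard<std::mutex> lk(_mutex);
                _shuttingDown = true;
            }
            _cv.notify_one();
            _thread.join();
        }

        // Callers (e.g. a committing write) only signal; they never wait for the update.
        void triggerOplogReadTimestampUpdate() {
            {
                std::lock_guard<std::mutex> lk(_mutex);
                _updateRequested = true;
            }
            _cv.notify_one();
        }

        std::uint64_t getOplogReadTimestamp() const {
            return _oplogReadTimestamp.load();
        }

    private:
        void _run() {
            std::unique_lock<std::mutex> lk(_mutex);
            while (true) {
                // Any number of trigger calls made while an update is in progress
                // collapse into a single wakeup: this is the batching effect.
                _cv.wait(lk, [this] { return _updateRequested || _shuttingDown; });
                if (_shuttingDown)
                    break;
                _updateRequested = false;
                lk.unlock();
                _oplogReadTimestamp.store(_fetchAllDurableTimestamp());
                lk.lock();
            }
        }

        // Stand-in for querying the storage engine's all-durable timestamp.
        std::uint64_t _fetchAllDurableTimestamp() {
            return ++_fakeClock;
        }

        std::thread _thread;
        std::mutex _mutex;
        std::condition_variable _cv;
        bool _updateRequested = false;
        bool _shuttingDown = false;
        std::atomic<std::uint64_t> _oplogReadTimestamp{0};
        std::uint64_t _fakeClock = 0;
    };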

Eliminating the OplogManager thread would serialize unordered write operations behind updating the oplogReadTimestamp. We could create a barrier: after passing it, a caller A checks whether some other caller already performed the update since caller A started waiting, and leaves early instead of repeating it. However, given our experience with performance in other areas of the code, this has greater odds (though unknown without trying it) of causing a performance decrease. The waitUntilDurable() logic has such a barrier, and after moving waitForWriteConcern callers onto the JournalFlusher thread to batch callers the way the OplogManager does, performance increased: the waitUntilDurable() barrier was not as good as the async batching.
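For comparison, a hypothetical sketch of the barrier alternative, where each caller performs the update inline but leaves early if another caller already refreshed the timestamp after it arrived. Again, all names and the stubbed query are placeholders:

    // Hypothetical sketch of the barrier alternative: the caller performs the update
    // inline, but skips it if another caller already completed an update after this
    // caller arrived. All names and the stubbed query are placeholders.
    #include <cstdint>
    #include <mutex>

    class InlineOplogVisibilitySketch {
    public:
        void updateOplogReadTimestamp() {
            // Record which update "generation" we observed when we arrived.
            std::uint64_t arrivalGen;
            {
                std::lock_guard<std::mutex> lk(_stateMutex);
                arrivalGen = _updateGeneration;
            }

            // Only one caller performs the expensive update at a time; everyone
            // else queues up here, serializing their writes behind the update.
            std::lock_guard<std::mutex> updateLk(_updateMutex);
            {
                std::lock_guard<std::mutex> lk(_stateMutex);
                if (_updateGeneration > arrivalGen) {
                    // Another caller already refreshed the timestamp while we
                    // were waiting; leave early instead of repeating the work.
                    return;
                }
            }

            std::uint64_t ts = _fetchAllDurableTimestamp();  // stand-in for the WT query

            std::lock_guard<std::mutex> lk(_stateMutex);
            _oplogReadTimestamp = ts;
            ++_updateGeneration;
        }

        std::uint64_t getOplogReadTimestamp() {
            std::lock_guard<std::mutex> lk(_stateMutex);
            return _oplogReadTimestamp;
        }

    private:
        std::uint64_t _fetchAllDurableTimestamp() {
            return ++_fakeClock;
        }

        std::mutex _stateMutex;   // guards the timestamp and generation counter
        std::mutex _updateMutex;  // serializes callers performing the update
        std::uint64_t _oplogReadTimestamp = 0;
        std::uint64_t _updateGeneration = 0;
        std::uint64_t _fakeClock = 0;
    };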

-----------------------------------------------------------------------------------------------------------------
My exploration of the OplogManager code, where it is called from, and what it uses leads me to think we would be better off keeping a separate class, even if we were to eliminate the async thread.

As far as the rest of the OplogManager logic goes:

  • Currently the WiredTigerRecoveryUnit and WiredTigerKVEngine both keep a pointer to the OplogManager.
  • The WiredTigerRecordStore holds a pointer to the WiredTigerKVEngine, on which it calls getOplogManager() to access the OplogManager itself.

Without the OplogManager, the following logic would most likely move to the WiredTigerRecoveryUnit (a rough sketch of what that might look like follows the list):

  • static fetchAllDurableValue(WT_CONNECTION* conn)
  • static setOplogReadTimestamp(Timestamp ts)
  • static getOplogReadTimestamp()
  • waitForAllEarlierOplogWritesToBeVisible()
  • static oplogReadTimestamp state
    And the WiredTigerKVEngine would need an oplog WiredTigerRecordStore* instead of a WiredTigerOplogManager*.
    Alternatively, since the WiredTigerKVEngine only needs the OplogManager in order to call fetchAllDurableValue, the WiredTigerKVEngine could reimplement fetchAllDurableValue itself instead of keeping a WiredTigerRecordStore* (in addition to the implementation needed in the WiredTigerRecoveryUnit).
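A minimal sketch of those pieces hosted on the recovery unit, purely to illustrate the shape of the refactor; the Timestamp/WT_CONNECTION stand-ins, the class name, and the member names are placeholders rather than the real MongoDB/WiredTiger declarations:

    // A minimal sketch of the listed pieces migrated onto the recovery unit; the
    // Timestamp/WT_CONNECTION stand-ins, class name, and members are placeholders,
    // not the real MongoDB/WiredTiger declarations.
    #include <atomic>
    #include <cstdint>

    struct WT_CONNECTION;              // opaque stand-in for the WiredTiger handle
    using Timestamp = std::uint64_t;   // stand-in for mongo::Timestamp

    class WiredTigerRecoveryUnitSketch {
    public:
        // Would query the storage engine for the all-durable timestamp.
        static Timestamp fetchAllDurableValue(WT_CONNECTION* conn);

        // Publishes a new visibility point for oplog readers.
        static void setOplogReadTimestamp(Timestamp ts) {
            _oplogReadTimestamp.store(ts);
        }

        static Timestamp getOplogReadTimestamp() {
            return _oplogReadTimestamp.load();
        }

        // Would block the caller until the visibility point covers all earlier
        // oplog writes; left undefined in this sketch.
        void waitForAllEarlierOplogWritesToBeVisible();

    private:
        // The shared oplogReadTimestamp state that currently lives on the OplogManager.
        static std::atomic<Timestamp> _oplogReadTimestamp;
    };

    std::atomic<Timestamp> WiredTigerRecoveryUnitSketch::_oplogReadTimestamp{0};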

Lastly, the WiredTigerRecordStore controls starting and stopping the OplogManager when the oplog collection is created: the WiredTigerKVEngine implements the start/halt functions, which the WiredTigerRecordStore calls to tell the OplogManager to start/halt. The WiredTigerRecordStore only keeps a WiredTigerKVEngine* to access the OplogManager; the KVEngine is used for nothing else.
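For reference, a simplified sketch of that start/halt call chain; class names and signatures are illustrative, not the real implementations:

    // Simplified sketch of the start/halt call chain described above; class names
    // and signatures are illustrative, not the real implementations.
    class OplogManagerSketch {
    public:
        void start() { /* launch the oplog visibility thread */ }
        void halt()  { /* signal and join the thread */ }
    };

    class KVEngineSketch {
    public:
        // The record store drives these; the engine just forwards to the manager.
        void startOplogManager() { _oplogManager.start(); }
        void haltOplogManager()  { _oplogManager.halt(); }

    private:
        OplogManagerSketch _oplogManager;
    };

    class OplogRecordStoreSketch {
    public:
        // Invoked when the oplog collection is created.
        explicit OplogRecordStoreSketch(KVEngineSketch* engine) : _engine(engine) {
            _engine->startOplogManager();
        }
        ~OplogRecordStoreSketch() {
            _engine->haltOplogManager();
        }

    private:
        KVEngineSketch* _engine;  // held only to reach the oplog manager
    };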
