[SERVER-45810] Explore removing the WiredTigerOplogManager via refactor Created: 28/Jan/20 Updated: 03/Apr/20 Resolved: 01/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Dianna Hohensee (Inactive) | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Sprint: | Execution Team 2020-04-06 |
| Participants: | |
| Description |
|
If we do not remove the WiredTigerOplogManager, then we should file a ticket to refactor the naming of things – e.g. _oplogJournalThreadLoop -> _updateOplogReadTimestampLoop, triggerJournalFlush -> triggerOplogReadTimestampUpdate (removing the reference to "Journal" after the replicate-before-journaling project is complete and the waitUntilDurable call is removed). |
| Comments |
| Comment by Dianna Hohensee (Inactive) [ 01/Apr/20 ] |
|
Removing the OplogManager thread could have performance repercussions. The OplogManager thread updates the oplogReadTimestamp whenever it is signaled by another thread – e.g. in _txnClose after we've committed an unordered write on a primary. The OplogManager therefore batches callers; the callers do not wait for the update to occur, they only signal the OplogManager thread that it should do an update. Eliminating the OplogManager thread would serialize unordered write operations behind updating the oplogReadTimestamp.

We could create a barrier, after which a caller A checks whether some other caller has already done the update since caller A started waiting, and leaves early instead of repeating it. However, given our experience with performance in other areas of the code, this is more likely (though unknown without trying it) to cause a performance decrease. The waitUntilDurable() logic has such a barrier, and after moving waitForWriteConcern callers onto the JournalFlusher thread to batch callers the way the OplogManager does, performance increased: the waitUntilDurable() barrier was not as good as the async batching.

-----------------------------------------------------------------------------------------------------------------

As far as the rest of the OplogManager logic goes:
Without the OplogManager, the following logic would most likely move to the WiredTigerRecoveryUnit, given
Lastly, the WiredTigerRecordStore controls starting and stopping the OplogManager when the oplog collection is created; the WiredTigerKVEngine implements the start/halt functions, which the WiredTigerRecordStore calls to tell the OplogManager to start/halt. The WiredTigerRecordStore keeps a WiredTigerKVEngine* only to access the OplogManager; the KVEngine is used for nothing else. |