[SERVER-64856] Explore reusing the caller's WT_SESSION in getLatestOplogTimestamp Created: 24/Mar/22  Updated: 29/Oct/23  Resolved: 06/Jul/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Josef Ahmad Assignee: Yujin Kang Park
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-61116 Audit and add assertions against usin... Backlog
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2022-05-30, Execution Team 2022-06-13, Execution Team 2022-06-27, Execution Team 2022-07-11
Participants:

 Description   

As a spin-off of SERVER-61116, we should investigate the possibility of avoiding a dedicated WT_SESSION in getLatestOplogTimestamp. Determining whether it's possible to reduce the cases where we're required to call getLatestOplogTimestamp may facilitate this task.

Pasting the relevant comments from SERVER-61116:

[...] we've identified a possible way to avoid using a second session in getLatestOplogTimestamp. Because the oplog is a logged table, and because per WT-8601 logged tables are not timestamped, we could potentially reuse the existing recovery unit's session. However, doing so would change some of the current visibility behaviour and include uncommitted writes on the operation context. We should determine whether there are consumers relying on this behaviour; it seems that waiting for write concern could be affected.

[...] daniel.gottlieb ran an experiment to surface call sites of getLatestOplogTimestamp with an open transaction. We should investigate whether we actually need to call getLatestOplogTimestamp for write operations: in principle, writes assign a timestamp to the WUOW, so it should be sufficient for the caller to wait for that timestamp.
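
To make the visibility change described in the first excerpt concrete, here is a minimal toy model in Python. It is not WiredTiger's or the server's API: ToyOplog, ToySession, and both helper functions are hypothetical names invented for illustration. It only assumes that a session sees committed entries plus its own uncommitted writes.

```python
class ToyOplog:
    """Toy stand-in for a logged, non-timestamped table (hypothetical, not WiredTiger)."""

    def __init__(self):
        self.committed = []  # entry timestamps visible to every session


class ToySession:
    """Toy session: sees committed entries plus its own uncommitted writes."""

    def __init__(self, oplog):
        self.oplog = oplog
        self.pending = []  # uncommitted writes, visible only to this session

    def write(self, ts):
        self.pending.append(ts)

    def commit(self):
        self.oplog.committed.extend(self.pending)
        self.pending = []

    def latest_timestamp(self):
        visible = self.oplog.committed + self.pending
        return max(visible, default=None)


def latest_with_dedicated_session(oplog):
    # Models the existing behaviour: a fresh session sees committed entries only.
    return ToySession(oplog).latest_timestamp()


def latest_with_caller_session(session):
    # Models the proposal: reusing the caller's session also exposes
    # that caller's own uncommitted writes.
    return session.latest_timestamp()
```

In this model, a caller with an open transaction gets a later timestamp from its own session than from a dedicated one, which is the behaviour difference any consumer (such as waiting for write concern) would need to tolerate.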



 Comments   
Comment by Githook User [ 05/Jul/22 ]

Author:

{'name': 'Yu Jin Kang Park', 'email': 'yujin.kang@mongodb.com', 'username': 'ykangpark'}

Message: SERVER-64856: Remove use of secondary WT_SESSION in getLatestOplogTimestamp
Branch: master
https://github.com/mongodb/mongo/commit/02dfedb849374159219251422ada6035333e2c3b

Comment by Daniel Gottlieb (Inactive) [ 24/May/22 ]

Yep that sounds accurate.

Comment by Josef Ahmad [ 24/May/22 ]

Thanks Yujin for spotting this; I've updated my comment as it was inaccurate. There are indeed getLatestOplogTimestamp call sites with an open transaction. I had a quick look at the patch and they seem quite frequent with index builds (runCreateIndexesWithCoordinator).

I believe the main goal is to determine whether it's possible for write operations to avoid calling getLatestOplogTimestamp at all, and just wait for write concern at the timestamp that was assigned in the WriteUnitOfWork. If this is possible, then we can effectively remove the conditions for SERVER-61116 to manifest for the getLatestOplogTimestamp API (that is, we would no longer write on one session and then read on a new session). If this is not possible, then we should investigate whether it's safe for getLatestOplogTimestamp to just reuse the caller's WT session, specifically whether it's safe to alter the current behaviour that getLatestOplogTimestamp doesn't include uncommitted writes. daniel.gottlieb@mongodb.com, as this stretches a bit beyond my domain expertise, can you confirm my understanding is accurate?
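
As a rough illustration of the first option (waiting at the timestamp assigned in the WriteUnitOfWork rather than reading the oplog again), here is a toy Python sketch. It is not the server's implementation: ToyClock, ToyWriteUnitOfWork, and is_write_concern_satisfied are hypothetical names, and the sketch assumes timestamps are plain increasing integers.

```python
class ToyClock:
    """Hypothetical monotonic timestamp source."""

    def __init__(self):
        self._now = 0

    def next(self):
        self._now += 1
        return self._now


class ToyWriteUnitOfWork:
    """Hypothetical unit of work that records the timestamp assigned at commit."""

    def __init__(self, clock):
        self._clock = clock
        self.commit_timestamp = None

    def commit(self):
        self.commit_timestamp = self._clock.next()
        return self.commit_timestamp


def is_write_concern_satisfied(commit_ts, member_applied, required):
    # A member acknowledges the write once it has applied up to commit_ts.
    acked = sum(1 for ts in member_applied if ts >= commit_ts)
    return acked >= required
```

The caller already holds the commit timestamp, so in this model it needs no second oplog read and therefore no second session.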

Generated at Thu Feb 08 06:01:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.