[SERVER-30845] Avoid updating the stable timestamp in WiredTiger unnecessarily Created: 25/Aug/17  Updated: 27/Oct/23  Resolved: 28/Nov/17

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: William Schultz (Inactive) Assignee: Backlog - Storage Execution Team
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-30843 Use std::set::upper_bound when calcul... Closed
is related to SERVER-29891 Roll Back to Checkpoint: Call setStab... Closed
Assigned Teams:
Storage Execution
Participants:
Linked BF Score: 0

 Description   

Currently, every time we update our last applied op time, we compute the stable timestamp and then call the StorageInterface::setStableTimestamp method, which in turn calls into WiredTiger to inform it of the new value. This may be somewhat costly, as the storage engine may need to acquire an internal mutex every time it updates this value, so we would like to avoid these calls when possible.

One simple fix would be to keep a cached value of the stable timestamp in the storage engine glue code and check whether the new stable timestamp differs from the cached value. If it is the same, there is no need to make a call to the storage engine.
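
A minimal sketch of that caching idea follows. It is illustrative only and is not the actual MongoDB glue code: the class name StableTimestampCache, the EngineCall callback, and the use of a plain 64-bit integer for the timestamp are all assumptions made for the example. The cache is guarded by its own mutex, since the cached value would itself need synchronization if several threads can update the stable timestamp.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical wrapper that sits between replication and the storage engine.
// All names here are illustrative, not MongoDB's real API.
class StableTimestampCache {
public:
    // Stand-in for the real call into the storage engine.
    using EngineCall = std::function<void(std::uint64_t)>;

    explicit StableTimestampCache(EngineCall engineCall)
        : _engineCall(std::move(engineCall)) {}

    // Forwards newTs to the engine only if it differs from the cached value.
    // Returns true if the engine was called, false if the call was skipped.
    bool setStableTimestamp(std::uint64_t newTs) {
        // The cache needs its own synchronization; this lock is the cost
        // that has to be weighed against the engine's internal mutex.
        std::lock_guard<std::mutex> lock(_mutex);
        if (_hasValue && _cached == newTs) {
            return false;  // Same value as last time: nothing to do.
        }
        _engineCall(newTs);
        _cached = newTs;
        _hasValue = true;
        return true;
    }

private:
    EngineCall _engineCall;
    std::mutex _mutex;
    std::uint64_t _cached = 0;
    bool _hasValue = false;  // Distinguishes "never set" from a cached 0.
};
```

With this wrapper, two consecutive calls with the same timestamp reach the engine only once. Whether that is a net win depends on how the wrapper's lock compares in cost to the engine call it avoids.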



 Comments   
Comment by Ian Whalen (Inactive) [ 28/Nov/17 ]

The Storage team doesn't believe this is necessary until we have more information on the perf characteristics of the new code in 3.8, so we're resolving as Won't Fix.

Comment by Eric Milkie [ 03/Nov/17 ]

I'd like to point out that keeping a cached value in the glue code as proposed would necessarily need synchronization as well, which might be as expensive as making the call into WiredTiger anyway.
It might be easier to make a design change at the source to avoid calling setStableTimestamp unnecessarily.

As an example, if I do one single write on a primary in a two-node replica set, I see that setStableTimestamp gets called twice, with the same value, on both nodes.

Comment by Spencer Brody (Inactive) [ 25/Aug/17 ]

I think it makes more sense to fix this in the storage engine glue code than in replication code, so I'm updating this ticket to reflect that and passing it off to the storage team. Storage can then decide whether the cost of calling into WiredTiger is high enough to motivate this change.
