[SERVER-31040] replace rollback time limit with operation limit Created: 11/Sep/17  Updated: 28/Mar/18  Resolved: 28/Mar/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Judah Schvimer Assignee: Vesselina Ratcheva (Inactive)
Resolution: Won't Fix Votes: 0
Labels: rollback-optional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-31007 Calculate rollback time limit correctly Closed
Sprint: Repl 2018-04-09
Participants:

 Description   

We should limit rollback during common point resolution to a user-configurable operation limit.
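
A minimal sketch of what such a limit could look like, assuming the node walks its own oplog from newest to oldest until it reaches an entry the sync source also has. The names find_common_point, RollbackLimitExceeded, and max_ops are hypothetical and are not taken from the server code; oplog entries are modeled as plain hashable values.

```python
class RollbackLimitExceeded(Exception):
    """Raised when more than max_ops operations would need to be rolled back."""


def find_common_point(local_newest_first, remote_entries, max_ops):
    """Return (common_point, ops_to_undo) or raise if the limit is exceeded.

    local_newest_first: local oplog entries, newest first (hypothetical model).
    remote_entries: entries known to exist on the sync source.
    max_ops: user-configurable cap on the number of operations to roll back.
    """
    remote = set(remote_entries)
    for ops_to_undo, entry in enumerate(local_newest_first):
        if entry in remote:
            # Everything newer than this entry would be rolled back.
            return entry, ops_to_undo
        if ops_to_undo + 1 > max_ops:
            # This entry is not shared, so it would also have to be undone.
            raise RollbackLimitExceeded(
                f"rollback would undo more than {max_ops} operations"
            )
    raise LookupError("no common point found")
```

With this shape the check fails as soon as the limit is crossed, so the node never scans further back than max_ops + 1 entries.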



 Comments   
Comment by Spencer Brody (Inactive) [ 28/Mar/18 ]

Closing in favor of SERVER-31007

Comment by Spencer Brody (Inactive) [ 27/Mar/18 ]

Yeah, an alternative to using an operation limit is to fix how we calculate the time limit. I think the right approach is simply to compare the wall clock times of the top of our oplog and the common point oplog entry, rather than what we currently do, which is to compare the top of our oplog to the top of our sync source's oplog. We should also make the time limit configurable.

This would basically mean closing this ticket and reviving SERVER-31007 instead.

We'd also have to figure out what to do if the time difference winds up negative (due to clock skew). In that case presumably we'd just ignore it and go ahead with the rollback.
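
A rough sketch of the comparison described above, assuming both wall clock times are available as datetime values. The function name should_refuse_rollback and its parameters are hypothetical and are not the server's actual API.

```python
from datetime import datetime, timedelta


def should_refuse_rollback(local_top_wall, common_point_wall, time_limit):
    """Return True if the rollback spans more wall clock time than time_limit.

    local_top_wall: wall clock time of the newest entry in our own oplog.
    common_point_wall: wall clock time of the common point oplog entry.
    time_limit: configurable maximum span, as a timedelta.
    """
    span = local_top_wall - common_point_wall
    if span < timedelta(0):
        # A negative difference can only come from clock skew; ignore it
        # and go ahead with the rollback.
        return False
    return span > time_limit
```

For example, with a 30 minute limit, a common point one hour behind the top of the local oplog would be refused, while one that appears to be a few seconds in the future would be allowed.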

Comment by Andy Schwerin [ 27/Mar/18 ]

Actually, it's to stop using a time limit, but we have wall clock times in oplog entries now separate from the ts field. Why not pick a time limit using that?

Comment by Daniel Pasette (Inactive) [ 26/Mar/18 ]

A reasonably loaded primary can handle on the order of 20-40k write ops/sec (this varies wildly depending on hardware and type of write). 60 seconds of this kind of load may only represent a handful of megabytes of oplog. Is this to put a backstop in? Isn't this ticket all about using a time limit rather than a size-of-oplog/num-ops limit?

Comment by Daniel Gottlieb (Inactive) [ 23/Mar/18 ]

One hardcoded value spencer wanted to make sure was handled when this ticket gets worked on.

Comment by Spencer Brody (Inactive) [ 23/Mar/18 ]

alyson.cabral, would like your input on what a good default value for the number of operations to limit rollbacks to should be.

I propose 1 million
