[DOCS-12887] Investigate changes in SERVER-41261: Use the oplog entry after the common point to calculate rollbackTimeLimitSecs
Created: 12/Jul/19  Updated: 13/Nov/23  Resolved: 20/Aug/19

Status: Closed
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: 4.3.1, 4.2.0-rc3, 4.0.13, Server_Docs_20231030, Server_Docs_20231106, Server_Docs_20231105, Server_Docs_20231113

Type: Task
Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team
Assignee: Kay Kim (Inactive)
Resolution: Fixed
Votes: 0
Labels: docs-backport-done
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-41261 Use the oplog entry after the common ... Closed
Participants:
Days since reply: 4 years, 16 weeks, 5 days ago
Epic Link: DOCS: 4.2 Server/Tools

 Description   

SERVER ticket description:

In Atlas you can pause a cluster, effectively shutting the nodes down for a period of time.

Assume the cluster is paused for more than 24 hours and that all nodes are current, having committed all writes. When the nodes are restarted at the same time, we see two nodes run and two branches of history form. Eventually, one node goes into rollback and hits an fassert because the common point is more than 24 hours behind, even though only 1 or 2 very recent oplog entries are being rolled back. In this case, the common point is from over 24 hours ago, whereas the oplog entry immediately after the common point is less than 5 minutes old.

While we believe we are fixing the problem of two nodes running at the same time via SERVER-40336, it still makes sense to change this calculation in case true network partitions occur after unpausing. Resolving this manually is a headache.

Change Description:

The rollback time limit is no longer calculated as the time between the top of the oplog and the common point. Instead, it is now calculated as the time between the top of the oplog and the first operation after the common point. The time limit is still 24 hours.
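
To make the difference concrete, the following is a minimal Python sketch of the two calculations applied to the paused-cluster scenario from the SERVER ticket. It is illustrative only and is not the server's implementation: the function names, the timestamps, and the strict "greater than" comparison are assumptions made for this example. Only the 24-hour limit and the choice of endpoints come from the change description.

```python
from datetime import datetime, timedelta

# 24 hours, per the change description (hypothetical constant name).
ROLLBACK_TIME_LIMIT_SECS = 24 * 60 * 60


def exceeds_limit_old(top_of_oplog, common_point,
                      limit_secs=ROLLBACK_TIME_LIMIT_SECS):
    """Previous behavior: measure from the common point itself."""
    return (top_of_oplog - common_point).total_seconds() > limit_secs


def exceeds_limit_new(top_of_oplog, first_op_after_common_point,
                      limit_secs=ROLLBACK_TIME_LIMIT_SECS):
    """New behavior: measure from the first operation after the common point."""
    elapsed = (top_of_oplog - first_op_after_common_point).total_seconds()
    return elapsed > limit_secs


# Made-up timestamps: the cluster is paused right after the common point
# and resumes 25 hours later, then writes for about 4 minutes.
common_point = datetime(2019, 7, 10, 12, 0, 0)
first_op_after_common_point = common_point + timedelta(hours=25)
top_of_oplog = first_op_after_common_point + timedelta(minutes=4)

old_result = exceeds_limit_old(top_of_oplog, common_point)
new_result = exceeds_limit_new(top_of_oplog, first_op_after_common_point)
print(old_result, new_result)
```

With these example timestamps, the old calculation measures roughly 25 hours and treats the rollback as over the limit, while the new calculation measures only the 4 minutes of operations that actually need to be rolled back.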

Scope of changes

Impact to Other Docs

MVP (Work and Date)

Resources (Scope or Design Docs, Invision, etc.)



 Comments   
Comment by Githook User [ 18/Oct/19 ]

Author:

{'username': 'kay-kim', 'email': 'kay.kim@10gen.com', 'name': 'Kay Kim'}

Message: DOCS-12887: 4.0.13 rollback time limit calc change
Branch: v4.0
https://github.com/mongodb/docs/commit/2aa0fc12a65786fc0a7489a7e941b20f4be04fbc

Comment by Githook User [ 20/Aug/19 ]

Author:

{'name': 'Kay Kim', 'email': 'kay.kim@10gen.com', 'username': 'kay-kim'}

Message: DOCS-12887: tweak rollback calc for backport to 4.0
Branch: master
https://github.com/mongodb/docs/commit/6dacd0b313a12a7d489a0c9faff464ec1d765c64

Comment by Githook User [ 20/Aug/19 ]

Author:

{'name': 'Kay Kim', 'email': 'kay.kim@10gen.com', 'username': 'kay-kim'}

Message: DOCS-12887: 4.0.13 rollback time limit calc change
Branch: v4.0.13
https://github.com/mongodb/docs/commit/949d783e420e36ec2312d3d8405b745174cc436d

Comment by Kay Kim (Inactive) [ 19/Aug/19 ]

reopening for backport

Comment by Githook User [ 07/Aug/19 ]

Author:

{'name': 'Kay Kim', 'username': 'kay-kim', 'email': 'kay.kim@10gen.com'}

Message: DOCS-12887: 4.2 rollback time limit calc change
Branch: master
https://github.com/mongodb/docs/commit/735f74cbf1455bb616f862154440ed7162688fa3

Generated at Thu Feb 08 08:06:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.