Type: Question
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.9.0-alpha4
Component/s: Storage
Labels:
None

Assigned Teams:

Replication
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Running the eMRCf_runner.sh tests with more than 7 growth iterations and enableMajorityReadConcern set to true results in a SIGABRT when shutting down the primary.

The test deliberately shuts down the only secondary in a PSA replica set with enableMajorityReadConcern set to true, then runs a large, update-heavy workload (10 growth phases involve roughly 6,000,000 updates).

In this scenario, the oplog recovery phase takes a significant amount of time (~108 minutes):


{"t":{"$date":"2021-01-11T02:42:11.819+00:00"},"s":"I",  "c":"REPL",     "id":21545,   "ctx":"initandlisten","msg":"Starting recovery oplog application at the stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1610324416,"i":1}}}}

...

{"t":{"$date":"2021-01-11T04:30:53.247+00:00"},"s":"I",  "c":"REPL",     "id":21536,   "ctx":"initandlisten","msg":"Completed oplog application for recovery","attr":{"numOpsApplied":114391580,"numBatches":22879,"applyThroughOpTime":{"ts":{"$timestamp":{"t":1610331306,"i":2}},"t":2}}}

Given that this is a PSA configuration, the replica set will not be available during this recovery. Is this amount of time expected for this case?
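For reference, the recovery duration and throughput implied by the two log lines above can be worked out with a short script. This is only a sketch: the helper name `recovery_stats` and the returned field names are made up for illustration, and it assumes the log lines are the structured JSON shown in this ticket.

```python
import json
from datetime import datetime

# The two log lines quoted in this ticket, copied verbatim.
START_LINE = '{"t":{"$date":"2021-01-11T02:42:11.819+00:00"},"s":"I",  "c":"REPL",     "id":21545,   "ctx":"initandlisten","msg":"Starting recovery oplog application at the stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1610324416,"i":1}}}}'
END_LINE = '{"t":{"$date":"2021-01-11T04:30:53.247+00:00"},"s":"I",  "c":"REPL",     "id":21536,   "ctx":"initandlisten","msg":"Completed oplog application for recovery","attr":{"numOpsApplied":114391580,"numBatches":22879,"applyThroughOpTime":{"ts":{"$timestamp":{"t":1610331306,"i":2}},"t":2}}}'


def recovery_stats(start_line: str, end_line: str) -> dict:
    """Hypothetical helper: derive duration and throughput from the two log lines."""
    start = json.loads(start_line)
    end = json.loads(end_line)

    # Wall-clock time between "Starting recovery oplog application" (id 21545)
    # and "Completed oplog application for recovery" (id 21536).
    t0 = datetime.fromisoformat(start["t"]["$date"])
    t1 = datetime.fromisoformat(end["t"]["$date"])
    seconds = (t1 - t0).total_seconds()

    # Span of oplog covered, from the stable timestamp to applyThroughOpTime
    # (both are seconds since the Unix epoch).
    stable_ts = start["attr"]["stableTimestamp"]["$timestamp"]["t"]
    through_ts = end["attr"]["applyThroughOpTime"]["ts"]["$timestamp"]["t"]

    ops = end["attr"]["numOpsApplied"]
    batches = end["attr"]["numBatches"]
    return {
        "minutes": seconds / 60,
        "ops_per_second": ops / seconds,
        "ops_per_batch": ops / batches,
        "oplog_window_seconds": through_ts - stable_ts,
    }


if __name__ == "__main__":
    for name, value in recovery_stats(START_LINE, END_LINE).items():
        print(f"{name}: {value:.1f}")
```

On these two lines the script reports roughly 108.7 minutes of recovery, an apply rate of about 17,500 operations per second, and an oplog window of 6,890 seconds between the stable timestamp and the apply-through optime.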

Related to: WT-7079 Long recovery time after unclean shutdown with majority and oldest timestamps held back (Closed)

Assignee: [DO NOT USE] Backlog - Replication Team
Reporter: James O'Leary
Participants: [DO NOT USE] Backlog - Replication Team, Bruce Lucas, James O'Leary
Votes: 0
Watchers: 12

Created: Feb 10 2021 11:48:27 AM UTC
Updated: Aug 31 2023 11:21:58 PM UTC
