Type: Question
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.9.0-alpha4
Component/s: Storage
Labels: None

Assigned Teams: Replication

Running the eMRCf_runner.sh tests with more than 7 growth iterations and enableMajorityReadConcern set to true results in a SIGABRT when shutting down the primary.

The test deliberately shuts down the only secondary in a PSA (primary-secondary-arbiter) replica set with enableMajorityReadConcern set to true, then runs a large, update-heavy workload (10 growth phases amount to roughly 6,000,000 updates).

In this scenario, the oplog recovery phase takes a significant amount of time (~108 minutes):


{"t":{"$date":"2021-01-11T02:42:11.819+00:00"},"s":"I",  "c":"REPL",     "id":21545,   "ctx":"initandlisten","msg":"Starting recovery oplog application at the stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1610324416,"i":1}}}}

...

{"t":{"$date":"2021-01-11T04:30:53.247+00:00"},"s":"I",  "c":"REPL",     "id":21536,   "ctx":"initandlisten","msg":"Completed oplog application for recovery","attr":{"numOpsApplied":114391580,"numBatches":22879,"applyThroughOpTime":{"ts":{"$timestamp":{"t":1610331306,"i":2}},"t":2}}}

Given that this is a PSA configuration, the replica set will not be available during this recovery. Is this amount of time expected for this case?

Related to: WT-7079 Long recovery time after unclean shutdown with majority and oldest timestamps held back (Closed)

Assignee: [DO NOT USE] Backlog - Replication Team

Reporter: James O'Leary

Participants: [DO NOT USE] Backlog - Replication Team, Bruce Lucas, James O'Leary

Votes: 0

Watchers: 12

Created: Feb 10 2021 11:48:27 AM UTC

Updated: Aug 31 2023 11:21:58 PM UTC
