Type: Bug
Resolution: Gone away
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Replication
Labels: None
Assigned Teams: Replication
Operating System: ALL
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

While working on SERVER-51387, which adds an assertion that the stable timestamp is never set higher than the all durable timestamp, I ran into a problem with aborting in-progress transactions on step-up with eMRC=off on this test.

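While aborting the in-progress transactions, the stable timestamp is consistently set one increment higher than the all durable timestamp. In the log below, the stable timestamp is (t: 1604518513, i: 4) while the all durable timestamp is (t: 1604518513, i: 3):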

TXN      [OplogApplier-0] Aborting in-progress transactions on stepup.
TXN      [OplogApplier-0] New transaction started {"txnNumber":0,"lsid":{"uuid":{"$uuid":"b7c9b1d4-1883-46c5-b73d-d235c3d41623"}}}
TXN      [OplogApplier-0] Aborting transaction {"sessionId":{"id":{"$uuid":"b7c9b1d4-1883-46c5-b73d-d235c3d41623"},"uid":{"$binary":{"base64":"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=","subType":"0"}}},"txnNumber":0}
TXN      [OplogApplier-0] transaction {"parameters":{"lsid":{"id":{"$uuid":"b7c9b1d4-1883-46c5-b73d-d235c3d41623"},"uid":{"$binary":{"base64":"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=","subType":"0"}}},"txnNumber":0,"autocommit":false,"readConcern":{"provenance":"clientSupplied"}},"readTimestamp":"Timestamp(0, 0)","terminationCause":"aborted","timeActiveMicros":0,"timeInactiveMicros":379,"numYields":0,"locks":{"ParallelBatchWriterMode":{"acquireCount":{"r":3}},"ReplicationStateTransition":{"acquireCount":{"w":4,"W":1}},"Global":{"acquireCount":{"r":1,"w":3}},"Database":{"acquireCount":{"r":1,"W":1}},"Collection":{"acquireCount":{"r":1}},"Mutex":{"acquireCount":{"r":2}}},"storage":{},"wasPrepared":false,"durationMillis":0}
REPL     [OplogApplier-0] Setting replication's stable optime {"stableOpTime":{"ts":{"$timestamp":{"t":1604518513,"i":4}},"t":2}}
STORAGE  [OplogApplier-0] The stable timestamp was greater than the all durable timestamp {"stableTimestamp":{"$timestamp":{"t":1604518513,"i":4}},"allDurableTimestamp":{"$timestamp":{"t":1604518513,"i":3}}}
-        [OplogApplier-0] Fatal assertion {"msgid":5138700,"file":"src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp","line":1925}

I see that when we recalculate the stable timestamp, we only take the all durable timestamp into consideration if the node canAcceptNonLocalWrites(). However, because the node is still stepping up, canAcceptNonLocalWrites() has not been updated yet. That flag is only updated once the state transition is complete.

I believe this works for eMRC=on today because we use the commit point instead of the last applied, and in my testing the commit point was less than the all durable timestamp.
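
The behavior described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the server's actual code: `Timestamp`, `recalculate_stable_timestamp`, and `violates_invariant` are hypothetical names, "taking the all durable timestamp into consideration" is assumed to mean capping the candidate at all durable, and the candidate is assumed to be the last applied value with eMRC=off (taken here to be the stable optime from the log). The timestamp values are the ones from the log.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    """Simplified stand-in for a replication timestamp: (seconds, increment)."""
    t: int
    i: int


def recalculate_stable_timestamp(candidate, all_durable, can_accept_non_local_writes):
    """Hypothetical model of the recalculation described in this ticket.

    Assumption: all_durable only caps the candidate when the node can accept
    non-local writes, i.e. after the step-up state transition has completed.
    """
    if can_accept_non_local_writes:
        return min(candidate, all_durable)
    return candidate


def violates_invariant(stable, all_durable):
    """True when the stable timestamp is ahead of the all durable timestamp."""
    return stable > all_durable


# Values taken from the log: stable optime i=4, all durable i=3.
last_applied = Timestamp(1604518513, 4)
all_durable = Timestamp(1604518513, 3)

# During step-up the flag is still false, so all_durable is ignored.
stable_during_stepup = recalculate_stable_timestamp(
    last_applied, all_durable, can_accept_non_local_writes=False
)

# Once the transition completes, the candidate is capped at all_durable.
stable_after_stepup = recalculate_stable_timestamp(
    last_applied, all_durable, can_accept_non_local_writes=True
)

print(violates_invariant(stable_during_stepup, all_durable))
print(violates_invariant(stable_after_stepup, all_durable))
```

In this model, only the call made while the flag is still false produces a stable timestamp ahead of all durable, which matches the window in which the fatal assertion fires. With eMRC=on the candidate would be the commit point instead, which in the reporter's testing was below all durable, so the same model would not trip the check.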

Related to:

SERVER-52956 Add storage debug method to dump system-wide RecoveryUnit/transaction state (Closed)

SERVER-51387 Assert that the stable timestamp is never set higher than the WT all_durable timestamp (Closed)

Assignee: [DO NOT USE] Backlog - Replication Team
Reporter: Gregory Wlodarek
Participants: [DO NOT USE] Backlog - Replication Team, Gregory Wlodarek
Votes: 0
Watchers: 12

Created: Nov 04 2020 07:37:47 PM UTC
Updated: Sep 10 2024 06:30:32 PM UTC
Resolved: Sep 10 2024 06:30:32 PM UTC
