Type: Bug
Resolution: Fixed
Priority: Critical - P2
Fix Version/s: WT11.0.0, 4.4.14, 5.3.0-rc4, 6.0.0-rc0, 5.0.8
Affects Version/s: None
Component/s: None
Labels:
- bug-classification-activity-phase-2
- group-b

Sprint: Storage - Ra 2022-03-21
Story Points: 5
Backport Requested: v5.3, v5.0, v4.4

Issue Status as of May 2, 2022

ISSUE DESCRIPTION AND IMPACT
This issue in MongoDB 4.4.10 to 4.4.13 and 5.0.4 to 5.0.7 may cause replication to stall on secondary replica set members in a sharded cluster handling cross-shard transactions.

The bug is triggered when WiredTiger erroneously returns a write conflict when deciding if an update to a record is allowed. If MongoDB decides to retry the operation that caused the conflict in WiredTiger, it will enter an indefinite retry loop, and oplog application will stall on secondary nodes.

DIAGNOSIS
A MongoDB cluster may be affected by this bug if:

- the cluster is sharded
- the application uses cross-shard transactions
- the cluster is using versions 4.4.10 to 4.4.13 or 5.0.4 to 5.0.7 on secondary nodes
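
The version condition above can be checked mechanically. The following is a minimal sketch, assuming version strings of the form `major.minor.patch` with an optional suffix such as `-rc4`; the names `AFFECTED_RANGES`, `parse_version`, and `is_affected` are hypothetical and not part of any MongoDB tooling. The sharding and cross-shard transaction conditions still have to be confirmed separately.

```python
from typing import Tuple

# Affected ranges as stated in this issue (inclusive on both ends).
AFFECTED_RANGES = [
    ((4, 4, 10), (4, 4, 13)),
    ((5, 0, 4), (5, 0, 7)),
]


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch', ignoring any suffix such as '-rc4'."""
    core = version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch)


def is_affected(version: str) -> bool:
    """Return True if the version falls inside one of the affected ranges."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)
```

For example, `is_affected("4.4.12")` is true while `is_affected("4.4.14")` and `is_affected("5.0.8")` are false. Because the suffix is ignored, a pre-release build is treated the same as its base version, which is an assumption of this sketch.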

If the bug is triggered, the cluster's secondary nodes will experience indefinite growth in replication lag.
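
One way to watch for this symptom is to compare each secondary's last applied optime with the primary's. The sketch below assumes a document shaped like the output of the `replSetGetStatus` command (a `members` array whose entries carry `name`, `stateStr`, and `optimeDate`); the function name `secondary_lag` is hypothetical. It is an illustration, not an official diagnostic.

```python
from datetime import datetime, timedelta
from typing import Dict


def secondary_lag(status: dict) -> Dict[str, timedelta]:
    """Map each SECONDARY member name to its lag behind the PRIMARY.

    Returns an empty dict when no PRIMARY is present in the status document.
    """
    members = status.get("members", [])
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        return {}
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }
```

With a driver such as PyMongo, the status document could be obtained via `client.admin.command("replSetGetStatus")` on each shard's replica set. Sampling the result periodically and seeing a secondary's lag increase without bound while the primary continues to take writes would be consistent with the stall described here, although lag alone does not prove this bug is the cause.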

REMEDIATION AND WORKAROUNDS
Secondary nodes on which replication has stalled can be restarted to resume replication.

This issue is fixed in MongoDB 4.4.14 and 5.0.8.

Original Description

While implementing FLCS-related changes in WT-8019, a change was made to stop checking whether the insert list on the cbt was null before checking against the on-disk time window. This change may be correct for FLCS but is not correct for row-store.

This is only a problem if cbt->slot is neither unset nor UINT32_MAX. An alternative solution might be to clear the cbt slot on an insert list row search; however, that is still open for discussion.

is caused by

WT-8019 VLCS snapshot-isolation search mismatch

Closed

is duplicated by

WT-8440 Investigate out of order timestamp assertion fires

Closed

is related to

SERVER-73972 mongodb 4.4 secondary replication hang

Closed

Assignee: Luke Pearson
Reporter: Luke Pearson
Votes: 0
Watchers: 18

Created: Mar 09 2022 01:18:19 AM UTC
Updated: Oct 29 2023 04:40:05 PM UTC
Resolved: Mar 11 2022 01:01:36 AM UTC
