Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: car-investigation
Assigned Teams: Catalog and Routing
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: CAR Team 2024-03-04, CAR Team 2024-03-18
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Suppose a shard is attempting to commit a DDL operation. Before committing, it may refresh its data from the config shard to check whether a previous node already performed the operation on the config shard and then failed.

This behavior is problematic if we rely on the gossiped Vector Clock: the check above could wrongly conclude that the operation was never committed, and we would end up performing the same operation twice.

This can occur in the following scenario:

Shard S1 has three nodes.
Config Shard CS has three nodes.
S1's Primary commits the DDL operation on CS with majority writeConcern and performs a stepdown before it persists the new vector clock.
S1's newly elected primary still has the previous Vector Clock.
S1's new primary refreshes its catalog metadata by contacting a stale CS node that is still observing the old Vector Clock and is at a stale majority timestamp. This can happen because we do not use a PrimaryOnly readPreference for this read.
S1's new primary fails the check because, from its perspective, we're still in the old pre-commit world.
S1's new primary then re-commits the DDL operation.
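
The scenario above can be reproduced with a small, self-contained simulation. This is only an illustrative sketch, not the server's code: every name in it (ConfigNode, ConfigShard, already_committed, run_new_primary, simulate, and the integer config_time standing in for the Vector Clock) is hypothetical, and replication is reduced to one up-to-date CS node and one lagging CS node.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigNode:
    """One config shard (CS) node. Hypothetical model, not a server type."""
    applied_ops: set = field(default_factory=set)
    config_time: int = 0  # stand-in for this node's view of the Vector Clock


@dataclass
class ConfigShard:
    primary: ConfigNode = field(default_factory=ConfigNode)
    stale_secondary: ConfigNode = field(default_factory=ConfigNode)
    commit_count: int = 0  # how many times the DDL commit was applied

    def commit(self, op_id: str) -> int:
        """Apply the commit on the primary only; the stale secondary lags."""
        self.primary.applied_ops.add(op_id)
        self.primary.config_time += 1
        self.commit_count += 1
        return self.primary.config_time


def already_committed(node: ConfigNode, op_id: str, known_time: int) -> bool:
    """Pre-commit check: a read that only waits for `known_time`."""
    assert node.config_time >= known_time, "node cannot serve this read yet"
    return op_id in node.applied_ops


def run_new_primary(cs: ConfigShard, read_from: ConfigNode, known_time: int) -> None:
    """S1's new primary: refresh from CS, then commit if nothing is found."""
    if not already_committed(read_from, "ddl-1", known_time):
        cs.commit("ddl-1")


def simulate(read_from_primary: bool) -> int:
    cs = ConfigShard()
    # The Vector Clock value that S1 had persisted before the commit.
    old_time = cs.primary.config_time
    # S1's old primary commits on CS, then steps down before persisting
    # the new time, so the returned value is lost.
    cs.commit("ddl-1")
    node = cs.primary if read_from_primary else cs.stale_secondary
    run_new_primary(cs, node, known_time=old_time)
    return cs.commit_count


if __name__ == "__main__":
    print(simulate(read_from_primary=False))  # 2: the operation is re-committed
    print(simulate(read_from_primary=True))   # 1: the earlier commit is detected
```

In this model the stale node satisfies the read because the new primary only asks for the old time, so the check finds nothing and the commit count reaches 2. Reading from the CS primary is used here only to contrast the two outcomes; it is not presented as the fix chosen for this ticket.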

is related to: SERVER-87977 Add the explicit replay protection to the commit phase of the sharding ConvertToCappedCoordinator (Closed)

Assignee: Paolo Polato
Reporter: Jordi Olivares Provencio
Participants: Jordi Olivares Provencio, Paolo Polato
Votes: 0
Watchers: 9

Created: Feb 13 2024 04:02:24 PM UTC
Updated: Mar 14 2024 03:05:24 PM UTC
Resolved: Mar 14 2024 03:05:23 PM UTC
Confidence Status Last Update: 04/Mar/24 7:29 AM
