Type: Bug
Resolution: Works as Designed
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.0.9
Component/s: Sharding
Labels: sharding-wfbf-day
Assigned Teams: Sharding
Operating System: ALL
Sprint: Sharding 2019-07-01, Sharding 2019-07-15, Sharding 2019-09-09
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

On a config server, ShardRegistry::reload() waits for a majority read on the local shard. If the reload coincides with LogicalSessionsCache::refresh(), which performs batch writes, the two can deadlock: while refreshing collectionRoutingInfo, the refresh calls ShardRegistry::getShard(), which can join the in-progress reload().
The related stack traces are in BF-12772.

Suggested Fix

I propose checking in ReplicationCoordinatorImpl::waitUntilOpTimeForRead whether the secondaries are up or down. If they are down, the wait should behave similarly to the case where the _isShutdown flag is set.
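
The idea can be illustrated with a minimal, self-contained sketch. This is not the actual ReplicationCoordinatorImpl code: OpTimeWaiter, WaitResult, setMajorityReachable and the other names below are hypothetical, optimes are modeled as plain integers, and "secondaries are down" is assumed to mean that a majority is unreachable. The only point shown is that the wait predicate treats an unreachable majority the same way it treats the shutdown flag, so the waiter returns instead of blocking.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical outcome codes for illustration; not MongoDB's Status codes.
enum class WaitResult { kReached, kShutdown, kMajorityUnavailable, kTimedOut };

// Hypothetical stand-in for the state a majority-read waiter observes.
class OpTimeWaiter {
public:
    // Blocks until the committed optime reaches `target`. Returns early if
    // shutdown is flagged or if a majority is marked unreachable.
    WaitResult waitUntil(long long target, std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lk(_mutex);
        const bool satisfied = _cv.wait_for(lk, timeout, [&] {
            return _isShutdown || !_majorityReachable || _committed >= target;
        });
        if (!satisfied) return WaitResult::kTimedOut;
        if (_isShutdown) return WaitResult::kShutdown;
        if (_committed >= target) return WaitResult::kReached;
        // Same early-exit path as shutdown, with a distinct result code.
        return WaitResult::kMajorityUnavailable;
    }

    // Records a new committed optime and wakes all waiters.
    void advanceCommitted(long long opTime) {
        { std::lock_guard<std::mutex> lk(_mutex); _committed = opTime; }
        _cv.notify_all();
    }

    // Marks whether a majority of members is currently reachable.
    void setMajorityReachable(bool reachable) {
        { std::lock_guard<std::mutex> lk(_mutex); _majorityReachable = reachable; }
        _cv.notify_all();
    }

    // Flags shutdown and wakes all waiters.
    void shutdown() {
        { std::lock_guard<std::mutex> lk(_mutex); _isShutdown = true; }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    long long _committed = 0;
    bool _majorityReachable = true;
    bool _isShutdown = false;
};
```

In this sketch a caller that receives kMajorityUnavailable can release whatever it holds and retry later, rather than keeping other threads joined on a wait that cannot complete.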

Assignee: [DO NOT USE] Backlog - Sharding Team
Reporter: Misha Tyulenev (Inactive)
Participants: [DO NOT USE] Backlog - Sharding Team, Kaloian Manassiev, Matthew Saltz, Misha Tyulenev
Votes: 0
Watchers: 6

Created: May 17 2019 06:44:54 PM UTC
Updated: Oct 27 2023 01:53:13 PM UTC
Resolved: Sep 05 2019 01:31:55 PM UTC
