[SERVER-43652] Secondary reads right after change shard key can stall on refresh Created: 26/Sep/19  Updated: 06/Dec/22  Resolved: 04/Oct/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.2.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-43217 Secondaries can hang refreshing metad... Closed
Assigned Teams:
Sharding
Operating System: ALL
Participants:
Linked BF Score: 8

Description

Sequence:
1. Shard the collection on the primary; the collection version is now (1,0|e1).
2. A metadata refresh starts and inserts { refreshing: true } into config.cache.collections.
3. The metadata refresh finishes and updates config.cache.collections with { refreshing: false, lastRefreshedCollectionVersion: (1,0) }.
4. Change the shard key on the same collection; the version is now (1,0|e2).
5. A metadata refresh starts and updates config.cache.collections with { refreshing: true }.
6. The metadata refresh finishes and updates config.cache.collections with { refreshing: false, lastRefreshedCollectionVersion: (1,0) }. However, because lastRefreshedCollectionVersion did not change, the oplog entry does not contain it.

So, if a secondary refresh happens between steps 5 and 6, it blocks on a notification, waiting for refreshing to become false. However, the condition that triggers the notification requires lastRefreshedCollectionVersion to be present in the oplog entry, so the update from step 6 is missed. The secondary then stays blocked until a later refresh bumps the version higher.
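
The sequence above can be sketched with a small Python model. This is only an illustration, not the server's code: the helper names (oplog_set_delta, secondary_would_notify, replay) are hypothetical, and the model assumes that an update's oplog entry carries only the fields whose values changed and that the secondary wakes waiters only when the entry sets refreshing to false and also includes lastRefreshedCollectionVersion, as the description states.

```python
def oplog_set_delta(before, after):
    """Return only the fields whose values changed (assumed delta-style oplog entry)."""
    return {k: v for k, v in after.items() if before.get(k) != v}


def secondary_would_notify(set_fields):
    """Hypothetical wake-up condition described in this ticket:
    refreshing is set to false AND lastRefreshedCollectionVersion is present."""
    return (
        set_fields.get("refreshing") is False
        and "lastRefreshedCollectionVersion" in set_fields
    )


def replay(updates):
    """Apply each write to a model of the config.cache.collections document and
    record the resulting oplog delta and whether a waiter would be notified."""
    doc = {}
    results = []
    for update in updates:
        new_doc = {**doc, **update}
        delta = oplog_set_delta(doc, new_doc)
        results.append((delta, secondary_would_notify(delta)))
        doc = new_doc
    return results


# Writes from steps 2, 3, 5 and 6 of the sequence.
steps = [
    {"refreshing": True},
    {"refreshing": False, "lastRefreshedCollectionVersion": (1, 0)},
    {"refreshing": True},
    {"refreshing": False, "lastRefreshedCollectionVersion": (1, 0)},
]

results = replay(steps)
for delta, notified in results:
    print(delta, "notify" if notified else "no notify")
```

Under these assumptions, the write from step 3 produces a notification, while the write from step 6 yields a delta containing only refreshing, so a secondary that started waiting after step 5 is never woken.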


Generated at Thu Feb 08 05:03:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.