Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 7.3.0-rc0
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Catalog and Routing
Backwards Compatibility: Fully Compatible
Sprint: CAR Team 2023-12-25
Linked BF Score: 14
Story Points: 1
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

`ShardServerCatalogCacheLoader::getChunksSince` can throw StaleConfig under some interleavings between reading the cache and the background thread that persists the materialized cache. In practice, the CatalogCache handles this by retrying, so it is harmless.
However, this race can cause failures in the shard_server_catalog_cache_loader_test unit test (e.g., here). We can address this by making the test expect this failure and retry. Alternatively, we could make ShardServerCatalogCacheLoader itself retry.
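
As a rough illustration of the "expect and retry" option, here is a minimal standalone sketch. It does not use the MongoDB codebase: `TransientLoaderError` and `retryOnTransientError` are hypothetical names made up for this example, and the real test would catch the server's own error type instead.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the transient error the loader can raise.
// The real server uses its own exception/status types.
struct TransientLoaderError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Calls `op` until it succeeds or `maxAttempts` calls have failed with
// TransientLoaderError. Any other exception propagates immediately, and the
// last transient error is rethrown once the attempts are exhausted.
template <typename F>
auto retryOnTransientError(F op, int maxAttempts) -> decltype(op()) {
    assert(maxAttempts > 0);
    for (int attempt = 1;; ++attempt) {
        try {
            return op();
        } catch (const TransientLoaderError&) {
            if (attempt == maxAttempts) {
                throw;  // give up: surface the failure to the caller
            }
            // Otherwise fall through and try again.
        }
    }
}
```

A test would wrap the call that can race, for example `retryOnTransientError([&] { return loadChunks(); }, 3)`, where `loadChunks` is again a placeholder. Bounding the number of attempts keeps a genuinely broken loader from hanging the test.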

The interleaving that can cause this is:
1. The ShardServerCatalogCacheLoader (SSCCL) discovers the new epoch.
2. It schedules an asynchronous task to update the persisted metadata.
3. It calls `_getLoaderMetadata`, which calls `getIncompletePersistedMetadataSinceVersion`, which calls `getPersistedMetadataSinceVersion`, which finally calls `readShardChunks`. `readShardChunks` reads from the config.cache.xxxx collection.
4. Concurrently with the read in (3), the task scheduled in (2) drops the config.cache.xxxx collection (because the epoch has changed).
5. The read started in (3) yields, and on restore it discovers that the collection no longer exists, so it fails with QueryPlanKilled.
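
The interleaving can be reproduced deterministically in a toy model. The sketch below is not MongoDB code: `ToyCollection`, `PlanKilled`, and `readAllWithYields` are hypothetical names, and the "yield" is modeled as a callback invoked between documents so that the concurrent drop can be injected at a known point rather than relying on thread timing.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the QueryPlanKilled failure.
struct PlanKilled : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Toy model of the persisted config.cache.xxxx collection.
struct ToyCollection {
    std::vector<int> docs;
    bool dropped = false;
};

// Reads every document, "yielding" after each one by calling `onYield`.
// On restore (right after the callback returns) the reader re-checks that
// the collection still exists and fails if it was dropped in the meantime.
std::vector<int> readAllWithYields(const ToyCollection& coll,
                                   const std::function<void()>& onYield) {
    std::vector<int> out;
    for (std::size_t i = 0; i < coll.docs.size(); ++i) {
        out.push_back(coll.docs[i]);
        onYield();  // yield point: other work may run here
        if (coll.dropped) {
            throw PlanKilled("collection dropped while the read was yielded");
        }
    }
    return out;
}
```

Passing a no-op callback models an undisturbed read. Passing a callback that sets `dropped` and clears `docs` models step 4 running while the read from step 3 is yielded, which makes the read fail as in step 5.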

Related to: SERVER-86013 Fix retry for getChunksSince in shard_server_catalog_cache_loader_test (Closed)

Assignee: David Dominguez Sal (Inactive)
Reporter: Jordi Serra Torrens
Participants: David Dominguez Sal, Githook User, Jordi Serra Torrens
Votes: 0
Watchers: 3

Created: Nov 22 2023 01:19:32 PM UTC
Updated: Jan 31 2024 11:17:58 AM UTC
Resolved: Dec 21 2023 03:31:16 PM UTC
