Details
Bug
Resolution: Fixed
Major - P3
5.1.0
None
None
Fully Compatible
Sharding EMEA 2021-10-04
127
Description
As a result of the removal of ComparableDatabaseVersion::_uuidDisambiguatingSequenceNum under SERVER-59177, the catalog cache can currently end up in an infinite loop, because the timeInStore of a cached entry is considered greater than the time in store associated with a non-existent database.
Unit test to reproduce the issue (by sergi.mateo-bellido):
TEST_F(CatalogCacheTest, GetDatabaseDropLoop) {
    const auto dbName = "testDB";
    const auto dbVersion = DatabaseVersion(UUID::gen(), Timestamp());

    _catalogCacheLoader->setDatabaseRefreshReturnValue(
        DatabaseType(dbName, kShards[0], true, dbVersion));

    // The CatalogCache doesn't have any valid info about this DB and finds a new DatabaseType
    auto swDatabase = _catalogCache->getDatabase(operationContext(), dbName);

    ASSERT_OK(swDatabase.getStatus());
    const auto cachedDb = swDatabase.getValue();
    ASSERT_TRUE(cachedDb.shardingEnabled());
    ASSERT_EQ(cachedDb.databaseVersion().getUuid(), dbVersion.getUuid());
    ASSERT_EQ(cachedDb.databaseVersion().getLastMod(), dbVersion.getLastMod());

    // Advancing the timeInStore, e.g. because of a movePrimary
    _catalogCache->onStaleDatabaseVersion(dbName, dbVersion.makeUpdated());

    // However, when this CatalogCache asks to the loader for the new info associated to dbName it
    // didn't find any (i.e. the database was dropped)
    _catalogCacheLoader->setDatabaseRefreshReturnValue(
        Status(ErrorCodes::NamespaceNotFound, "dummy errmsg"));

    // At the end of this refresh we compute a new time of (boost::none, 1), which is going to be
    // less than the time in store we associated with this refresh (i.e. the DB version after the
    // movePrimary). Because of this bug, we will never be able to fulfill any promise, hanging in
    // the refresh loop forever.
    swDatabase = _catalogCache->getDatabase(operationContext(), dbName);
}
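To make the ordering problem described above easier to see in isolation, here is a minimal standalone sketch. It is a toy model, not the actual CatalogCache or ComparableDatabaseVersion code: ToyVersion, lessThan and refreshUntilCaughtUp are hypothetical names, the ordering rule (a missing database sorts below any existing version) is an assumption made for illustration, and the loop is bounded by maxAttempts so the sketch terminates where the real refresh would keep looping.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified stand-in for a comparable database version.
// std::nullopt models "the database does not exist" (e.g. it was dropped).
struct ToyVersion {
    std::optional<int> lastMod;
};

// Assumed ordering for this sketch: a missing version is less than any
// existing version, and existing versions are ordered by lastMod.
bool lessThan(const ToyVersion& a, const ToyVersion& b) {
    if (!a.lastMod) {
        return b.lastMod.has_value();
    }
    if (!b.lastMod) {
        return false;
    }
    return *a.lastMod < *b.lastMod;
}

// Toy refresh loop: a refresh is accepted only once the version returned by
// the loader is not older than the time in store that was advanced for the
// entry. Returns the attempt number that succeeded, or -1 if no attempt
// within maxAttempts succeeded.
int refreshUntilCaughtUp(const ToyVersion& timeInStore,
                         const ToyVersion& loaderResult,
                         int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (!lessThan(loaderResult, timeInStore)) {
            return attempt;  // the loader caught up with the advanced time
        }
        // Otherwise the result is considered stale and the refresh repeats.
    }
    return -1;
}
```

In this model, refreshUntilCaughtUp(ToyVersion{2}, ToyVersion{std::nullopt}, 5) returns -1: the time in store was advanced to 2, the loader reports that the database no longer exists, and under the assumed ordering that result can never catch up. If the loader returns ToyVersion{2} instead, the first attempt succeeds.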