Type: Bug
Resolution: Fixed
Priority: Major - P3
Affects Version/s: 5.1.0
Component/s: None
Fully Compatible
Sharding EMEA 2021-10-04
As a result of the removal of ComparableDatabaseVersion::_uuidDisambiguatingSequenceNum under SERVER-59177, the catalog cache can currently end up in an infinite refresh loop, because the timeInStore of a cached entry is considered greater than the time in store associated with a non-existing database.
Unit test to reproduce the issue (by sergi.mateo-bellido):
TEST_F(CatalogCacheTest, GetDatabaseDropLoop) {
    const auto dbName = "testDB";
    const auto dbVersion = DatabaseVersion(UUID::gen(), Timestamp());

    _catalogCacheLoader->setDatabaseRefreshReturnValue(
        DatabaseType(dbName, kShards[0], true, dbVersion));

    // The CatalogCache doesn't have any valid info about this DB and finds a new DatabaseType
    auto swDatabase = _catalogCache->getDatabase(operationContext(), dbName);
    ASSERT_OK(swDatabase.getStatus());

    const auto cachedDb = swDatabase.getValue();
    ASSERT_TRUE(cachedDb.shardingEnabled());
    ASSERT_EQ(cachedDb.databaseVersion().getUuid(), dbVersion.getUuid());
    ASSERT_EQ(cachedDb.databaseVersion().getLastMod(), dbVersion.getLastMod());

    // Advance the timeInStore, e.g. because of a movePrimary
    _catalogCache->onStaleDatabaseVersion(dbName, dbVersion.makeUpdated());

    // However, when this CatalogCache asks the loader for the new info associated with dbName,
    // it doesn't find any (i.e. the database was dropped)
    _catalogCacheLoader->setDatabaseRefreshReturnValue(
        Status(ErrorCodes::NamespaceNotFound, "dummy errmsg"));

    // At the end of this refresh we compute a new time of (boost::none, 1), which is going to be
    // less than the time in store we associated with this refresh (i.e. the DB version after the
    // movePrimary). Because of this bug, we will never be able to fulfill any promise, hanging in
    // the refresh loop forever.
    swDatabase = _catalogCache->getDatabase(operationContext(), dbName);
}
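
The ordering problem described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the server's implementation: ComparableVersion, lessThan and refreshUntilCaughtUp are hypothetical names, the version is reduced to a single optional counter plus a sequence number, and the refresh loop is bounded so that the sketch terminates. The only behaviour taken from the description is that a time with no database version (the "(boost::none, 1)" case) compares as less than a time that carries a version.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-in for a comparable database version.
// lastMod == std::nullopt models a database that does not exist (was dropped).
struct ComparableVersion {
    std::optional<std::uint64_t> lastMod;
    std::uint64_t sequenceNum;
};

// Assumed ordering: when only one side carries a version, the side without a
// version is treated as older. This mirrors the behaviour described above.
bool lessThan(const ComparableVersion& a, const ComparableVersion& b) {
    if (a.lastMod && b.lastMod) {
        return *a.lastMod < *b.lastMod;
    }
    if (!a.lastMod && !b.lastMod) {
        return a.sequenceNum < b.sequenceNum;
    }
    return !a.lastMod;
}

// Bounded model of the refresh loop: a lookup result is accepted only if it
// is not older than the time in store that triggered the refresh. Returns the
// attempt number on success, or -1 if the result is never accepted.
int refreshUntilCaughtUp(const ComparableVersion& timeInStore,
                         const ComparableVersion& lookupResult,
                         int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (!lessThan(lookupResult, timeInStore)) {
            return attempt;
        }
        // In this model the loader keeps returning the same "dropped" result,
        // so retrying cannot make progress.
    }
    return -1;
}
```

In this model, a time in store that was advanced after a movePrimary (a present lastMod) is never matched by the lookup result for a dropped database (no lastMod), so refreshUntilCaughtUp exhausts its attempts. Without the bound, the loop would spin forever, which corresponds to the hang the unit test reproduces.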