-
Type: Bug
-
Resolution: Unresolved
-
Priority: Blocker - P1
-
Affects Version/s: 8.0.19, 8.0.20, 7.0.31
-
Component/s: None
-
ALL
The MongoDB config server crashes and cannot start normally in a cluster that was upgraded from version 6.0. The crash logs are as follows:
With version 7.0+:
{"t":{"$date":"2026-03-19T17:15:03.886+08:00"},"s":"F", "c":"ASSERT", "id":23079, "ctx":"ReplWriterWorker-2","msg":"Invariant failure","attr":{"expr":"erased","file":"src/mongo/db/s/query_analysis_coordinator.cpp","line":164}}
With version 8.0+:
{"t":{"$date":"2026-03-17T21:21:24.897+08:00"},"s":"F", "c":"ASSERT", "id":23079, "svc":"S", "ctx":"ReplWriterWorker-3","msg":"Invariant failure","attr":{"expr":"erased","file":"src/mongo/db/s/query_analysis_coordinator.cpp","line":189}}
The root cause is that `QueryAnalysisCoordinator` adds an entry to `_samplers` when a document is inserted into `config.mongos` and removes that entry when the document is deleted. `QueryAnalysisCoordinator::onSamplerDelete` also runs an invariant check on the return value of `_samplers.erase`. However, in clusters upgraded from older versions, `config.mongos` still holds documents written by the older version. These documents were never inserted after the upgrade, so they have no entry in `_samplers`. When one of them is deleted, `_samplers.erase` returns 0, the invariant fails, and the process crashes.
void QueryAnalysisCoordinator::onSamplerDelete(const MongosType& doc) {
    invariant(serverGlobalParams.clusterRole.has(ClusterRole::ConfigServer));
    stdx::lock_guard<Latch> lk(_mutex);
    auto erased = _samplers.erase(doc.getName());
    invariant(erased);
}
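I think we need to relax this logic and remove `invariant(erased)`, so that deleting a `config.mongos` document that was never recorded in `_samplers` is treated as a no-op instead of crashing the process.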
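To illustrate the proposed behavior, here is a minimal standalone sketch, not the actual server code. `SamplerRegistry` is a hypothetical stand-in for the sampler bookkeeping in `QueryAnalysisCoordinator`. It assumes `_samplers` can be modeled as a `std::set` of names and uses `std::mutex` in place of the server's `Latch`. The only point it demonstrates is that `erase` returning 0 can be reported to the caller rather than asserted on.

```cpp
#include <cstddef>
#include <mutex>
#include <set>
#include <string>

// Hypothetical, simplified model of the sampler bookkeeping.
// Assumption: the real _samplers container is keyed by the mongos name;
// a std::set of names is enough to show the erase semantics.
class SamplerRegistry {
public:
    // Mirrors the insert path: remember a sampler when its document is inserted.
    void onSamplerInsert(const std::string& name) {
        std::lock_guard<std::mutex> lk(_mutex);
        _samplers.insert(name);
    }

    // Tolerant delete path: erase() returns the number of elements removed
    // (0 or 1 for a set). A result of 0 means the name was never tracked,
    // for example a document left over from before an upgrade. This is
    // reported to the caller instead of being treated as a fatal error.
    bool onSamplerDelete(const std::string& name) {
        std::lock_guard<std::mutex> lk(_mutex);
        const std::size_t erased = _samplers.erase(name);
        return erased > 0;
    }

    // Number of samplers currently tracked.
    std::size_t size() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _samplers.size();
    }

private:
    mutable std::mutex _mutex;
    std::set<std::string> _samplers;
};
```

With this shape, deleting a name that was registered returns `true`, and deleting a legacy name that was never registered returns `false` without terminating the process. In the server, the equivalent change might log the unexpected delete at a debug level rather than return a value, but that is a design choice for the maintainers.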