- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
Summary
jstest + design doc for SERVER-88352. The TTL deleter silently accepts NamespaceNotFound mid-batch, which hides races between a collection drop and a TTL pass and leaves operators with no visibility into why TTL skipped work.
Files
- jstests/noPassthrough/ttl/ttl_namespace_not_found_visibility.js (123 lines): single-node ReplSet repro. Parks the TTLMonitor with hangTTLMonitorBetweenPasses, drops the collection while it is parked, then releases exactly one pass. Asserts today that deletedDocuments did not advance. Pre-registered, skip-gated assertions check that serverStatus().metrics.ttl.skipReasons.namespaceNotFound ticked while orphan and other stayed at zero. Matching currentOp assertions for {skipReasons, lastSkip: {uuid, reason, ts}} use the same skip-until-implemented pattern.
- src/mongo/db/ttl/TTL_SKIP_REASON_VISIBILITY.md (477 words)
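The two-tier assertion pattern described for the jstest can be sketched as follows. This is a minimal model on plain structs, not the jstest itself: `TtlMetrics`, `SkipReasons`, `Verdict`, and `checkPassAfterDrop` are hypothetical names, and the optional field stands in for "the skipReasons sub-document does not exist on the server yet".

```cpp
#include <cstdint>
#include <optional>

// Hypothetical flattened view of the fields the test reads from
// serverStatus().metrics.ttl before and after the released pass.
struct SkipReasons {
    std::int64_t namespaceNotFound = 0;
    std::int64_t orphan = 0;
    std::int64_t other = 0;
};

struct TtlMetrics {
    std::int64_t deletedDocuments = 0;
    // Absent until the proposed server change lands (assumption of this sketch).
    std::optional<SkipReasons> skipReasons;
};

enum class Verdict { kFail, kPassGatedChecksSkipped, kPass };

// Compares the snapshot taken before the drop with the one taken after
// exactly one TTL pass has been released.
Verdict checkPassAfterDrop(const TtlMetrics& before, const TtlMetrics& after) {
    // Active today: a pass over a dropped collection must not delete anything.
    if (after.deletedDocuments != before.deletedDocuments) {
        return Verdict::kFail;
    }

    // Skip-until-implemented gate: without the new metrics, stop here.
    if (!before.skipReasons || !after.skipReasons) {
        return Verdict::kPassGatedChecksSkipped;
    }

    // Pre-registered: namespaceNotFound ticked, orphan and other stayed at zero.
    const bool ticked =
        after.skipReasons->namespaceNotFound > before.skipReasons->namespaceNotFound;
    const bool othersQuiet =
        after.skipReasons->orphan == 0 && after.skipReasons->other == 0;
    return (ticked && othersQuiet) ? Verdict::kPass : Verdict::kFail;
}
```

Keeping the gate separate from the failure path means the test passes today and tightens automatically once the metrics exist, without editing the assertions.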
Key code-site finding
Silent branch identified: ttl_monitor.cpp:460-469, the if (!nss) branch taken after lookupNSSByUUID returns boost::none. The design doc proposes three Counter64 metrics next to the existing block at :131-134, a currentOp.ttl sub-document carrying the histogram plus the lastSkip triple, and a LOGV2_DEBUG line on the previously silent branch. The proposal is framed against SERVER-43194 (surface decisions that are already tracked rather than re-architect them) and keeps "orphan" reserved for SERVER-92779.
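A rough sketch of the proposed bookkeeping, to make the shape concrete. It does not use MongoDB's real Counter64 or lookupNSSByUUID: `std::atomic` stands in for the counter type, `lookupNamespace` is a toy lookup, and `TtlSkipStats`, `SkipReason`, `LastSkip`, and `ttlPassFor` are hypothetical names chosen for illustration only.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <optional>
#include <string>

// Hypothetical reason set mirroring the three counter names in this ticket.
enum class SkipReason { kNamespaceNotFound, kOrphan, kOther };

// Hypothetical shape for the proposed lastSkip triple {uuid, reason, ts}.
struct LastSkip {
    std::string uuid;
    SkipReason reason;
    std::int64_t tsMillis;
};

class TtlSkipStats {
public:
    // Bump the counter for this reason and remember the most recent skip.
    void record(const std::string& uuid, SkipReason reason, std::int64_t tsMillis) {
        _counts[index(reason)].fetch_add(1, std::memory_order_relaxed);
        std::lock_guard<std::mutex> lk(_mutex);
        _lastSkip = LastSkip{uuid, reason, tsMillis};
    }

    std::int64_t count(SkipReason reason) const {
        return _counts[index(reason)].load(std::memory_order_relaxed);
    }

    std::optional<LastSkip> lastSkip() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _lastSkip;
    }

private:
    static std::size_t index(SkipReason reason) {
        const auto i = static_cast<std::size_t>(reason);
        assert(i < 3);  // one slot per SkipReason value
        return i;
    }

    // std::atomic stands in for Counter64 in this sketch.
    std::array<std::atomic<std::int64_t>, 3> _counts{};
    mutable std::mutex _mutex;
    std::optional<LastSkip> _lastSkip;
};

// Toy stand-in for the UUID-to-namespace lookup: empty once the collection is gone.
std::optional<std::string> lookupNamespace(const std::string& uuid) {
    if (uuid == "live-uuid") {
        return std::string("test.ttl_coll");
    }
    return std::nullopt;
}

// One TTL attempt for a cached UUID. Returns false when the pass is skipped.
bool ttlPassFor(const std::string& uuid, TtlSkipStats& stats, std::int64_t nowMillis) {
    const auto nss = lookupNamespace(uuid);
    if (!nss) {
        // The previously silent branch: count it and keep the last occurrence.
        stats.record(uuid, SkipReason::kNamespaceNotFound, nowMillis);
        return false;
    }
    return true;  // deletion work would happen here
}
```

Calling `ttlPassFor("dropped-uuid", stats, now)` on a fresh `TtlSkipStats` leaves the namespaceNotFound count at 1 and the other two at 0, which is the state the skip-gated jstest assertions expect.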
Key finding
SERVER-88352's stated cause is the configShard window (sharding filter metadata not yet attached), but the same silent branch also swallows drop races and any stale TTLCollectionCache entry. All three converge on one indistinguishable no-op. A histogram with three separate Counter64s keeps them addressable.
- is related to
  - SERVER-88352 Figure out how to handle NamespaceNotFound error in TTL Index Delete (Backlog)
  - SERVER-92779 TTL delete progress blocked by unowned documents (Needs Scheduling)
  - SERVER-43194 provide a way to get result/outcome of $merge or $out (Backlog)