Investigate promoting storex-owned debug log lines to warnings/errors


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • Storage Execution 2026-05-11

      Background

      SERVER-123553 promoted a LOGV2_DEBUG log line in src/mongo/db/index/index_access_method.cpp to LOGV2_WARNING. The original log reported a "Suppressed key generation error" – an error being silently swallowed whose only signal lived at debug level, effectively invisible in production.

      As a follow-up, Claude audited all storex-owned code (timeseries, bucket_catalog, collection_write_path, bsoncolumn, external_sorter, index_builds, key_string, storage_engine_integration, ttl, tracking_allocators) for other LOGV2_DEBUG call sites fitting the same pattern: an abnormal or suppressed condition that operators need to see in production but is currently hidden behind debug verbosity.

      This ticket tracks the candidates found. Each can be evaluated / promoted independently; feel free to spin off sub-tickets per module if preferred.

      Strong candidates (clear SERVER-123553 pattern)

      1. bucket_compression.cpp:382 – swallowed compression exception

      File: src/mongo/db/timeseries/bucket_compression.cpp
      Current: LOGV2_DEBUG(5857800, 1, "Exception when compressing timeseries bucket, leaving it uncompressed", "error"_attr = exceptionToStatus())
      Suggested: LOGV2_WARNING
      Why: The outermost catch (...) inside compressBucket() absorbs any exception and returns an empty CompressionResult. Callers (e.g. compressUncompressedBucketOnReopen in timeseries_write_ops_internal.cpp) then uassert with a generic BucketCompressionFailure message that discards the original exception's type and text. Only the debug log carries the real exceptionToStatus().
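The swallowed-exception pattern above can be sketched in a self-contained form. This is an illustrative reimplementation, not the real compressBucket(); CompressionResult, compressOrFallBack, and g_lastLoggedError are hypothetical names. It shows why the log call inside the catch handler is the only place the original error survives:

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical sketch: an outer catch (...) converts any exception into an
// "empty" result, so the only record of the real failure is the log call
// inside the handler. If that log is debug-level, production operators never
// see the original error text.
struct CompressionResult {
    std::optional<std::string> compressed;  // empty => compression was skipped
};

inline std::string g_lastLoggedError;  // stand-in for the log sink

CompressionResult compressOrFallBack(const std::string& bucket) {
    try {
        if (bucket.empty())
            throw std::runtime_error("cannot compress empty bucket");
        return {"<compressed:" + bucket + ">"};
    } catch (const std::exception& ex) {
        // In the real code this is LOGV2_DEBUG(5857800, ...). Promoting it to
        // warning is the only way the exception's text becomes visible,
        // because the caller only ever sees the empty CompressionResult.
        g_lastLoggedError = ex.what();
        return {};  // swallowed: caller cannot recover the failure reason
    }
}
```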

      2. ttl_monitor.cpp:221 – silent skip of registered TTL index

      File: src/mongo/db/ttl/ttl_monitor.cpp
      Current: LOGV2_DEBUG(22535, 1, "index not found; skipping ttl job", "index"_attr = spec)
      Suggested: LOGV2_WARNING (or LOGV2_ERROR to match siblings)
      Why: In getValidTTLIndex, the two sibling "skipping TTL job" branches log at LOGV2_ERROR (22541, 6909100) and increment ttlInvalidTTLIndexSkips. The catalog-lookup-miss branch alone logs at debug with no metric bump, so a registered TTL index can silently stop running if its catalog entry disappears between isIndexReady and findIndexByName.
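The fix for this candidate is about sibling-branch consistency: every "skipping TTL job" branch should both log visibly and bump the skip counter. A minimal sketch, assuming hypothetical names (ttlInvalidTTLIndexSkips is real per the ticket; SkipReason and recordTTLIndexSkip are illustrative):

```cpp
#include <atomic>
#include <string>

// Sketch of the side effect the catalog-lookup-miss branch is missing today:
// it should increment the same skip counter its LOGV2_ERROR siblings do.
std::atomic<long long> ttlInvalidTTLIndexSkips{0};

enum class SkipReason { NotTTL, BadExpireAfterSeconds, IndexNotFoundInCatalog };

void recordTTLIndexSkip(SkipReason reason, const std::string& indexName) {
    // In the real code this would pair with a LOGV2_WARNING/LOGV2_ERROR call;
    // here we only model the metric bump so all skip branches are symmetric.
    (void)reason;
    (void)indexName;
    ttlInvalidTTLIndexSkips.fetch_add(1);
}
```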

      3. storage_stats.cpp:324 – LockTimeout/MaxTimeMSExpired silently swallowed

      File: src/mongo/db/stats/storage_stats.cpp
      Current: LOGV2_DEBUG(3088801, 2, "Failed to retrieve storage statistics", logAttrs(nss), "error"_attr = ex), followed by return Status::OK()
      Suggested: Split the two catch handlers. Leave MaxTimeMSExpired at debug (user-configurable, expected). Raise the LockTimeout case with waitForLock=true to LOGV2_WARNING.
      Why: The failed() lambda returns Status::OK() on LockTimeout/MaxTimeMSExpired, so callers see "success" with a result BSON missing every storage field. Current level is debug-2, even quieter than typical level-1 suppressed cases.
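The suggested split can be expressed as a severity-selection rule. This is a hedged sketch with illustrative names (ErrorCode, Severity, severityFor are not the real server types), just to pin down the proposed behavior:

```cpp
// Hypothetical sketch of the proposed handler split: route the two swallowed
// error codes to different log severities instead of one debug-2 handler.
enum class ErrorCode { LockTimeout, MaxTimeMSExpired };
enum class Severity { Debug2, Warning };

Severity severityFor(ErrorCode code, bool waitForLock) {
    // MaxTimeMSExpired is user-configured and expected: keep it quiet.
    if (code == ErrorCode::MaxTimeMSExpired)
        return Severity::Debug2;
    // A lock timeout despite waitForLock=true means we waited the full
    // deadline and still returned a stats-less "success": surface it.
    if (code == ErrorCode::LockTimeout && waitForLock)
        return Severity::Warning;
    return Severity::Debug2;
}
```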

      Secondary candidates (worth promoting)

      4. checkpointer.cpp:144 – slow checkpoint (>=30s)

      File: src/mongo/db/storage/checkpointer.cpp
      Current: LOGV2_DEBUG(22308, 1, "Checkpoint was slow to complete", "secondsElapsed"_attr = secondsElapsed)
      Suggested: LOGV2_WARNING
      Why: Gate already fires only on a 30-second+ checkpoint – a genuine performance anomaly (cache pressure, IO stall, disk contention) that operators shouldn't have to raise verbosity to see.

      5. multi_index_block.cpp:1297 – abort due to mixed-schema data

      File: src/mongo/db/index_builds/multi_index_block.cpp
      Current: LOGV2_DEBUG(6057701, 1, "Aborting index build commit due to the earlier detection of mixed-schema data", ...)
      Suggested: LOGV2_WARNING
      Why: The earlier detection log at line 989 is plain LOGV2(6057700, "Detected mixed-schema data ...") (info). Leaving the terminal "we are failing because of it" log at debug-1 inverts the severities: operators see the detection but not the abort. Returns CannotCreateIndex.

      6. oplog_truncation.cpp:86 and :91 – silent non-truncation

      File: src/mongo/db/storage/oplog_truncation.cpp
      Current:

      LOGV2_DEBUG(5140900, 0, "Will not truncate entire oplog");
      LOGV2_DEBUG(5140901, 0, "Cannot truncate as there are no oplog entries after the truncate marker but before the truncate-up-to point", ...);
      

      Suggested: LOGV2_WARNING (or at minimum LOGV2 / info)
      Why: Both conditions cause reclaimOplog to silently return false, meaning an oplog-truncation attempt did nothing. The sibling error path already logs at LOGV2_WARNING(8841101). Silent non-truncation is operationally significant (oplog can grow unbounded); 5140901 in particular indicates mayTruncateUpTo is pinning too early – an "oplog stall / pinned" condition.

      Borderline / worth discussion

      7. storage_engine_impl.cpp:769 – drop reported success but ident still exists

      File: src/mongo/db/storage/storage_engine_impl.cpp
      Current: LOGV2_DEBUG(11440001, 1, "Internal ident already exists on disk after drop attempt; reusing existing ident", "ident"_attr = ident)
      Suggested: LOGV2_WARNING, possibly with rate-limiting.
      Why: Reached after a dropIdent() reported success but the subsequent createInternal() still throws ObjectAlreadyExists. Surrounding comment explicitly references SERVER-114575 ("a layered drop can report success while not dropping the table"). If this fires more than transiently it signals real inconsistency. Medium confidence because it may be noisy under known layered-drop conditions.

      8. index_builds_coordinator_mongod.cpp:198 – vote-command retry loop is silent

      File: src/mongo/db/index_builds/index_builds_coordinator_mongod.cpp
      Current: LOGV2_DEBUG(4666400, 1, "Failed to run index build vote command.", ...) inside while (needToVote()) with infinite exponential-backoff retry.
      Suggested: Either add a logSeveritySuppressor (see existing usage at index_builds_coordinator.cpp:2678) so sustained failures become visible while transient ones stay quiet, or promote directly to LOGV2_WARNING.
      Why: A persistently unreachable primary leaves an index-build vote silently stuck – operators see nothing about the stall.
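The logSeveritySuppressor option can be sketched as follows. This is an illustrative reimplementation of the idea (the real helper lives in the server codebase; SeveritySuppressor and its interface here are hypothetical): transient retry failures stay at debug, but once failures persist past a threshold the same call site logs at warning.

```cpp
#include <cstdint>

// Hedged sketch: escalate log severity only after sustained failures, so a
// persistently unreachable primary becomes visible while a one-off network
// blip stays quiet.
class SeveritySuppressor {
public:
    explicit SeveritySuppressor(uint32_t quietAttempts) : _quiet(quietAttempts) {}

    // Returns true when the current failure should be logged at warning
    // rather than debug.
    bool shouldWarn() { return ++_consecutiveFailures > _quiet; }

    // Call on a successful vote so the next failure starts quiet again.
    void reset() { _consecutiveFailures = 0; }

private:
    uint32_t _quiet;
    uint32_t _consecutiveFailures = 0;
};
```

The retry loop would consult shouldWarn() on each failed vote attempt and reset() after a successful one, keeping the infinite-backoff loop itself unchanged.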

      Scope covered

      • src/mongo/db/timeseries/ (incl. bucket_catalog)
      • src/mongo/db/index_builds/ + index/index_access_method*, index_key_validate*, index_repair*, index_spec_validate*, commit-quorum/vote commands
      • src/mongo/db/storage/ (excluding wiredtiger/, devnull/, key_string/)
      • Storage-integration ancillary: action_duration_metrics*, record_id*, server_recovery*, mongod_options_storage*, dbhelpers*, import*, commands/fsync*, commands/dbcommands*, commands/resize_oplog*, op_observer/batched_write*, stats/storage_stats*
      • src/mongo/db/ttl/, src/mongo/db/sorter/, src/mongo/bson/column/, src/mongo/util/tracking/, src/mongo/db/storage/key_string/, src/mongo/db/collection_crud/, src/mongo/db/replicated_fast_count/

      Over 130 LOGV2_DEBUG/LOGV2_INFO sites were reviewed; most were either normal progress/tracing or already had a WARNING/ERROR sibling on the genuine error branch.

      Notes for implementation

      • Log IDs should be kept (same ID, new severity) to preserve log-id continuity, per the same approach as SERVER-123553.
      • A few of these (#6, #7) may benefit from message rephrasing as SERVER-123553 did (adding context like "on secondary" / "during oplog application") – worth considering per site.

            Assignee:
            Stephanie Eristoff
            Reporter:
            Stephanie Eristoff