- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Storage Execution
- Fully Compatible
- ALL
- Storage Execution 2026-06-08
Overview
On a primary-driven index build (PDIB), when the primary steps down, the in-memory build on that node is torn down so the node can finish the index as a secondary, while the new primary resumes and commits the build. This teardown currently increments the index_builds.failed OTEL counter even though the build did not fail: it was handed off and will be resumed.
Background
The build is unregistered via ActiveIndexBuilds::unregisterIndexBuild(..., IndexBuildOutcome::kFailure) in the PDIB stepdown cleanup paths. The new primary separately records index_builds.resume.succeeded when it resumes the build, so a failure recorded on the old primary counts the same build twice (once as failed, once as resumed) and makes a normal failover look like an error.
Scope of Work
- src/mongo/db/index_builds/active_index_builds.h – add a neutral IndexBuildOutcome::kToBeResumed outcome.
- src/mongo/db/index_builds/active_index_builds.cpp – recordIndexBuildOutcome treats kToBeResumed as a no-op for the outcome counters (neither succeeded nor failed is incremented); the active gauge is still decremented.
- src/mongo/db/index_builds/index_builds_coordinator.cpp – route the three PDIB stepdown teardown sites (LOGV2 12741700 / 12741701 / 12741702) through kToBeResumed.
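The intended behavior of the three changes above can be sketched as a small standalone model. This is an illustration under assumptions, not the actual patch: the IndexBuildMetrics struct, the kSuccess enumerator, the index_builds.succeeded counter name, and the signature of recordIndexBuildOutcome are all hypothetical stand-ins, and the real code records to OTEL instruments rather than plain integers.

```cpp
#include <cstdint>

// Hypothetical stand-in for the OTEL instruments; plain integers keep the
// sketch self-contained. Field names are assumptions for illustration.
struct IndexBuildMetrics {
    std::int64_t active = 0;     // gauge: builds currently registered on this node
    std::int64_t succeeded = 0;  // counter (name assumed: index_builds.succeeded)
    std::int64_t failed = 0;     // counter: index_builds.failed
};

// kFailure and kToBeResumed are named in this ticket; kSuccess is assumed.
enum class IndexBuildOutcome { kSuccess, kFailure, kToBeResumed };

// Simplified model of outcome recording when a build is unregistered.
// The signature is hypothetical and does not mirror the real function.
inline void recordIndexBuildOutcome(IndexBuildMetrics& metrics, IndexBuildOutcome outcome) {
    // The build leaves this node's registry regardless of outcome,
    // so the active gauge is always decremented.
    --metrics.active;

    switch (outcome) {
        case IndexBuildOutcome::kSuccess:
            ++metrics.succeeded;
            break;
        case IndexBuildOutcome::kFailure:
            ++metrics.failed;
            break;
        case IndexBuildOutcome::kToBeResumed:
            // Handed off on stepdown: neither counter is touched.
            break;
    }
}
```

In this model, a stepdown teardown that passes kToBeResumed lowers the active gauge and leaves both counters alone, while genuine aborts, setup failures, and shutdown keep passing kFailure and still increment the failed counter.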
Acceptance Criteria
- A PDIB interrupted by stepdown does not increment index_builds.failed on the stepped-down node.
- Genuine aborts, setup failures, and shutdown still record index_builds.failed.
- jstests/noPassthrough/index_builds/index_stepdown_failover.js under PDIB expects the stepped-down primary to record the build as neither succeeded nor failed.
Technical Notes
- The pre-init stepdown path (index_stepdown_before_init.js) fails during _setUpIndexBuild before completeSetup() and still records a failure – out of scope and unchanged.