[SERVER-82967] Stepdown after calling ActiveIndexBuilds::registerIndexBuild() during index build setup doesn't unregister itself Created: 08/Nov/23  Updated: 25/Jan/24  Resolved: 28/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.1, 7.3.0-rc0, 7.0.5, 6.0.13

Type: Bug Priority: Major - P3
Reporter: Gregory Wlodarek Assignee: Shin Yee Tan
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.2, v7.0, v6.0, v5.0
Sprint: Execution Team 2023-11-27, Execution Team 2023-12-11
Participants:
Linked BF Score: 16

 Description   

After we get into this state, building the same index on the new primary has the following outcomes:

In debug builds, the affected node crashes with an fassert while applying the startIndexBuild oplog entry:

[js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.169+00:00"},"s":"I",  "c":"STORAGE",  "id":20661,   "ctx":"ReplWriterWorker-0","msg":"Index build conflict. There's already an index with the same name being built under an existing index build","attr":{"buildUUID":{"uuid":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"}},"existingBuildUUID":{"uuid":{"$uuid":"c5ea1251-ae19-45f1-8858-2411dd90abc8"}},"index":"a_1","collectionUUID":{"uuid":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"}}}}
[js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.171+00:00"},"s":"E",  "c":"REPL",     "id":21262,   "ctx":"ReplWriterWorker-0","msg":"Failed command during oplog application","attr":{"command":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"db":"test","error":{"code":285,"codeName":"IndexBuildAlreadyInProgress","errmsg":"Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up"}}}
[js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.172+00:00"},"s":"F",  "c":"REPL",     "id":21237,   "ctx":"ReplWriterWorker-0","msg":"Error applying operation","attr":{"oplogEntry":{"oplogEntry":{"op":"c","ns":"test.$cmd","ui":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"},"o":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"ts":{"$timestamp":{"t":1699468691,"i":2}},"t":2,"v":2,"wall":{"$date":"2023-11-08T18:38:11.116Z"}}},"error":" :: caused by :: IndexBuildAlreadyInProgress: Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up"}}
[js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.173+00:00"},"s":"F",  "c":"REPL",     "id":21235,   "ctx":"OplogApplier-0","msg":"Failed to apply batch of operations","attr":{"numOperationsInBatch":1,"firstOperation":{"oplogEntry":{"op":"c","ns":"test.$cmd","ui":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"},"o":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"ts":{"$timestamp":{"t":1699468691,"i":2}},"t":2,"v":2,"wall":{"$date":"2023-11-08T18:38:11.116Z"}}},"lastOperation":{"oplogEntry":{"op":"c","ns":"test.$cmd","ui":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"},"o":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"ts":{"$timestamp":{"t":1699468691,"i":2}},"t":2,"v":2,"wall":{"$date":"2023-11-08T18:38:11.116Z"}}},"failedWriterThread":11,"error":"IndexBuildAlreadyInProgress: Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up"}}
[js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.174+00:00"},"s":"F",  "c":"ASSERT",   "id":23095,   "ctx":"OplogApplier-0","msg":"Fatal assertion","attr":{"msgid":34437,"error":"IndexBuildAlreadyInProgress: Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up","file":"src/mongo/db/repl/oplog_applier_impl.cpp","line":624}}
[js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.175+00:00"},"s":"F",  "c":"ASSERT",   "id":23096,   "ctx":"OplogApplier-0","msg":"\n\n***aborting after fassert() failure\n\n"}

In non-debug builds the following is logged:

[j0:n0] | 2023-11-01T05:33:25.121+00:00 I  STORAGE  20661   [ReplWriterWorker-0] "Index build conflict. There's already an index with the same name being built under an existing index build","attr":{"buildUUID":{"uuid":{"$uuid":"fa8424f2-b104-4d2d-8295-aa58eedebc85"}},"existingBuildUUID":{"uuid":{"$uuid":"b77dd15d-2abf-4181-8017-e8da192a532e"}},"index":"testDb_1","collectionUUID":{"uuid":{"$uuid":"ec105bf2-5987-4aa7-a454-27246446d37c"}}}
[j0:n0] | 2023-11-01T05:33:25.121+00:00 W  REPL     7149001 [ReplWriterWorker-0] "Potential replication constraint violation during steady state replication","attr":{"msg":"received an acceptable error during oplog application","obj":{"oplogEntry":{"op":"c","ns":"admin.$cmd","ui":{"$uuid":"ec105bf2-5987-4aa7-a454-27246446d37c"},"o":{"startIndexBuild":"jstests_rename5","indexBuildUUID":{"$uuid":"fa8424f2-b104-4d2d-8295-aa58eedebc85"},"indexes":[{"v":2,"key":{"testDb":1},"name":"testDb_1"}]},"ts":{"$timestamp":{"t":1698816805,"i":10}},"t":4,"v":2,"wall":{"$date":"2023-11-01T05:33:25.116Z"}}},"status":{"code":285,"codeName":"IndexBuildAlreadyInProgress","errmsg":"Index build conflict: fa8424f2-b104-4d2d-8295-aa58eedebc85: There's already an index with name 'testDb_1' being built on the collection  ( ec105bf2-5987-4aa7-a454-27246446d37c ) under an existing index build: b77dd15d-2abf-4181-8017-e8da192a532e index build state: Setting up"}}

As a result, the index build commit quorum will not be satisfied, because the stale registration and the replicated index build have different buildUUIDs for the same index.

When the node stepped down, nothing about the index build had been replicated to the secondaries yet. The affected node never builds the index in the first place because it is interrupted, but we forgot to reset its in-memory state.
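
The failure mode described above (register in memory, get interrupted during setup, never unregister) can be sketched in isolation. The following is a minimal illustration only and is not the server's code: Registry, ScopeGuard and setUpIndexBuild are hypothetical names, the registry is keyed by index name for simplicity, and a thrown exception stands in for the stepdown interruption. It shows one way to make sure a registration is rolled back whenever setup does not complete.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an in-memory registry of active index builds.
// This is NOT MongoDB's ActiveIndexBuilds; it only models "one build per index name".
class Registry {
public:
    // Returns false if a build for this index name is already registered.
    bool registerBuild(const std::string& indexName, const std::string& buildUUID) {
        return builds_.emplace(indexName, buildUUID).second;
    }

    void unregisterBuild(const std::string& indexName) {
        builds_.erase(indexName);
    }

    bool contains(const std::string& indexName) const {
        return builds_.count(indexName) > 0;
    }

private:
    std::map<std::string, std::string> builds_;
};

// Minimal scope guard (hypothetical helper): runs the callback on scope exit
// unless dismiss() was called first.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> onExit) : onExit_(std::move(onExit)) {}
    ~ScopeGuard() {
        if (armed_) {
            onExit_();
        }
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

    void dismiss() {
        armed_ = false;
    }

private:
    std::function<void()> onExit_;
    bool armed_ = true;
};

// Registers the build, then runs the remaining setup. If setupStep throws
// (simulating an interruption such as a stepdown), the guard removes the
// registration so that no stale in-memory entry is left behind.
bool setUpIndexBuild(Registry& registry,
                     const std::string& indexName,
                     const std::string& buildUUID,
                     const std::function<void()>& setupStep) {
    if (!registry.registerBuild(indexName, buildUUID)) {
        return false;  // conflict: a build with this index name already exists
    }
    ScopeGuard unregisterOnFailure([&] { registry.unregisterBuild(indexName); });
    setupStep();                    // may throw
    unregisterOnFailure.dismiss();  // setup succeeded, keep the registration
    return true;
}
```

Without the guard, an interrupted setup would leave the first buildUUID registered, and a later attempt to register the same index name under a new buildUUID would be rejected as a conflict, which is the situation shown in the logs above.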



 Comments   
Comment by Githook User [ 23/Jan/24 ]

Author:

{'name': 'Shin Yee Tan', 'email': 'shinyee.tan@mongodb.com', 'username': 'shinyeet'}

Message: Revert "SERVER-82967 Unregister index builds that fail during set up"

This reverts commit b6104329abef8833da8197057f25029489570a02.

GitOrigin-RevId: a241614e28cde303ec97ce552faf27f59f276019
Branch: v6.0
https://github.com/mongodb/mongo/commit/6c636fbd5a5f5df20858b58eb6490b62e89c26c1

Comment by Githook User [ 03/Jan/24 ]

Author:

{'name': 'Shin Yee Tan', 'email': 'shinyee.tan@mongodb.com', 'username': 'shinyeet'}

Message: SERVER-82967 Unregister index builds that fail during set up

(cherry picked from commit 21a401277ca9bfe07d309bea613eae4165abc70e)
Branch: v7.2
https://github.com/mongodb/mongo/commit/bb271f981b28c9c25b8444e66b4e838da9eb4336

Comment by Githook User [ 15/Dec/23 ]

Author:

{'name': 'Shin Yee Tan', 'email': 'shinyee.tan@mongodb.com', 'username': 'shinyeet'}

Message: SERVER-82967 Unregister index builds that fail during set up

(cherry picked from commit 21a401277ca9bfe07d309bea613eae4165abc70e)

GitOrigin-RevId: b6104329abef8833da8197057f25029489570a02
Branch: v6.0
https://github.com/mongodb/mongo/commit/cd5239d6656057f024bbff9fce729e04cd87ac54

Comment by Githook User [ 13/Dec/23 ]

Author:

{'name': 'Shin Yee Tan', 'email': 'shinyee.tan@mongodb.com', 'username': 'shinyeet'}

Message: SERVER-82967 Unregister index builds that fail during set up

(cherry picked from commit 21a401277ca9bfe07d309bea613eae4165abc70e)

GitOrigin-RevId: 3b39a7ec27afbdff7f9941879ca01cd4ceb610a7
Branch: v7.0
https://github.com/mongodb/mongo/commit/36fc4768d0c4c7bf7b6ed0409551282af31ac2b2

Comment by Githook User [ 28/Nov/23 ]

Author:

{'name': 'Shin Yee Tan', 'email': 'shinyee.tan@mongodb.com', 'username': 'shinyeet'}

Message: SERVER-82967 Unregister index builds that fail during set up
Branch: master
https://github.com/mongodb/mongo/commit/21a401277ca9bfe07d309bea613eae4165abc70e

Generated at Thu Feb 08 06:50:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.