[SERVER-14746] IndexRebuilder should only restart index builds initiated internally
Created: 31/Jul/14  Updated: 06/Dec/22  Resolved: 03/Mar/20

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Replication
Affects Version/s: 2.5.0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Mathias Stearn
Assignee: Backlog - Storage Execution Team
Resolution: Duplicate
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-43692 enable two phase index builds by default Closed
Related
related to SERVER-43642 Remove IndexBuilder in favor of Index... Closed
related to SERVER-2771 Background index builds on replica se... Closed
related to SERVER-8536 reenable IndexRebuilder for backgroun... Closed
is related to SERVER-39086 Move startup recovery index creation ... Closed
is related to SERVER-39290 Remove startup index recovery redunda... Closed
Tested
Assigned Teams:
Storage Execution
Operating System: ALL

 Description   

The IndexRebuilder should not restart user-initiated index builds.

Scenario:
1) User starts building a unique index on {a:1} on the current primary, node A.
2) Node A dies suddenly (kill -9 or power failure, etc) and node B becomes the new primary.
3) Node A restarts and the IndexRebuilder restarts building of the index.

At this point, node A has an index that does not exist on the primary and never will. This would be less of a problem for a non-unique index, but because this index is unique, an insert can succeed on the primary and then fail on node A, breaking replication.
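
The divergence in the scenario above can be illustrated with a toy model. This is only a sketch: `ToyNode`, `DuplicateKeyError`, and `replicate` are hypothetical names invented for illustration, not server code, and "replication" here is just replaying the same insert on the second node.

```python
class DuplicateKeyError(Exception):
    """Raised by the toy model when a unique constraint is violated."""


class ToyNode:
    """A node is a list of documents plus the fields it enforces as unique."""

    def __init__(self, name):
        self.name = name
        self.docs = []
        self.unique_fields = set()

    def insert(self, doc):
        # Reject the insert if any locally enforced unique field collides.
        for field in self.unique_fields:
            if field in doc and any(d.get(field) == doc[field] for d in self.docs):
                raise DuplicateKeyError(f"{self.name}: duplicate value for {field!r}")
        self.docs.append(doc)


def replicate(primary, secondary, doc):
    """Apply a write on the primary, then replay it on the secondary."""
    primary.insert(doc)
    secondary.insert(doc)


node_a = ToyNode("A")
node_b = ToyNode("B")
for node in (node_a, node_b):
    node.insert({"a": 1})  # data both nodes held before the failover

# Step 3 of the scenario: A restarts and completes a unique index on "a"
# that the new primary, B, never builds.
node_a.unique_fields.add("a")

try:
    replicate(node_b, node_a, {"a": 1})  # accepted by B, rejected by A
    replication_broken = False
except DuplicateKeyError:
    replication_broken = True
```

In this model the second `{"a": 1}` lands on B but not on A, which is the kind of divergence the ticket describes.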

I think the right solution is to only rebuild indexes where we would not call logOp on completion.
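
As a rough sketch of that proposal, the restart decision could be expressed as a filter over interrupted builds. The `InterruptedBuild` record and its `would_log_op` flag are assumptions made for illustration; the ticket does not say how the server would track whether a build calls logOp on completion.

```python
from dataclasses import dataclass


@dataclass
class InterruptedBuild:
    index_name: str
    # Hypothetical flag: True if finishing this build would have produced
    # an oplog entry (i.e. the build was user-initiated on a primary).
    would_log_op: bool


def builds_to_restart(interrupted):
    """Keep only builds whose completion would not be written to the oplog."""
    return [build for build in interrupted if not build.would_log_op]


pending = [
    InterruptedBuild("a_1", would_log_op=True),   # user-initiated: skip
    InterruptedBuild("b_1", would_log_op=False),  # internally initiated: restart
]
restart = builds_to_restart(pending)
```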



 Comments   
Comment by Benety Goh [ 03/Mar/20 ]

With two-phase index builds enabled by default in SERVER-43692, the scenario in the description should no longer be possible, because node B begins building the unique index as soon as it receives the startIndexBuild oplog entry from node A. When node A restarts, it will wait for the commitIndexBuild oplog entry from node B before finalizing its index build.
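
The ordering described in this comment can be sketched as a small state machine. Only the oplog entry names `startIndexBuild` and `commitIndexBuild` come from the comment; `ToyIndexBuild` and its states are hypothetical and greatly simplified.

```python
class ToyIndexBuild:
    """One node's view of a single two-phase index build (toy model)."""

    def __init__(self):
        self.state = "not_started"

    def apply(self, oplog_op):
        # A build only advances in response to replicated oplog entries.
        if oplog_op == "startIndexBuild" and self.state == "not_started":
            self.state = "building"
        elif oplog_op == "commitIndexBuild" and self.state == "building":
            self.state = "committed"
        else:
            raise ValueError(f"unexpected {oplog_op} in state {self.state}")

    def restart(self):
        # A restarted node keeps its unfinished build but does not commit
        # on its own; the state is unchanged until a commit entry arrives.
        return self.state


node_a = ToyIndexBuild()
node_b = ToyIndexBuild()

# A (primary) starts the build; B sees the same entry and starts building too.
for node in (node_a, node_b):
    node.apply("startIndexBuild")

# A crashes, B becomes primary, A comes back up.
node_a.restart()
states_after_restart = (node_a.state, node_b.state)

# B finishes and commits; A finalizes only after seeing that entry.
for node in (node_b, node_a):
    node.apply("commitIndexBuild")
states_after_commit = (node_a.state, node_b.state)
```

Because both nodes hold the index in the same state at each step, neither can end up enforcing a unique constraint the other lacks.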

Comment by Benety Goh [ 03/Mar/20 ]

The startup recovery logic was moved to the IndexBuildsCoordinator in SERVER-39086.

The IndexRebuilder was removed in SERVER-39290.

Comment by Daniel Gottlieb (Inactive) [ 15/May/18 ]

We expect this has been (mostly?) fixed in 3.7 on WiredTiger (specifically, all "KV" storage engines) as part of SERVER-33359.

Comment by Mathias Stearn [ 20/Aug/14 ]

This was accidentally closed instead of SERVER-14765. The work on this hasn't been done yet.

Comment by Mathias Stearn [ 13/Aug/14 ]

Would need custom backport.

Comment by Githook User [ 13/Aug/14 ]

Author: Mathias Stearn (RedBeard0531) <mathias@10gen.com>

Message: SERVER-13951 Split index building in to UnitOfWork-sized stages

All index builds now go through the MultiIndexBuilder as its API was already
close to ideal. The following tickets have also been addressed by this commit:

SERVER-14710 Remove dropDups
SERVER-12309 Cloner build indexes in parallel
SERVER-14737 Initial sync uses bg index building
SERVER-9135 fast index build for initial sync
SERVER-2747 can't kill index in phase 2
SERVER-8917 check error code rather than assuming all errors are dups
SERVER-14820 compact enforces unique while claiming not to
SERVER-14746 IndexRebuilder should be foreground and fail fatally
Branch: master
https://github.com/mongodb/mongo/commit/00913e47de5aced5267e44e82ac9e976bbaac089
