[SERVER-33359] Have RTT storage engines manage rolling back incomplete index builds. Created: 15/Feb/18  Updated: 29/Oct/23  Resolved: 22/Feb/18

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: None
Fix Version/s: 3.7.3

Type: Task Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: rollback-functional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-11381 Docs for SERVER-33359: Have RTT stora... Closed
Duplicate
is duplicated by SERVER-32982 Remove rolled back index builds after... Closed
is duplicated by SERVER-33322 reconcileCatalogAndIdents() should al... Closed
Related
Backwards Compatibility: Fully Compatible
Sprint: Repl 2018-02-26
Participants:

 Description   

In an RTT world, nodes must retain enough history to undo writes back to the replica set commit point. However, replication does not always communicate enough information to reliably know what the state should be at the commit point.

Specifically, the view of a database while an index build is in progress is not sufficient to know whether the index creation had committed or had been rolled back. The lifetime of an index build on a primary:

  1. Start an index build at time S. This writes an entry to the local catalog with a `ready: false` flag. S is not known to any other node.
  2. Build the index
  3. Complete the index build at time F. This atomically commits the index, setting `ready: true` and writing an oplog entry with time `F`.

An index build on a secondary:

  1. Observe an oplog entry to create an index with time `F`. Add the `ready: false` entry to the catalog.
  2. Build the index
  3. Complete the index build and set the index to `ready: true`. A foreground index build finishes at time `F`. A background index build finishes at time `BF`.

Storage knows only about the existence of the catalog entry and the value of its `ready` flag when deciding whether the index build should be completed. In both cases, if the entry does not exist, the index should be removed. If the entry does exist with `ready: true`, it should remain.

However, in the primary case, an index entry that exists but is `ready: false` should be rolled back, because this state represents a time before the primary wrote the oplog entry. A secondary given the same inputs must keep the index, because the oplog entry was written before the index build started. Losing the index would be a bug.
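
The ambiguity can be made concrete with a small sketch. The names used here (`CatalogEntry`, `Action`, `required_action`) are hypothetical and are not MongoDB identifiers; the function only encodes the outcomes described above and takes as an argument the one piece of knowledge (where the build started) that storage does not have.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP = "keep"
    REMOVE = "remove"


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical model of what storage can see for one index."""
    ready: bool


def required_action(entry: Optional[CatalogEntry], built_on_primary: bool) -> Action:
    """Outcome described in this ticket, given knowledge storage lacks."""
    if entry is None:
        return Action.REMOVE  # no catalog entry: remove the index
    if entry.ready:
        return Action.KEEP    # committed index: keep it
    # Entry exists with ready: false. The answer depends on where the build began.
    return Action.REMOVE if built_on_primary else Action.KEEP


unfinished = CatalogEntry(ready=False)
# Identical catalog input, different required outcome.
primary_outcome = required_action(unfinished, built_on_primary=True)
secondary_outcome = required_action(unfinished, built_on_primary=False)
```

Because `primary_outcome` and `secondary_outcome` differ for the same catalog entry, the catalog state alone cannot drive the decision.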

This ticket is for index builds to communicate enough information to the storage engine to disambiguate the decision. The storage engine may choose to persist this information as it sees fit. It's only expected to be of value for RTT storage engines.

Specifically, what will be communicated is whether the index to be built is a background index build being started on a secondary. Foreground index builds on a secondary will not show as "in progress" following a call to RTT: either the index is not in the catalog, or the entry exists and the index is usable.
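
As a rough illustration of how one extra bit resolves the decision, the sketch below stores a flag alongside each catalog entry. Everything here is an assumption made for illustration: `IndexEntry`, `is_background_secondary_build`, and `reconcile_after_rollback` are hypothetical names, and the actual storage engine is free to persist the information however it sees fit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IndexEntry:
    ready: bool
    # Hypothetical flag: True only for a background build started on a secondary.
    is_background_secondary_build: bool = False


def reconcile_after_rollback(catalog: Dict[str, IndexEntry]) -> List[str]:
    """Drop unfinished indexes unless flagged; return the names that are kept
    but still unfinished."""
    kept_unfinished = []
    for name in list(catalog):  # copy the keys so entries can be deleted safely
        entry = catalog[name]
        if entry.ready:
            continue  # committed index: keep it as-is
        if entry.is_background_secondary_build:
            kept_unfinished.append(name)  # must not be lost
        else:
            del catalog[name]  # started before any oplog entry: roll it back
    return kept_unfinished


catalog = {
    "a_1": IndexEntry(ready=True),
    "b_1": IndexEntry(ready=False),
    "c_1": IndexEntry(ready=False, is_background_secondary_build=True),
}
pending = reconcile_after_rollback(catalog)
```

In this example `b_1` is removed, while `c_1` stays in the catalog and is reported as still unfinished.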



 Comments   
Comment by Daniel Gottlieb (Inactive) [ 22/Feb/18 ]

--noIndexBuildRetry is no longer compatible with --replSet.

  • --noIndexBuildRetry is still available to run in standalone mode to quickly examine data without requiring unfinished indexes to be built.
  • This ticket includes changes so that indexes need to be rebuilt less often after a crash. For example, an initial syncing node that crashes and is restarted should no longer attempt to rebuild indexes from the previous initial sync, only for them to be discarded.
    • Specifically, only background index builds initiated while the node is in a "SECONDARY" state will be rebuilt on startup following a crash.
Comment by Githook User [ 22/Feb/18 ]

Author: Daniel Gottlieb (dgottlieb) <daniel.gottlieb@mongodb.com>

Message: SERVER-33359: Allow RTT storage engines to manage indexes on rollback.
Branch: master
https://github.com/mongodb/mongo/commit/80b1a54a112b5853d0903ae424ffc5e3bb289077
