[SERVER-49249] initial sync fails invariant during oplog application due to conflict with in-progress index build
Created: 01/Jul/20  Updated: 29/Oct/23  Resolved: 19/Nov/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug
Priority: Major - P3
Reporter: Benety Goh
Assignee: Benety Goh
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-11-30
Participants:
Linked BF Score: 10

 Description   

Initial sync may fail an invariant in MultiIndexBlock::init() because of a conflict between single-phase and two-phase index builds that use the same index name during the oplog application phase. This issue is believed to have been introduced by SERVER-47182, which affects 4.5 only.

In SERVER-47182 (post-4.4), we started using an alternate code path to apply createIndexes on secondaries (which also covers nodes running initial sync).

This race may happen when an index with the same name has been built using both the single-phase and two-phase methods on the collection that is being initial synced. The following order of operations on the collection may lead to the invariant failure:

  • secondary: initial sync starts
  • primary: create (empty) collection
  • primary: create index X (single-phased)
  • primary: insert documents
  • primary: drop index X
  • primary: start building index X (two-phased)
  • secondary: clones collection with index X still building
  • primary: complete building index X (two-phased)
  • secondary: completes collection cloning and starts applying oplog entries to catch up
  • secondary: applies first index X oplog entry (createIndex) and hits invariant because the two-phase index build for X is still not committed.
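
The ordering above can be modeled with a toy simulation. This is a minimal sketch, not server code: `ToyCollection`, `InvariantFailure`, `apply_create_index_unfiltered`, and `replay` are hypothetical names, the oplog tuples are illustrative labels only, and the invariant is modeled simply as "no index with this name may already be tracked, whether ready or still building".

```python
class InvariantFailure(Exception):
    """Stands in for a failed server invariant (hypothetical)."""


class ToyCollection:
    """Toy model of the index catalog on the node running initial sync."""

    def __init__(self):
        # index name -> "building" (two-phase, not yet committed) or "ready"
        self.indexes = {}

    def clone_from(self, source_indexes):
        # Cloning copies whatever index state the sync source has right now,
        # including index builds that are still in progress.
        self.indexes = dict(source_indexes)

    def apply_create_index_unfiltered(self, name):
        # Models a code path that does not filter out in-progress builds.
        if name in self.indexes:
            raise InvariantFailure(
                f"index {name!r} is already {self.indexes[name]}"
            )
        self.indexes[name] = "ready"


def replay():
    primary = {}   # index state on the primary
    oplog = []     # entries the secondary must apply after cloning
    secondary = ToyCollection()

    # secondary: initial sync starts; primary: create (empty) collection
    primary["X"] = "ready"               # primary: create index X (single-phased)
    oplog.append(("createIndex", "X"))
    # primary: insert documents (not modeled)
    primary.pop("X")                     # primary: drop index X
    oplog.append(("dropIndex", "X"))
    primary["X"] = "building"            # primary: start building X (two-phased)
    oplog.append(("startBuild", "X"))
    secondary.clone_from(primary)        # secondary: clones with X still building
    primary["X"] = "ready"               # primary: complete building X
    oplog.append(("commitBuild", "X"))

    # secondary: applies the first index X entry and trips the invariant
    _, name = oplog[0]
    secondary.apply_create_index_unfiltered(name)
```

Calling `replay()` raises `InvariantFailure` in this model, because the cloned catalog already holds an uncommitted build for X when the first createIndex entry is applied.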

This is not an issue in 4.4 because we would have filtered out existing indexes (including in-progress index builds) and ignored the createIndexes entry.

The new code path introduced in SERVER-47182 does not filter out in-progress index builds, leading to the invariant failure observed in this ticket.
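
The filtering behavior described for 4.4 can be sketched as follows. This is an illustration under stated assumptions rather than the server's implementation: `specs_to_build` is a hypothetical helper, and indexes are represented by name only.

```python
def specs_to_build(requested, ready, in_progress):
    """Return the requested index names that still need a single-phase build.

    Hypothetical helper: names that are already ready, or that belong to an
    index build still in progress, are filtered out so the createIndexes
    entry becomes a no-op for them.
    """
    skip = set(ready) | set(in_progress)
    return [name for name in requested if name not in skip]
```

With `requested=["X"]` and `in_progress=["X"]`, nothing is left to build, so the entry is ignored instead of reaching the invariant. Dropping `in_progress` from the `skip` set corresponds to the gap this ticket describes.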



 Comments   
Comment by Githook User [ 19/Nov/20 ]

Author: Benety Goh <benety@mongodb.com> (username: benety)

Message: SERVER-49249 initial sync skips single-phase index build if there is an index build already in progress
Branch: master
https://github.com/mongodb/mongo/commit/77554e9e4fd18811d6df84d8934c888814d034ec

Comment by Githook User [ 19/Nov/20 ]

Author: Benety Goh <benety@mongodb.com> (username: benety)

Message: SERVER-49249 add js test single phase index build conflict during initial sync
Branch: master
https://github.com/mongodb/mongo/commit/fd5f9c3fb195b78f59b7abcfc7a81318f9a5ee78
