[SERVER-46397] concurrent dropIndexes on a primary can stall replication on secondaries Created: 25/Feb/20  Updated: 29/Oct/23  Resolved: 11/Mar/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.4.0-rc0, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-46594 Enable commit quorum for concurrency ... Closed
Related
related to SERVER-46595 createIndexes command fails to abort ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Execution Team 2020-03-09, Execution Team 2020-03-23
Participants:
Case:
Linked BF Score: 50

 Description   

If there are two concurrent "dropIndexes" commands on a collection (one on each index A and B), this can lead to the following order of operations on the secondary:

  • startIndexBuild A
  • startIndexBuild B
  • commitIndexBuild A
  • dropIndexes A -> blocks due to concurrent index build B
  • abortIndexBuild B -> never reached because the previous operation blocks (this entry comes from a separate dropIndexes command)

The blocking is caused by the assertion that no background operations are in progress when processing the dropIndexes oplog entry for "a_1". This throws and blocks oplog application. dropIndexesForApplyOps assumes that no background operations are in progress, which is not always the case. The non-applyOps version of dropIndexes has a similar assertion which only gets executed when no index builds were aborted.
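
The ordering above can be reproduced with a small toy model. This is only an illustrative sketch, not the server's code: ToySecondary, apply_oplog and BackgroundOperationInProgress are hypothetical names, and the real secondary blocks oplog application indefinitely, whereas the sketch just reports the position of the entry that cannot be applied.

```python
from dataclasses import dataclass, field


class BackgroundOperationInProgress(Exception):
    """Hypothetical error: dropIndexes saw an index build still in progress."""


@dataclass
class ToySecondary:
    # Index builds that have started but not yet committed or aborted.
    in_progress: set = field(default_factory=set)
    # Indexes whose builds have committed.
    ready: set = field(default_factory=set)

    def apply(self, op, index):
        if op == "startIndexBuild":
            self.in_progress.add(index)
        elif op == "commitIndexBuild":
            self.in_progress.remove(index)
            self.ready.add(index)
        elif op == "abortIndexBuild":
            self.in_progress.discard(index)
        elif op == "dropIndexes":
            # Models the assertion from the description: no background
            # operations may be in progress on the collection.
            if self.in_progress:
                raise BackgroundOperationInProgress(sorted(self.in_progress))
            self.ready.discard(index)
        else:
            raise ValueError(f"unknown op: {op}")


def apply_oplog(node, oplog):
    """Apply entries strictly in order.

    Returns the position of the first entry that cannot be applied,
    or None if every entry was applied.
    """
    for position, (op, index) in enumerate(oplog):
        try:
            node.apply(op, index)
        except BackgroundOperationInProgress:
            return position
    return None


# The order of operations from the description.
OPLOG = [
    ("startIndexBuild", "A"),
    ("startIndexBuild", "B"),
    ("commitIndexBuild", "A"),
    ("dropIndexes", "A"),
    ("abortIndexBuild", "B"),
]

node = ToySecondary()
stalled_at = apply_oplog(node, OPLOG)
print(stalled_at, OPLOG[stalled_at], node.in_progress)
```

In this model the applier stops at the dropIndexes entry for A while build B is still in progress, and the abortIndexBuild entry that would clear B sits behind it in the log, so the node cannot make progress on its own.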

What I think might be the root causes of this failure:



 Comments   
Comment by Githook User [ 11/Mar/20 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-46397 Only report an index build as aborted if it is currently aborting, not committing

This expands the concurrency control features used by two-phase index
builds to standalone nodes and single-phase index builds so that
concurrent commits and aborts behave correctly.

(cherry picked from commit 2fd22e51e07c93a78a67cbb8d01289b96cb7f60a)

SERVER-46397 add missing 'break'

(cherry picked from commit 3848ef2467665a7c7756eb19b42cb0f523c03535)

SERVER-46397 When an index build abort is received after a stepDown, wait for another signal

Reinstates a logical condition that was removed by
2fd22e51e07c93a78a67cbb8d01289b96cb7f60a

Adds an IndexBuildAction specific to committing a single-phase build

(cherry picked from commit ad244f716cb7478c990a79b196f4877975c74613)
Branch: v4.4
https://github.com/mongodb/mongo/commit/6941234fee919185c2545a8170e03cf37e8b3d41

Comment by Githook User [ 11/Mar/20 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-46397 When an index build abort is received after a stepDown, wait for another signal

Reinstates a logical condition that was removed by
2fd22e51e07c93a78a67cbb8d01289b96cb7f60a

Adds an IndexBuildAction specific to committing a single-phase build
Branch: master
https://github.com/mongodb/mongo/commit/ad244f716cb7478c990a79b196f4877975c74613

Comment by Githook User [ 10/Mar/20 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-46397 add missing 'break'
Branch: master
https://github.com/mongodb/mongo/commit/3848ef2467665a7c7756eb19b42cb0f523c03535

Comment by Githook User [ 10/Mar/20 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-46397 Only report an index build as aborted if it is currently aborting, not committing

This expands the concurrency control features used by two-phase index
builds to standalone nodes and single-phase index builds so that
concurrent commits and aborts behave correctly.
Branch: master
https://github.com/mongodb/mongo/commit/2fd22e51e07c93a78a67cbb8d01289b96cb7f60a
