[SERVER-58280] initial sync hangs on hiding dropped index when index builds are active
Created: 06/Jul/21  Updated: 29/Oct/23  Resolved: 13/Jul/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.2, 4.4.9, 5.1.0-rc0

Type: Improvement
Priority: Major - P3
Reporter: Benety Goh
Assignee: Benety Goh
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-46659 Make initial sync work with two phase... Closed
is related to SERVER-55008 Only abort two-phase index builds whe... Closed
is related to SERVER-56019 timeseries_index.js hangs in burn_in_... Closed
Backwards Compatibility: Fully Compatible
Backport Requested: v5.0, v4.4
Sprint: Execution Team 2021-07-26
Participants:

 Description   

During initial sync, index builds that are still in progress on the sync source during the collection cloning phase are started, but not completed, on the initial syncing node. After the collections are cloned, if any conflicting operation is applied to the collection before the index build is committed or aborted, initial sync interrupts the index build, on the understanding that it will be restarted or aborted when the corresponding index build commit or abort operation is processed.

The current list of conflicting operations, defined in SERVER-46659, includes collection and index drops as well as collection renames. However, we should consider adding index hide/unhide, which are replicated as collMod commands, to this list so that initial sync will interrupt the index builds before applying the collMod command.
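As a rough illustration of the proposed change, the check for conflicting operations can be modeled as a set-membership test on the command name of an oplog entry. This is a minimal sketch, not the server's implementation: the function names, the set names, and the simplified oplog entry layout are assumptions made for this example. Only the command names (drop, dropIndexes, renameCollection, collMod) and the collMod index/hidden fields correspond to real MongoDB commands.

```python
# Hypothetical model of the conflicting-operation check. The identifiers
# below are illustrative and are not taken from the MongoDB code base.

# Conflicting operations per SERVER-46659: collection drops, index drops,
# and collection renames.
CONFLICTING_BEFORE = frozenset({"drop", "dropIndexes", "renameCollection"})

# Proposed list: index hide/unhide replicate as collMod, so add collMod.
CONFLICTING_AFTER = CONFLICTING_BEFORE | {"collMod"}


def command_name(oplog_entry):
    """Return the command name of a simplified command oplog entry, or None."""
    if oplog_entry.get("op") != "c":
        return None  # not a command entry (e.g. an insert)
    # In this simplified layout the first key of 'o' is the command name.
    return next(iter(oplog_entry["o"]), None)


def must_interrupt_index_builds(oplog_entry, conflicting=CONFLICTING_AFTER):
    """True if index builds should be interrupted before applying the entry."""
    return command_name(oplog_entry) in conflicting


# Simplified oplog entry for hiding index a_1 (assumed layout).
hide_a_1 = {
    "op": "c",
    "ns": "test.$cmd",
    "o": {"collMod": "coll", "index": {"name": "a_1", "hidden": True}},
}
insert_doc = {"op": "i", "ns": "test.coll", "o": {"_id": 1, "a": 1}}

print(must_interrupt_index_builds(hide_a_1, CONFLICTING_BEFORE))  # old list
print(must_interrupt_index_builds(hide_a_1))                      # proposed list
print(must_interrupt_index_builds(insert_doc))
```

With the old list the hide entry is not treated as conflicting, so the index build is left running. With the proposed list it is interrupted before the collMod is applied.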

The rationale for interrupting index builds during initial sync for hide/unhide index operations can be illustrated by the following sequence of operations on the sync source:

On sync source,

  • create index a_1
  • hide index a_1 <--- startTimestamp for initial sync
  • unhide index a_1
  • drop index a_1
  • create index a_1 <---- collection cloner starts index build for second a_1 index

On initial syncing node,

  • clones collection and starts index build for a_1
  • completes collection cloning, starts applying oplog entries from 'startTimestamp'
  • applying collMod for hiding index a_1 fails with BackgroundOperationInProgressForCollection
  • does not interrupt index build for a_1
  • waits for index builds to complete for collection, which leads to the initial syncing node hanging.
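The sequence above can be replayed in a toy model to show why the node hangs and how interrupting index builds before collMod avoids it. This is a sketch under stated assumptions: the operation tuples, the function name, and the trailing commit_build step (standing in for the index build commit operation that would eventually arrive from the sync source) are hypothetical and do not reflect actual server code or oplog formats.

```python
def apply_from_start_timestamp(ops, collmod_interrupts_builds):
    """Replay simplified oplog operations on an initial syncing node.

    The node begins with an index build for a_1 that the collection cloner
    started but did not complete. Returns "done" or "hung".
    """
    active_builds = {"a_1"}
    for op, index in ops:
        if op in ("hide", "unhide"):  # both replicate as collMod
            if active_builds and collmod_interrupts_builds:
                # Proposed behavior: interrupt builds before applying collMod.
                active_builds.clear()
            if active_builds:
                # collMod fails with BackgroundOperationInProgressForCollection
                # and the node waits for a build that only a later oplog
                # entry could finish, so it never makes progress.
                return "hung"
        elif op == "drop":
            # Index drops already interrupt builds (SERVER-46659).
            active_builds.discard(index)
        elif op == "start_build":
            active_builds.add(index)
        elif op == "commit_build":
            active_builds.discard(index)
    return "hung" if active_builds else "done"


# Operations from 'startTimestamp' onward; commit_build is an assumed final step.
ops = [
    ("hide", "a_1"),
    ("unhide", "a_1"),
    ("drop", "a_1"),
    ("start_build", "a_1"),
    ("commit_build", "a_1"),
]

print(apply_from_start_timestamp(ops, collmod_interrupts_builds=False))
print(apply_from_start_timestamp(ops, collmod_interrupts_builds=True))
```

In this model the first call stops at the hide operation because the cloner's build is still active, while the second call interrupts that build and applies the remaining operations.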


 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 10/Aug/21 ]

Author: Benety Goh (benety@mongodb.com, username: benety)

Message: SERVER-58280 initial sync aborts index builds before applying collMod

This fixes initial sync issues during oplog application when indexes are
being hidden/unhidden - these index operations are encoded as collMod
commands.

(cherry picked from commit 8922a0ea148c2d883ce724190e0d20a2e2bfd253)
Branch: v4.4
https://github.com/mongodb/mongo/commit/e1802cd162d43c2cf3c647a7f3cdfb93c149eb90

Comment by Githook User [ 09/Aug/21 ]

Author: Benety Goh (benety@mongodb.com, username: benety)

Message: SERVER-58280 add base test for handling collMod during initial sync on collections with index builds

The initial version of this test is identical to initial_sync_aborts_two_phase_index_builds.js.

(cherry picked from commit 7dc0e9f922fc025f8b6e6b7962398b9bd41f9570)
Branch: v4.4
https://github.com/mongodb/mongo/commit/58fb882fea697627e40a8d5cca9a34785c3e9659

Comment by Benety Goh [ 09/Aug/21 ]

4.4 backport may be affected by SERVER-55008.

Comment by Githook User [ 23/Jul/21 ]

Author: Benety Goh (benety@mongodb.com, username: benety)

Message: SERVER-58280 initial sync aborts index builds before applying collMod

This fixes initial sync issues during oplog application when indexes are
being hidden/unhidden - these index operations are encoded as collMod
commands.

(cherry picked from commit 8922a0ea148c2d883ce724190e0d20a2e2bfd253)
Branch: v5.0
https://github.com/mongodb/mongo/commit/79e845bce5fe7572af038fa1dcec48a4b932a6b2

Comment by Githook User [ 23/Jul/21 ]

Author: Benety Goh (benety@mongodb.com, username: benety)

Message: SERVER-58280 add base test for handling collMod during initial sync on collections with index buidls

The initial version of this test is identical to initial_sync_aborts_two_phase_index_builds.js.

(cherry picked from commit 7dc0e9f922fc025f8b6e6b7962398b9bd41f9570)
Branch: v5.0
https://github.com/mongodb/mongo/commit/ec7c30dbec28d9e6afbd9f6adf0cd279618fd315

Comment by Githook User [ 12/Jul/21 ]

Author: Benety Goh (benety@mongodb.com, username: benety)

Message: SERVER-58280 initial sync aborts index builds before applying collMod

This fixes initial sync issues during oplog application when indexes are
being hidden/unhidden - these index operations are encoded as collMod
commands.
Branch: master
https://github.com/mongodb/mongo/commit/8922a0ea148c2d883ce724190e0d20a2e2bfd253

Comment by Githook User [ 12/Jul/21 ]

Author: Benety Goh (benety@mongodb.com, username: benety)

Message: SERVER-58280 add base test for handling collMod during initial sync on collections with index buidls

The initial version of this test is identical to initial_sync_aborts_two_phase_index_builds.js.
Branch: master
https://github.com/mongodb/mongo/commit/7dc0e9f922fc025f8b6e6b7962398b9bd41f9570
