[SERVER-49076] Add rollback fuzzer suites to resumable index build variant Created: 24/Jun/20  Updated: 29/Oct/23  Resolved: 25/Aug/20

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Task Priority: Major - P3
Reporter: Samyukta Lanka Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-50446 make index builds non-resumable when ... Closed
Related
is related to SERVER-45001 enable commit quorum for two phase in... Closed
is related to SERVER-48476 resumable index build should use majo... Closed
is related to SERVER-49774 Enable rollback testing for resumable... Closed
is related to SERVER-50108 remove enableIndexBuildCommitQuorum s... Backlog
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2020-08-24, Execution Team 2020-09-07
Participants:

 Description   

SERVER-48476 is adding a wait during index builds that can cause a hang in some tests that use the rollback test fixture (which include the rollback fuzzer) when resumable index builds are enabled. As a part of this ticket, we should resolve these hangs.



 Comments   
Comment by Githook User [ 25/Aug/20 ]

Author: Gregory Noma <gregory.noma@gmail.com> (username: gregorynoma)

Message: SERVER-49076 Add rollback fuzzer suites to resumable index builds variant
Branch: master
https://github.com/mongodb/mongo/commit/c8b7857db7d31c5b80dba7dd012223a108a95cfd

Comment by Benety Goh [ 20/Aug/20 ]

This ticket may not be feasible, for the same reasons that led SERVER-45001 to configure the rollback fuzzers to opt out of the commit quorum.

There is work on the backlog in SERVER-50108 to remove the opt-out flag for commit quorum.

Comment by Samyukta Lanka [ 24/Jun/20 ]

One cause of hangs in the rollback test fixture happens when replication is stopped on the tiebreaker node and the rollback node has completed rollback. The test fixture waits for the rollback node to catch up to the primary here. However, the rollback node is stuck while applying a commitIndexBuild oplog entry because the wait added by SERVER-48476 during startIndexBuild has not completed (because the majority commit point cannot advance until the tiebreaker node reconnects to the primary and replication is restarted). See these logs for more details.
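The dependency described above can be illustrated with a toy model. This is only a sketch and not MongoDB code: the Node class, the integer timestamps, and both helper functions are hypothetical. It also assumes that the blocked rollback node's reported applied position stays behind the timestamp it is waiting on.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    applied: int  # hypothetical integer timestamp of the last applied oplog entry


def majority_commit_point(nodes):
    """Return the highest timestamp that a majority of nodes have applied."""
    applied = sorted((n.applied for n in nodes), reverse=True)
    majority = len(nodes) // 2 + 1
    return applied[majority - 1]


def wait_satisfied(nodes, awaited_ts):
    """Model of the wait: it completes once awaited_ts is majority committed."""
    return majority_commit_point(nodes) >= awaited_ts


# Assumed scenario: the wait needs timestamp 10 to be majority committed.
AWAITED_TS = 10
primary = Node("primary", applied=11)
rollback_node = Node("rollback", applied=8)    # blocked while applying commitIndexBuild
tiebreaker = Node("tiebreaker", applied=8)     # replication stopped by the fixture
nodes = [primary, rollback_node, tiebreaker]

# Only the primary is past timestamp 10, so the wait cannot complete.
blocked_before = not wait_satisfied(nodes, AWAITED_TS)

# Once replication restarts on the tiebreaker, it catches up to the primary
# and the majority commit point can advance past the awaited timestamp.
tiebreaker.applied = primary.applied
unblocked_after = wait_satisfied(nodes, AWAITED_TS)

print(blocked_before, unblocked_after)
```

In this model the fixture's wait for the rollback node to catch up and the rollback node's wait on the majority commit point depend on each other, and only restarting replication on the tiebreaker breaks the cycle.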

Generated at Thu Feb 08 05:18:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.