-
Type: Bug
-
Resolution: Won't Fix
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Fully Compatible
-
ALL
-
Repl 2017-10-02, Repl 2017-10-23
There is a race in this test between the thread started in stepDown_nonBlocking() and the call to ReplicationCoordinator::stepDown():
replication_coordinator_impl_test.cpp
TEST_F(StepDownTest, OnlyOneStepDownCmdIsAllowedAtATime) {
    OpTime optime1(Timestamp(100, 1), 1);
    OpTime optime2(Timestamp(100, 2), 1);
    // No secondary is caught up
    auto repl = getReplCoord();
    repl->setMyLastAppliedOpTime(optime2);
    repl->setMyLastDurableOpTime(optime2);
    ASSERT_OK(repl->setLastAppliedOptime_forTest(1, 1, optime1));
    ASSERT_OK(repl->setLastAppliedOptime_forTest(1, 2, optime1));
    simulateSuccessfulV1Election();
    ASSERT_TRUE(getReplCoord()->getMemberState().primary());
    // Step down where the secondary actually has to catch up before the stepDown can succeed.
    // On entering the network, _stepDownContinue should cancel the heartbeats scheduled for
    // T + 2 seconds and send out a new round of heartbeats immediately.
    // This makes it unnecessary to advance the clock after entering the network to process
    // the heartbeat requests.
    auto result = stepDown_nonBlocking(false, Seconds(10), Seconds(60));
    // Now while the first stepdown request is waiting for secondaries to catch up, attempt another
    // stepdown request and ensure it fails.
    const auto opCtx = makeOperationContext();
    auto status = getReplCoord()->stepDown(opCtx.get(), false, Seconds(10), Seconds(60));
    ASSERT_EQUALS(ErrorCodes::ConflictingOperationInProgress, status);
    // Now ensure that the original stepdown command can still succeed.
    catchUpSecondaries(optime2);
    ASSERT_OK(*result.second.get());
    ASSERT_TRUE(repl->getMemberState().secondary());
}
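If the main test thread calls stepDown() before the thread started by stepDown_nonBlocking() has put the TopologyCoordinator into the attempting-to-step-down state, the main thread's call is presumably not rejected with ConflictingOperationInProgress, and this test will block.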
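One way to close this kind of race is to have the test wait until the background attempt has registered itself before it issues the second call. The following is a minimal, self-contained sketch of that idea, not MongoDB code: ToyCoordinator, waitUntilAttemptingStepDown(), catchUp(), and runSynchronizedScenario() are hypothetical names invented for illustration, and the toy stepDown() only models "one attempt at a time, blocked until secondaries catch up".

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>

// Hypothetical stand-in for the real error codes; not MongoDB's ErrorCodes.
enum class Status { OK, ConflictingOperationInProgress };

// Toy model of a coordinator that allows only one stepdown attempt at a time.
class ToyCoordinator {
public:
    // Returns a conflict if an attempt is already in progress; otherwise
    // registers the attempt and blocks until catchUp() has been called.
    Status stepDown() {
        std::unique_lock<std::mutex> lk(_mutex);
        if (_attemptingStepDown) {
            return Status::ConflictingOperationInProgress;
        }
        _attemptingStepDown = true;
        _cv.notify_all();  // wake anyone waiting for the attempt to begin
        _cv.wait(lk, [this] { return _caughtUp; });
        _attemptingStepDown = false;
        return Status::OK;
    }

    // Test-only hook: blocks until some thread has registered a stepdown attempt.
    void waitUntilAttemptingStepDown() {
        std::unique_lock<std::mutex> lk(_mutex);
        _cv.wait(lk, [this] { return _attemptingStepDown; });
    }

    // Models the secondaries catching up, which lets the pending attempt finish.
    void catchUp() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _caughtUp = true;
        }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _attemptingStepDown = false;
    bool _caughtUp = false;
};

struct ScenarioResult {
    Status second;  // result of the conflicting call made by the main thread
    Status first;   // result of the original, asynchronous attempt
};

ScenarioResult runSynchronizedScenario() {
    ToyCoordinator coord;
    // First attempt runs on another thread, like stepDown_nonBlocking() in the test.
    std::future<Status> first =
        std::async(std::launch::async, [&coord] { return coord.stepDown(); });
    // Without this wait, the call below could win the race, become the
    // blocking attempt itself, and never reach catchUp().
    coord.waitUntilAttemptingStepDown();
    Status second = coord.stepDown();
    coord.catchUp();
    return ScenarioResult{second, first.get()};
}
```

With the wait in place, the second call in this toy model always observes the attempt in progress, so the outcome no longer depends on thread scheduling.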
is related to
- SERVER-28544 Stepdown command must take global lock in exclusive mode (Closed)
- SERVER-31341 Synchronize unit tests that wait for asynchronous stepdown attempts (Closed)