[SERVER-20985] Replica set primary unexpectedly stepping down Created: 16/Oct/15  Updated: 24/Nov/15  Resolved: 24/Nov/15

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.2.0-rc0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jeffrey Yemin Assignee: Eric Milkie
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Amazon Elastic Slaves on Jenkins

$ uname -a
Linux ip-10-67-163-68 4.1.7-15.23.amzn1.x86_64 #1 SMP Mon Sep 14 23:20:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Three member replica set, all running on the same host.

$ /mnt/jenkins/mongodb/master/master-nightly/bin/mongod -version
db version v3.2.0-rc1-pre-
git version: 9ef3d91b627cd912b0475150112212cf9604704f
OpenSSL version: OpenSSL 1.0.0-fips 29 Mar 2010
allocator: tcmalloc
modules: enterprise


Attachments: Zip Archive c-driver-test-logs.zip     Text File db27017.log     File db27017.log.2015-11-05T15-46-46     Text File db27018.log     File db27018.log.2015-11-05T15-46-47     Text File db27019.log     File db27019.log.2015-11-05T15-46-47    
Operating System: ALL
Sprint: Repl B (10/30/15), Repl C (11/20/15), Repl D (12/11/15)
Participants:

 Description   

Occasionally the primary of a replica set is stepping down and then quickly getting re-elected.

This is causing some driver tests to fail intermittently: the integration tests have no retry logic, and connections to the primary are closed on stepdown.
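
For illustration only, here is the kind of retry loop the integration tests are missing, written as a minimal pymongo sketch (the hosts, set name "repl0", and collection name are assumptions; the actual tests are in the C and Java drivers):

import time
from pymongo import MongoClient
from pymongo.errors import AutoReconnect

client = MongoClient("mongodb://localhost:27017,localhost:27018/?replicaSet=repl0")
coll = client.test.driver_tests

def insert_with_retry(doc, attempts=5):
    # On stepdown the connection to the old primary is closed, which pymongo
    # surfaces as AutoReconnect; pause briefly so a new primary can be elected.
    for _ in range(attempts):
        try:
            return coll.insert_one(doc)
        except AutoReconnect:
            time.sleep(1)
    raise RuntimeError("no primary available after %d attempts" % attempts)

insert_with_retry({"x": 1})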

Logs attached



 Comments   
Comment by Eric Milkie [ 24/Nov/15 ]

Note that what Jesse said about lower-priority nodes is true for 3.0 as well; it's just more likely to happen in 3.2.

Comment by A. Jesse Jiryu Davis [ 24/Nov/15 ]

As Milkie pointed out to me, we had priorities configured in the set. In MongoDB 3.2 it seems a lower-priority node can start as primary and then, after a couple of seconds, step down to let the higher-priority node be elected. This was causing our intermittent C Driver test failures; I've removed the priorities from our configs.
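
For anyone making the same change, a hedged sketch of flattening the priorities via replSetReconfig, assuming pymongo and a connection to the current primary (the actual fix was made directly in the C driver's test configs, not necessarily this way):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=repl0")
config = client.admin.command("replSetGetConfig")["config"]
for member in config["members"]:
    if not member.get("arbiterOnly"):
        member["priority"] = 1  # give all data-bearing members equal priority
config["version"] += 1
client.admin.command("replSetReconfig", config)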

Comment by Eric Milkie [ 23/Nov/15 ]

I think I agree with Jeff that the Java driver issue isn't a regression from 3.0. I'm inclined to resolve this for now.

Comment by Daniel Pasette (Inactive) [ 23/Nov/15 ]

This seems different than the Java driver issue. Have we gained access to those logs?

Comment by A. Jesse Jiryu Davis [ 23/Nov/15 ]

I see. During that in-between time, the test suite fails abruptly when the driver fails to find a primary, and then all the processes are killed, so the new leader doesn't have enough time to reach the "database writes are now permitted" stage.

Comment by Eric Milkie [ 23/Nov/15 ]

27017 may be in state "primary" but the response to isMaster will not return ismaster:true until you see this line in the server log, after the "transition to PRIMARY" line:

[rsSync] transition to primary complete; database writes are now permitted

I don't see that message in the log you attached, so I would not expect any driver to detect that new primary yet.
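
As an illustration of that distinction, here is a rough pymongo sketch of waiting for a new primary to actually accept writes (the node address and timeout are assumptions; this is not the C driver's selection logic):

import time
from pymongo import MongoClient

node = MongoClient("localhost", 27017)  # direct connection to the candidate node
deadline = time.time() + 30
while time.time() < deadline:
    # isMaster reports ismaster: false until the server logs
    # "transition to primary complete; database writes are now permitted",
    # even if the node is already in state PRIMARY.
    if node.admin.command("isMaster").get("ismaster"):
        break
    time.sleep(0.1)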

Comment by A. Jesse Jiryu Davis [ 23/Nov/15 ]

Logs attached.

The replica set starts up with data members on 27017 and 27018, and 27019 is an arbiter. 27018 happens to become the primary. The test suite runs for a few seconds, completing several tests successfully. (E.g., "/Client/command" requires the driver to find the primary and auth to it.)

From 13:50:32.610 to 13:50:34.695 the test suite does mock-server tests, concluding with "/Client/recovering", a mock-server test that takes about 5 seconds. I believe none of those tests interacts with the actual replica set.

At 13:50:38.836, the primary on 27018 logs:

2015-11-23T18:50:38.836+0000 I REPL     [ReplicationExecutor] stepping down from primary, because a new term has begun: 3
2015-11-23T18:50:38.837+0000 I REPL     [replExecDBWorker-2] transition to SECONDARY

The secondary on 27017 logs:

2015-11-23T18:50:38.836+0000 I REPL     [ReplicationExecutor] Canceling priority takeover callback
2015-11-23T18:50:38.836+0000 I REPL     [ReplicationExecutor] Starting an election for a priority takeover
2015-11-23T18:50:38.836+0000 I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
2015-11-23T18:50:38.836+0000 I REPL     [ReplicationExecutor] dry election run succeeded, running for election
2015-11-23T18:50:38.837+0000 I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 3
2015-11-23T18:50:38.837+0000 I REPL     [ReplicationExecutor] transition to PRIMARY
2015-11-23T18:50:38.854+0000 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27018
2015-11-23T18:50:38.854+0000 I REPL     [ReplicationExecutor] Member localhost:27018 is now in state SECONDARY

The test suite begins "/Client/server_status" at 13:50:39.196:

Link to test suite log

It creates a fresh client with this URI:

"mongodb://bob:pwd123@localhost:27017,localhost:27018/?replicaSet=repl0"

It tries to connect to a primary, but fails. It's using the "serverSelectionTryOnce" method we made for single-threaded clients, where it checks each server only once instead of retrying for 30 seconds. It seems like it ought to be able to find the primary: 27017 is primary by this point.
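
For contrast, here is a hedged pymongo sketch of the retrying behavior that "serverSelectionTryOnce" skips; serverSelectionTryOnce itself is a C driver URI option, and serverSelectionTimeoutMS is the closest knob pymongo exposes:

from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

client = MongoClient(
    "mongodb://bob:pwd123@localhost:27017,localhost:27018/?replicaSet=repl0",
    serverSelectionTimeoutMS=30000,  # keep retrying for up to 30 seconds
)
try:
    client.admin.command("ping")  # blocks until a primary is selected or the timeout expires
except ServerSelectionTimeoutError:
    print("no primary found within 30 seconds")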

Comment by A. Jesse Jiryu Davis [ 23/Nov/15 ]

I'm attempting a repro in Evergreen now to get you those logs.

Comment by Eric Milkie [ 23/Nov/15 ]

Is there a chance we'll get server logs for a failed test for the C driver? I see the replica sets test still has a >50% fail rate with the latest build.

Comment by Jeffrey Yemin [ 20/Nov/15 ]

I've been monitoring our Jenkins builds for the last week, and I'm actually seeing this happen more against released versions of the server and less against 3.2 candidates. For instance, there was a stepdown in a 3.0 replica set this morning that looks very similar. I've also seen it with 2.4 and 2.6. So at least in my testing this is looking more environmental and less like a specific issue related to the new replication protocol in 3.2.

Comment by Eric Milkie [ 19/Nov/15 ]

It is not related.

Comment by Daniel Pasette (Inactive) [ 19/Nov/15 ]

Is this related at all to SERVER-21533?

Comment by Eric Milkie [ 17/Nov/15 ]

As of today's nightly build (and rc3 which we cut today), the server logs should contain a bit more information to help us diagnose replica set failover. Let me know when the next failover happens in Jenkins and I'll take a look.

Comment by Jeffrey Yemin [ 17/Nov/15 ]

On Jenkins I have it configured to archive the server logs for each build. See, for example, https://jenkins.10gen.com/job/mongo-java-driver-3.x-test-server-nightly/jdk=HotSpot7,label=linux64,mongodb_configuration=replica_set,mongodb_option=noauth,mongodb_server=master-nightly,mongodb_ssl=nossl/.

Comment by A. Jesse Jiryu Davis [ 17/Nov/15 ]

Yeah, it cats the mongo-orchestration log on shutdown, but right now we lose the replica set logs. It would be nice if we had access to those; I'll see if that's easy to do.

Comment by Eric Milkie [ 17/Nov/15 ]

Where does mongo-orchestration place the server logs? In some of the tests it seems to do "cat server.log" as part of its workflow but I don't see any output from a mongod process.

Comment by A. Jesse Jiryu Davis [ 16/Nov/15 ]

This is common enough in the C driver's test suite that tests against master-nightly replica sets fail most of the time in Evergreen:

https://evergreen.mongodb.com/version/mongo_c_driver_c1f32e9f6fa2b71945f192de7196f9ffbea162dc

But notice that the stepdowns aren't observed with 2.4, 2.6, or 3.0.

Comment by Jeffrey Yemin [ 05/Nov/15 ]

New logs attached, captured while running server version:

db version v3.2.0-rc2-20-ge29401f
git version: e29401fe31d95322d7b2ab58dfc94ce04f27d5a1

The first test failure due to a closed socket occurred between 15:38 and 15:39.

Comment by Eric Milkie [ 05/Nov/15 ]

Thanks! Do you have a recent Jenkins link for an instance of stepping down? I'd like to see what the logs have first, and then decide what sort of logging verbosity we might need to diagnose it further.

Comment by Jeffrey Yemin [ 04/Nov/15 ]

master-release is now rc2; master-nightly is a bit newer:

[ec2-user@ip-10-167-153-159 master]$ ./master-nightly/bin/mongod -version | grep version
db version v3.2.0-rc2-3-gba49b4b
git version: ba49b4b6e0b9116d2dd7347961835f38ac32320e
OpenSSL version: OpenSSL 1.0.0-fips 29 Mar 2010
 
[ec2-user@ip-10-167-153-159 master]$ ./master-release/bin/mongod -version | grep version
db version v3.2.0-rc2
git version: 8a3acb42742182c5e314636041c2df368232bbc5
OpenSSL version: OpenSSL 1.0.0-fips 29 Mar 2010

Comment by Eric Milkie [ 04/Nov/15 ]

What mongod version is Jenkins currently running for "nightly"? Does it have a build from rc2?

Comment by Jeffrey Yemin [ 19/Oct/15 ]

There is, but I've never been able to get this to happen locally, only on Jenkins slaves, and even there, it's somewhat rare.

Comment by Matt Dannenberg [ 19/Oct/15 ]

Is there a way to run these tests locally? I'd like to repro this, if possible.

Comment by Jeffrey Yemin [ 16/Oct/15 ]

I started my search of the logs with:

/usr/bin/grep REPL db27017.log | grep -v "has reached optime" | grep "2015-10-16T19:11"
