Details
-
Bug
-
Resolution: Duplicate
-
Major - P3
-
None
-
3.0.0
-
None
-
Fully Compatible
-
Windows
-
Security 1 04/03/15
Description
We are encountering unexpected elections when running the .NET driver unit tests against a replica set (with auth and ssl if that makes a difference).
Apparently running the dropDatabase command can cause an unexpected election to occur.
From the attached logs:
On member 27017:
2015-04-01T16:22:18.013+0000 D REPL [ReplicationExecutor] localhost:27018 thinks that we are down because they cannot send us heartbeats.
|
2015-04-01T16:22:18.013+0000 I REPL [ReplicationExecutor] Member localhost:27018 is now in state PRIMARY
|
2015-04-01T16:22:18.013+0000 I REPL [ReplicationExecutor] replset: another primary seen with election time 551c1b39:1 my election time is 551c1ab4:1
|
2015-04-01T16:22:18.013+0000 I REPL [ReplicationExecutor] stepping down; another primary was elected more recently
|
2015-04-01T16:22:18.013+0000 D REPL [ReplicationExecutor] Scheduling heartbeat to localhost:27018 at 2015-04-01T16:22:20.013Z
|
2015-04-01T16:22:18.013+0000 I REPL [ReplicationExecutor] Stepping down from primary in response to heartbeat
|
And on member 27018:
2015-04-01T16:22:16.553+0000 I NETWORK [ReplExecNetThread-0] Socket recv() timeout 127.0.0.1:27017
|
2015-04-01T16:22:16.553+0000 I NETWORK [ReplExecNetThread-0] SocketException: remote: 127.0.0.1:27017 error: 9001 socket exception [RECV_TIMEOUT] server [127.0.0.1:27017]
|
2015-04-01T16:22:16.553+0000 I NETWORK [ReplExecNetThread-0] DBClientCursor::init call() failed
|
2015-04-01T16:22:16.553+0000 D - [ReplExecNetThread-0] User Assertion: 10276:DBClientBase::findN: transport error: localhost:27017 ns: local.$cmd query: { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: BinData(0, 6E2C2C6E3D5F5F73797374656D2C723D534D4E376A716762326245447A315847506C4D2B50695A6E2B6E627338315834) }
|
2015-04-01T16:22:16.740+0000 D REPL [ReplExecNetThread-0] Network status of sending replSetHeartbeat to localhost:27017 was Location10276 DBClientBase::findN: transport error: localhost:27017 ns: local.$cmd query: { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: BinData(0, 6E2C2C6E3D5F5F73797374656D2C723D534D4E376A716762326245447A315847506C4D2B50695A6E2B6E627338315834) }
|
2015-04-01T16:22:16.740+0000 I REPL [ReplicationExecutor] Error in heartbeat request to localhost:27017; Location10276 DBClientBase::findN: transport error: localhost:27017 ns: local.$cmd query: { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: BinData(0, 6E2C2C6E3D5F5F73797374656D2C723D534D4E376A716762326245447A315847506C4D2B50695A6E2B6E627338315834) }
|
2015-04-01T16:22:16.740+0000 D REPL [ReplicationExecutor] Scheduling heartbeat to localhost:27017 at 2015-04-01T16:22:18.740Z
|
2015-04-01T16:22:16.740+0000 I REPL [ReplicationExecutor] Standing for election
|
...
|
2015-04-01T16:22:17.209+0000 I REPL [ReplicationExecutor] replSet election succeeded, assuming primary role
|
2015-04-01T16:22:17.209+0000 I REPL [ReplicationExecutor] transition to PRIMARY
|
So 27018 has taken the role of primary away from 27017.
Here's what seems to be bogging member 27017 down so much that 27018 concluded it was down:
2015-04-01T16:21:59.390+0000 D COMMAND [conn27] run command Tests04011621.$cmd { dropDatabase: 1 }
|
2015-04-01T16:21:59.390+0000 I COMMAND [conn27] dropDatabase Tests04011621 starting
|
2015-04-01T16:21:59.390+0000 D STORAGE [conn27] dropDatabase Tests04011621
|
...
|
2015-04-01T16:22:18.777+0000 I JOURNAL [conn27] journalCleanup...
|
2015-04-01T16:22:18.778+0000 I JOURNAL [conn27] removeJournalFiles
|
2015-04-01T16:22:18.778+0000 I STORAGE [DataFileSync] flushing mmaps took 17177ms for 10 files
|
2015-04-01T16:22:18.814+0000 D JOURNAL [conn27] removeJournalFiles end
|
2015-04-01T16:22:19.019+0000 I JOURNAL [conn27] journalCleanup...
|
2015-04-01T16:22:19.019+0000 I JOURNAL [conn27] removeJournalFiles
|
2015-04-01T16:22:19.019+0000 D JOURNAL [conn27] removeJournalFiles end
|
2015-04-01T16:22:19.047+0000 D STORAGE [conn27] remove file D:\jenkins\workspace\mongo-csharp-driver-test-windows\label\windows64\mc\replica_set\mo\auth\ms\30-release\mssl\ssl\artifacts\data\db27017\Tests04011621.ns
|
2015-04-01T16:22:19.156+0000 I COMMAND [conn27] dropDatabase Tests04011621 finished
|
2015-04-01T16:22:19.174+0000 I COMMAND [conn27] command Tests04011621.$cmd command: dropDatabase { dropDatabase: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:64 locks:{} 19784ms
|
Attachments
Issue Links
- duplicates
-
SERVER-12235 Don't require a database read on every new localhost connection when auth is on
-
- Closed
-