[SERVER-19723] auth2.js causes replica to transition to SECONDARY between RECOVERING Created: 03/Aug/15  Updated: 14/Apr/16  Resolved: 03/Aug/15

Status: Closed
Project: Core Server
Component/s: Replication, Security
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Mathias Stearn Assignee: Matt Dannenberg
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

Seems to be related to an auth failure which is reported as connection attempt failed.

 2015-08-03T12:49:05.753-0400 I CONTROL  [main] ** WARNING: --rest is specified without --httpinterface,
 2015-08-03T12:49:05.753-0400 I CONTROL  [main] **          enabling http interface
 note: noprealloc may hurt performance in many applications
 2015-08-03T12:49:05.786-0400 I -        [initandlisten] Detected data files in /data/db/job0/mongorunner/rs_auth2-1 created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
 2015-08-03T12:49:05.786-0400 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
 2015-08-03T12:49:05.823-0400 I STORAGE  [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
 2015-08-03T12:49:05.832-0400 W STORAGE  [initandlisten] Detected configuration for non-active storage engine mmapv1 when current storage engine is wiredTiger
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] MongoDB starting : pid=31618 port=31001 dbpath=/data/db/job0/mongorunner/rs_auth2-1 64-bit host=redbeard
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten]
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] ** NOTE: This is a development version (3.1.7-pre-) of MongoDB.
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] **       Not recommended for production.
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten]
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten]
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten]
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten]
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] db version v3.1.7-pre-
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] git version: bc4572c985a6d9dca5721ec45e88a19061f3842e
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] allocator: tcmalloc
 2015-08-03T12:49:05.832-0400 I CONTROL  [initandlisten] options: { net: { http: { RESTInterfaceEnabled: true, enabled: true }, port: 31001 }, replication: { oplogSizeMB: 40, replSet: "rs_auth2" }, security: { keyFile: "jstests/libs/key1" }, setParameter: { enableTestCommands: "1" }, storage: { dbPath: "/data/db/job0/mongorunner/rs_auth2-1", mmapv1: { preallocDataFiles: false, smallFiles: true } } }
 2015-08-03T12:49:05.834-0400 I NETWORK  [websvr] admin web console waiting for connections on port 32001
 2015-08-03T12:49:05.834-0400 I REPL     [initandlisten] Did not find local voted for document at startup;  NoMatchingDocument Did not find replica set lastVote document in local.replset.election
 2015-08-03T12:49:05.834-0400 I NETWORK  [initandlisten] waiting for connections on port 31001
 2015-08-03T12:49:05.836-0400 I ACCESS   [ReplicationExecutor] can't authenticate to redbeard:31000 (127.0.0.1) as internal user, error: Authentication failed.
 2015-08-03T12:49:05.836-0400 W NETWORK  [ReplicationExecutor] Failed to connect to 127.0.0.1:31002, reason: errno:111 Connection refused
 2015-08-03T12:49:05.836-0400 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs_auth2", version: 1, configServer: false, members: [ { _id: 0, host: "redbeard:31000", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 2.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "redbeard:31001", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "redbeard:31002", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatTimeoutSecs: 10, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, protocolVersion: 0 } }
 2015-08-03T12:49:05.836-0400 I REPL     [ReplicationExecutor] This node is redbeard:31001 in the config
 2015-08-03T12:49:05.836-0400 I REPL     [ReplicationExecutor] transition to STARTUP2
 2015-08-03T12:49:05.836-0400 I REPL     [ReplicationExecutor] Starting replication applier threads
 2015-08-03T12:49:05.836-0400 I REPL     [ReplicationExecutor] transition to RECOVERING
 2015-08-03T12:49:05.836-0400 W NETWORK  [ReplExecNetThread-2] Failed to connect to 127.0.0.1:31002, reason: errno:111 Connection refused
 2015-08-03T12:49:05.836-0400 I REPL     [ReplicationExecutor] Error in heartbeat request to redbeard:31002; HostUnreachable couldn't connect to server redbeard:31002, connection attempt failed
 2015-08-03T12:49:05.837-0400 W NETWORK  [ReplExecNetThread-2] Failed to connect to 127.0.0.1:31002, reason: errno:111 Connection refused
 2015-08-03T12:49:05.837-0400 I REPL     [ReplicationExecutor] Error in heartbeat request to redbeard:31002; HostUnreachable couldn't connect to server redbeard:31002, connection attempt failed
 2015-08-03T12:49:05.837-0400 W NETWORK  [ReplExecNetThread-2] Failed to connect to 127.0.0.1:31002, reason: errno:111 Connection refused
 2015-08-03T12:49:05.837-0400 I REPL     [ReplicationExecutor] Error in heartbeat request to redbeard:31002; HostUnreachable couldn't connect to server redbeard:31002, connection attempt failed
 2015-08-03T12:49:05.837-0400 I REPL     [ReplicationExecutor] transition to SECONDARY
 2015-08-03T12:49:05.837-0400 I REPL     [ReplicationExecutor] Error in heartbeat request to redbeard:31000; AuthenticationFailed Authentication failed.
 2015-08-03T12:49:05.837-0400 I REPL     [ReplicationExecutor] transition to RECOVERING



 Comments   
Comment by Eric Milkie [ 03/Aug/15 ]

It's the correct behavior; thanks for finding that.

Comment by Matt Dannenberg [ 03/Aug/15 ]

I believe it is falling back into RECOVERING because it received an AuthenticationFailed with no successful heartbeats (see here). As such, I'm not certain what the correct behavior should be here.

Generated at Thu Feb 08 03:51:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.