Core Server / SERVER-31956

Primary server keeps changing in cluster

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.4.9
    • Component/s: Replication
    • Labels: None
    • ALL

      Hello,

      I'm using MongoDB version 3.4.9 with one primary server and two replicas.
      Each time I log in to a server I see a different primary. In the logs I only see
      a bunch of getMore messages:

      2017-11-14T12:18:54.686+0330 I COMMAND  [conn18] command local.oplog.rs command: getMore { getMore: 16697257165, collection: "oplog.rs", maxTimeMS: 5000, term: 21, lastKnownCommittedOpTime: { ts: Timestamp 1510649328000|1, t: 21 } } originatingCommand: { find: "oplog.rs", filter: { ts: { $gte: Timestamp 1510640408000|1 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 21 } planSummary: COLLSCAN cursorid:16697257165 keysExamined:0 docsExamined:0 numYields:1 nreturned:0 reslen:451 locks:{ Global: { acquireCount: { r: 6 } }, Database: { acquireCount: { r: 3 } }, oplog: { acquireCount: { r: 3 } } } protocol:op_command 5810ms
      
      and a "serverStatus was very slow" message:
      2017-11-14T11:48:06.309+0330 I COMMAND  [ftdc] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after locks: 0, after network: 0, after opLatencies: 0, after opcounters: 0, after opcountersRepl: 0, after repl: 0, after security: 0, after storageEngine: 0, after tcmalloc: 0, after wiredTiger: 3572, at end: 3572 }
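To read the serverStatus line above more easily, here is a minimal Python sketch that picks out which section accounts for the delay. It assumes the `after <section>: <ms>` values are cumulative milliseconds since the command started, so each section's cost is the difference from the previous value. The function name `slowest_section` is hypothetical and only for illustration.

```python
import re

# The "serverStatus was very slow" message body, copied from the log line above.
LINE = (
    "serverStatus was very slow: { after basic: 0, after asserts: 0, "
    "after backgroundFlushing: 0, after connections: 0, after dur: 0, "
    "after extra_info: 0, after globalLock: 0, after locks: 0, "
    "after network: 0, after opLatencies: 0, after opcounters: 0, "
    "after opcountersRepl: 0, after repl: 0, after security: 0, "
    "after storageEngine: 0, after tcmalloc: 0, after wiredTiger: 3572, "
    "at end: 3572 }"
)


def slowest_section(line):
    """Return (section, elapsed_ms) for the section with the largest step.

    Assumption: each "after <section>: <ms>" value is cumulative, so the
    time spent in a section is its value minus the previous value.
    """
    pairs = re.findall(r"after (\w+): (\d+)", line)
    previous = 0
    worst = (None, 0)
    for name, value in pairs:
        delta = int(value) - previous
        if delta > worst[1]:
            worst = (name, delta)
        previous = int(value)
    return worst


if __name__ == "__main__":
    print(slowest_section(LINE))
```

Under that assumption, the whole 3572 ms in this line falls in the wiredTiger section, which points at the storage engine rather than at replication itself.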
      

      And here are the REPL log messages:

      2017-11-13T13:41:28.649-0500 I REPL     [rsBackgroundSync] sync source candidate: 172.16.12.165:27017
      2017-11-13T13:43:33.609-0500 I REPL     [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
      2017-11-13T13:43:33.609-0500 I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
      2017-11-13T13:43:33.610-0500 I REPL     [ReplicationExecutor] VoteRequester(term 18 dry run) received a yes vote from 172.16.12.166:27017; response message: { term: 18, voteGranted: true, reason: "", ok: 1.0 }
      2017-11-13T13:43:33.610-0500 I REPL     [ReplicationExecutor] dry election run succeeded, running for election
      2017-11-13T13:43:33.614-0500 I REPL     [ReplicationExecutor] VoteRequester(term 19) received a yes vote from 172.16.12.166:27017; response message: { term: 18, voteGranted: true, reason: "", ok: 1.0 }
      2017-11-13T13:43:33.614-0500 I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 19
      2017-11-13T13:43:33.614-0500 I REPL     [ReplicationExecutor] transition to PRIMARY
      2017-11-13T13:43:33.614-0500 I REPL     [ReplicationExecutor] Entering primary catch-up mode.
      2017-11-13T13:43:33.614-0500 I REPL     [ReplicationExecutor] Member 172.16.12.166:27017 is now in state SECONDARY
      2017-11-13T13:43:33.616-0500 I REPL     [ReplicationExecutor] Caught up to the latest optime known via heartbeats after becoming primary.
      2017-11-13T13:43:33.616-0500 I REPL     [ReplicationExecutor] Exited primary catch-up mode.
      2017-11-13T13:43:33.616-0500 I REPL     [rsBackgroundSync] Replication producer stopped after oplog fetcher finished returning a batch from our sync source.  Abandoning this batch of oplog entries and re-evaluating our sync source.
      2017-11-13T13:43:33.970-0500 I REPL     [SyncSourceFeedback] SyncSourceFeedback error sending update to 172.16.12.165:27017: InvalidSyncSource: Sync source was cleared. Was 172.16.12.165:27017
      2017-11-13T13:43:35.610-0500 I REPL     [rsSync] transition to primary complete; database writes are now permitted
      2017-11-13T13:44:16.527-0500 I REPL     [conn234] stepping down from primary, because a new term has begun: 20
      2017-11-13T13:44:16.527-0500 I REPL     [replExecDBWorker-2] transition to SECONDARY
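To show how short this node's time as primary was, here is a small Python sketch that extracts the `transition to <STATE>` lines from the excerpt above and measures the time between a transition to PRIMARY and the next transition. The names `transitions` and `primary_tenures` are hypothetical helpers, and the regular expression only assumes the 3.4-style log layout seen in this excerpt.

```python
import re
from datetime import datetime

# Three lines copied verbatim from the REPL excerpt above.
LOG = """\
2017-11-13T13:43:33.614-0500 I REPL     [ReplicationExecutor] transition to PRIMARY
2017-11-13T13:44:16.527-0500 I REPL     [conn234] stepping down from primary, because a new term has begun: 20
2017-11-13T13:44:16.527-0500 I REPL     [replExecDBWorker-2] transition to SECONDARY
"""

# Matches only lines that end right after the new state name, so messages such
# as "transition to primary complete; ..." are not counted as transitions.
TRANSITION = re.compile(r"^(\S+) I REPL\s+\[[^\]]+\] transition to (\w+)$")


def transitions(log_text):
    """Return a list of (timestamp, state) for each state transition line."""
    events = []
    for line in log_text.splitlines():
        match = TRANSITION.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
            events.append((ts, match.group(2)))
    return events


def primary_tenures(events):
    """Return seconds between each transition to PRIMARY and the next transition."""
    tenures = []
    for (start, state), (end, _next_state) in zip(events, events[1:]):
        if state == "PRIMARY":
            tenures.append((end - start).total_seconds())
    return tenures


if __name__ == "__main__":
    for seconds in primary_tenures(transitions(LOG)):
        print(f"primary for {seconds:.3f} s")
```

For the excerpt above this gives roughly 43 seconds as primary in term 19 before term 20 began. Running the same helpers over the full mongod logs from all three nodes would show how often the primary role moves.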
      

      We don't have any load on the servers; it's a fresh installation:
      Ubuntu 16.04 with 8 GB RAM on each node, of which about 5.5 GB is free. The servers are dedicated to MongoDB and no other tools are installed on them.

      Any suggestions?

        1. diagnostic-data-mongo1.tar.gz
          25.71 MB
        2. diagnostic-data-mongo2.tar.gz
          18.87 MB
        3. diagnostic-data-mongo3.tar.gz
          25.57 MB
        4. ftdc.png
          220 kB
        5. mongod1.log
          2.96 MB
        6. mongod2.log
          2.27 MB
        7. mongod3.log
          3.88 MB
        8. SERVER-31916.png
          123 kB

            Assignee: Mark Agarunov (mark.agarunov)
            Reporter: Arash Shams (arash)
            Votes: 0
            Watchers: 7
