Core Server / SERVER-11304

SNMP: subagent with nojournal throws exception during snmpwalk

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.5.4
    • Affects Version/s: 2.5.3
    • Component/s: Diagnostics
    • Labels:
    • Environment:
      * EC2, Amazon Linux, m1.medium
      * PSA replica set
      * primary mongod serving as SNMP, arbiter and secondary as SNMP subagents
    • Steps To Reproduce:

      0.) Install standard pre-req packages for SNMP.
      1.) Edit the /etc/snmp/mongod.conf config file for a subagent connection over TCP port 1705 (mongod.conf attached).
      2.) Launch a PSA replica set. The primary was the SNMP master; the arbiter and secondary were subagents. (My test used two distinct hosts, with one host running both the primary and the arbiter, but I'm guessing that's unrelated.)
      3.) Run snmpwalk against the master (I was running snmpwalk remotely, from my MBP to EC2):
      snmpwalk -m MONGO-MIB -v 2c -c mongodb ec2-foo-bar.amazonaws.com:1161

      Actual command-line flags used for each host:
      PRI: ./bin/mongod --replSet "oh boy" --snmp-master --port 37017 --fork --dbpath data/rs-0/ --logpath logs/rs-0/mongod.log --smallfiles --nohttpinterface --nojournal --oplogSize=50

      ARB: ./bin/mongod --replSet "oh boy" --snmp-subagent --port 37018 --fork --dbpath data/rs-1/ --logpath logs/rs-1/mongod.log --smallfiles --nohttpinterface --nojournal --oplogSize=50

      SEC: ./bin/mongod --replSet "oh boy" --snmp-subagent --port 37019 --fork --dbpath data/rs-2/ --logpath logs/rs-2/mongod.log --smallfiles --nohttpinterface --nojournal --oplogSize=50
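      The three command lines above differ only in the SNMP role flag, the port, and the rs-N directory index. As a rough sketch, a small helper can print them from those three values. The function name member_cmd is hypothetical, and the helper only prints each command rather than starting mongod:

```shell
#!/usr/bin/env bash
# member_cmd: hypothetical helper (not part of MongoDB) that prints, but does
# not run, the mongod command line for one replica set member.
# Arguments: SNMP role flag, port, rs-N directory index.
# All flags are copied verbatim from the report.
member_cmd() {
  local snmp_flag="$1" port="$2" idx="$3"
  printf '%s\n' "./bin/mongod --replSet \"oh boy\" ${snmp_flag} --port ${port} --fork --dbpath data/rs-${idx}/ --logpath logs/rs-${idx}/mongod.log --smallfiles --nohttpinterface --nojournal --oplogSize=50"
}

member_cmd --snmp-master   37017 0   # PRI
member_cmd --snmp-subagent 37018 1   # ARB
member_cmd --snmp-subagent 37019 2   # SEC
```

      Printing instead of executing keeps the sketch safe to run anywhere. To test whether journaling matters, the --nojournal flag could be dropped from the subagent lines.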

      Given a standard PSA replica set, with the primary configured as SNMP master, and the other two nodes configured as SNMP subagents, I'm seeing the following exception traces in the arbiter's mongod log:

      [SnmpAgent] Assertion: 13111:field not found, expected type 16
      [SnmpAgent] 0xee29a6 0xe98b52 0xe815dc 0xe8173c 0x7e512e 0xb27afd 0xb2c38a 0xb28f36 0x7fcf88bb480c 0x7fcf8897bbb5 0x7fcf88bb480c 0x7fcf88980095 0x7fcf88bb426f 0x7fcf88ba7d31 0x7fcf88ba8ce0 0x7fcf88baa040 0x7fcf88baad8a 0x7fcf8828a83f 0x7fcf8828baa3 0x7fcf8828c1b9
       ./bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0xee29a6]
       ./bin/mongod(_ZN5mongo10logContextEPKc+0x1a2) [0xe98b52]
       ./bin/mongod(_ZN5mongo11msgassertedEiPKc+0x11c) [0xe815dc]
       ./bin/mongod() [0xe8173c]
       ./bin/mongod(_ZNK5mongo11BSONElement3chkEi+0x11e) [0x7e512e]
       ./bin/mongod(_ZN5mongo18ServerStatusClient11getIntFieldERKNS_10StringDataE+0x2d) [0xb27afd]
       ./bin/mongod(_ZN5mongo9callbacks20ServerStatusCallback7respondEP13variable_list+0x5ea) [0xb2c38a]
       ./bin/mongod(_ZN5mongo16my_snmp_callbackEP21netsnmp_mib_handler_sP30netsnmp_handler_registration_sP28netsnmp_agent_request_info_sP22netsnmp_request_info_s+0x1d6) [0xb28f36]
       /usr/lib64/libnetsnmpagent.so.20(netsnmp_call_next_handler+0x1ac) [0x7fcf88bb480c]
       /usr/lib64/libnetsnmphelpers.so.20(netsnmp_instance_helper_handler+0x2c5) [0x7fcf8897bbb5]
       /usr/lib64/libnetsnmpagent.so.20(netsnmp_call_next_handler+0x1ac) [0x7fcf88bb480c]
       /usr/lib64/libnetsnmphelpers.so.20(netsnmp_serialize_helper_handler+0x55) [0x7fcf88980095]
       /usr/lib64/libnetsnmpagent.so.20(netsnmp_call_handlers+0xdf) [0x7fcf88bb426f]
       /usr/lib64/libnetsnmpagent.so.20(handle_var_requests+0x91) [0x7fcf88ba7d31]
       /usr/lib64/libnetsnmpagent.so.20(handle_getnext_loop+0x20) [0x7fcf88ba8ce0]
       /usr/lib64/libnetsnmpagent.so.20(netsnmp_handle_request+0x90) [0x7fcf88baa040]
       /usr/lib64/libnetsnmpagent.so.20(handle_snmp_packet+0x1ca) [0x7fcf88baad8a]
       /usr/lib64/libnetsnmp.so.20(+0x4083f) [0x7fcf8828a83f]
       /usr/lib64/libnetsnmp.so.20(_sess_read+0x893) [0x7fcf8828baa3]
       /usr/lib64/libnetsnmp.so.20(snmp_sess_read2+0x9) [0x7fcf8828c1b9]
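      The mongod frames in the trace are C++ mangled names. The two just below the assertion machinery read as mongo::BSONElement::chk(int) const, called from mongo::ServerStatusClient::getIntField(mongo::StringData const&), and type 16 is the BSON 32-bit integer type. In other words, the SNMP callback asked for an integer field that was not in the document it had. A minimal sketch for pulling the mangled names out of a trace and demangling them follows. The helper names are hypothetical, and c++filt (GNU binutils) is assumed to be installed; without it the names pass through unchanged:

```shell
#!/usr/bin/env bash
# mangled_symbols: hypothetical helper; reads backtrace lines on stdin and
# prints the mangled symbol from each "./bin/mongod(SYMBOL+0x...)" frame.
# Frames with no symbol, such as "./bin/mongod() [0x...]", are skipped.
mangled_symbols() {
  sed -n 's|^[[:space:]]*\./bin/mongod(\(_Z[A-Za-z0-9_]*\)+0x[0-9a-f]*).*$|\1|p'
}

# demangle: use c++filt when it is available, otherwise pass names through.
demangle() {
  if command -v c++filt >/dev/null 2>&1; then c++filt; else cat; fi
}

mangled_symbols <<'EOF' | demangle
 ./bin/mongod() [0xe8173c]
 ./bin/mongod(_ZNK5mongo11BSONElement3chkEi+0x11e) [0x7e512e]
 ./bin/mongod(_ZN5mongo18ServerStatusClient11getIntFieldERKNS_10StringDataE+0x2d) [0xb27afd]
EOF
```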

      The output of the snmpwalk looks like the following:

      MONGO-MIB::cursorClientSize."37017" = INTEGER: 1
      MONGO-MIB::cursorClientSize."37018" = INTEGER: 0
      MONGO-MIB::cursorTimedOut."37017" = INTEGER: 121
      MONGO-MIB::cursorTimedOut."37018" = INTEGER: 0
      Error in packet.
      Reason: (genError) A general failure occured
      Failed object: MONGO-MIB::cursorTimedOut."37018"
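      The quoted index on each object ("37017", "37018") matches the --port values used above, so it appears to identify the mongod instance. On that assumption, the failed object points at the arbiter on port 37018. Since snmpwalk issues GETNEXT requests, the failing lookup is presumably the object after cursorTimedOut."37018" rather than that object itself. A small sketch for extracting the object name and index from saved snmpwalk output, with failed_object as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# failed_object: hypothetical helper; reads snmpwalk output on stdin and
# prints "<object> <index>" taken from the "Failed object:" line.
failed_object() {
  sed -n 's/^[[:space:]]*Failed object: \(.*\)\."\([0-9]*\)"$/\1 \2/p'
}

# Sample input copied from the walk output in this report.
failed_object <<'EOF'
MONGO-MIB::cursorTimedOut."37017" = INTEGER: 121
MONGO-MIB::cursorTimedOut."37018" = INTEGER: 0
Error in packet.
Reason: (genError) A general failure occured
Failed object: MONGO-MIB::cursorTimedOut."37018"
EOF
```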

      Logs captured at log level 5 are attached, along with the SNMP mongod.conf.

      Based on the log output, this appears to be related to running the mongod without journaling.
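      One way to check which members hit the assertion is to count its occurrences in each member's log. This is a sketch only: count_assertions is a hypothetical helper, and the log paths are the ones from the --logpath flags above, so members running on another host are simply skipped:

```shell
#!/usr/bin/env bash
# count_assertions: hypothetical helper; prints how many lines in the given
# log file contain the assertion text quoted in this report.
count_assertions() {
  grep -c 'Assertion: 13111:field not found' "$1"
}

# Log paths taken from the --logpath flags in the steps to reproduce.
for log in logs/rs-0/mongod.log logs/rs-1/mongod.log logs/rs-2/mongod.log; do
  [ -f "$log" ] || continue   # skip members whose log is not on this host
  printf '%s %s\n' "$log" "$(count_assertions "$log")"
done
```

      Re-running the same count after restarting a subagent without --nojournal would be a quick way to test the journaling hypothesis.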

        1. arbiter.mongod.log
          357 kB
          John Morales
        2. primary.mongod.log
          241 kB
          John Morales
        3. secondary.mongod.log
          180 kB
          John Morales
        4. snmp.mongod.conf
          1 kB
          John Morales

            Assignee: James Wahlin (james.wahlin@mongodb.com)
            Reporter: John Morales (john.morales@mongodb.com, Inactive)
