[SERVER-11304] SNMP: subagent with nojournal throws exception during snmpwalk
Created: 22/Oct/13  Updated: 11/Jul/16  Resolved: 29/Oct/13

Status: Closed
Project: Core Server
Component/s: Diagnostics
Affects Version/s: 2.5.3
Fix Version/s: 2.5.4

Type: Bug
Priority: Major - P3
Reporter: John Morales
Assignee: James Wahlin
Resolution: Done
Votes: 0
Labels: 26qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:
  • EC2, Amazon Linux, m1.medium
  • PSA replica set
  • primary mongod serving as SNMP master; arbiter and secondary as SNMP subagents

Attachments: arbiter.mongod.log (text file), primary.mongod.log (text file), secondary.mongod.log (text file), snmp.mongod.conf (file)
Steps To Reproduce:

0.) Install standard pre-req packages for SNMP.
1.) Edit /etc/snmp/mongod.conf config file for subagent connection through TCP:1705 (mongod.conf attached).
2.) Launch a PSA replica set with the primary as SNMP master and the arbiter and secondary as subagents. (My test used two hosts, one of which ran both the primary and the arbiter; I'm guessing that's unrelated.)
3.) Run snmpwalk against the master (I ran it remotely, from my MBP to EC2):
snmpwalk -m MONGO-MIB -v 2c -c mongodb ec2-foo-bar.amazonaws.com:1161 1.3.6.1.4.1.34601

Actual command-line flags used for each host:
PRI: ./bin/mongod --replSet "oh boy" --snmp-master --port 37017 --fork --dbpath data/rs-0/ --logpath logs/rs-0/mongod.log --smallfiles --nohttpinterface --nojournal --oplogSize=50

ARB: ./bin/mongod --replSet "oh boy" --snmp-subagent --port 37018 --fork --dbpath data/rs-1/ --logpath logs/rs-1/mongod.log --smallfiles --nohttpinterface --nojournal --oplogSize=50

SEC: ./bin/mongod --replSet "oh boy" --snmp-subagent --port 37019 --fork --dbpath data/rs-2/ --logpath logs/rs-2/mongod.log --smallfiles --nohttpinterface --nojournal --oplogSize=50

 Description   

Given a standard PSA replica set, with the primary configured as SNMP master, and the other two nodes configured as SNMP subagents, I'm seeing the following exception traces in the arbiter's mongod log:

[SnmpAgent] Assertion: 13111:field not found, expected type 16
[SnmpAgent] 0xee29a6 0xe98b52 0xe815dc 0xe8173c 0x7e512e 0xb27afd 0xb2c38a 0xb28f36 0x7fcf88bb480c 0x7fcf8897bbb5 0x7fcf88bb480c 0x7fcf88980095 0x7fcf88bb426f 0x7fcf88ba7d31 0x7fcf88ba8ce0 0x7fcf88baa040 0x7fcf88baad8a 0x7fcf8828a83f 0x7fcf8828baa3 0x7fcf8828c1b9
 ./bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0xee29a6]
 ./bin/mongod(_ZN5mongo10logContextEPKc+0x1a2) [0xe98b52]
 ./bin/mongod(_ZN5mongo11msgassertedEiPKc+0x11c) [0xe815dc]
 ./bin/mongod() [0xe8173c]
 ./bin/mongod(_ZNK5mongo11BSONElement3chkEi+0x11e) [0x7e512e]
 ./bin/mongod(_ZN5mongo18ServerStatusClient11getIntFieldERKNS_10StringDataE+0x2d) [0xb27afd]
 ./bin/mongod(_ZN5mongo9callbacks20ServerStatusCallback7respondEP13variable_list+0x5ea) [0xb2c38a]
 ./bin/mongod(_ZN5mongo16my_snmp_callbackEP21netsnmp_mib_handler_sP30netsnmp_handler_registration_sP28netsnmp_agent_request_info_sP22netsnmp_request_info_s+0x1d6) [0xb28f36]
 /usr/lib64/libnetsnmpagent.so.20(netsnmp_call_next_handler+0x1ac) [0x7fcf88bb480c]
 /usr/lib64/libnetsnmphelpers.so.20(netsnmp_instance_helper_handler+0x2c5) [0x7fcf8897bbb5]
 /usr/lib64/libnetsnmpagent.so.20(netsnmp_call_next_handler+0x1ac) [0x7fcf88bb480c]
 /usr/lib64/libnetsnmphelpers.so.20(netsnmp_serialize_helper_handler+0x55) [0x7fcf88980095]
 /usr/lib64/libnetsnmpagent.so.20(netsnmp_call_handlers+0xdf) [0x7fcf88bb426f]
 /usr/lib64/libnetsnmpagent.so.20(handle_var_requests+0x91) [0x7fcf88ba7d31]
 /usr/lib64/libnetsnmpagent.so.20(handle_getnext_loop+0x20) [0x7fcf88ba8ce0]
 /usr/lib64/libnetsnmpagent.so.20(netsnmp_handle_request+0x90) [0x7fcf88baa040]
 /usr/lib64/libnetsnmpagent.so.20(handle_snmp_packet+0x1ca) [0x7fcf88baad8a]
 /usr/lib64/libnetsnmp.so.20(+0x4083f) [0x7fcf8828a83f]
 /usr/lib64/libnetsnmp.so.20(_sess_read+0x893) [0x7fcf8828baa3]
 /usr/lib64/libnetsnmp.so.20(snmp_sess_read2+0x9) [0x7fcf8828c1b9]

The output of the snmpwalk looks like the following:

...
MONGO-MIB::cursorClientSize."37017" = INTEGER: 1
MONGO-MIB::cursorClientSize."37018" = INTEGER: 0
MONGO-MIB::cursorTimedOut."37017" = INTEGER: 121
MONGO-MIB::cursorTimedOut."37018" = INTEGER: 0
Error in packet.
Reason: (genError) A general failure occured
Failed object: MONGO-MIB::cursorTimedOut."37018"
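
For triage, the failing instance can be read off the "Failed object" line. The sketch below is a minimal parser for output shaped like the excerpt above. It assumes the quoted index is the mongod port, which matches the --port values in the repro steps (37018 is the arbiter). The function name and regular expression are illustrative and not part of any SNMP tooling.

```python
import re

# Matches lines such as: Failed object: MONGO-MIB::cursorTimedOut."37018"
FAILED_RE = re.compile(r'^Failed object: (\S+)::(\w+)\."(\d+)"$')

# Excerpt copied from the snmpwalk output in this ticket.
SAMPLE = '''MONGO-MIB::cursorTimedOut."37017" = INTEGER: 121
MONGO-MIB::cursorTimedOut."37018" = INTEGER: 0
Error in packet.
Reason: (genError) A general failure occured
Failed object: MONGO-MIB::cursorTimedOut."37018"
'''


def failed_object(output):
    """Return (object_name, port) from the 'Failed object' line, or None."""
    for line in output.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            # Assumption: the quoted table index is the mongod port.
            return match.group(2), int(match.group(3))
    return None


if __name__ == "__main__":
    print(failed_object(SAMPLE))
```

Note that snmpwalk reports the last object it retrieved successfully, so the failing request is the GETNEXT that follows cursorTimedOut for port 37018, not that object itself.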

Logs are attached at log level 5, along with the SNMP mongod.conf.

Based on the log output this appears to be related to running the mongod without journaling.
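
The suspected failure mode can be modelled in a few lines. This is a sketch under assumptions, not the mongod source: it assumes the SNMP callback does a strict integer lookup on a serverStatus-like document and that the "dur" section is absent when journaling is off. BSON type 16 is a 32-bit integer, which matches the assertion text. The helper names and the sample values are hypothetical.

```python
# Illustrative model only; the real code is C++ inside mongod.
BSON_TYPE_INT32 = 16  # BSON type code for a 32-bit integer


class FieldNotFound(Exception):
    """Raised when a dotted path is missing or is not an integer."""


def get_int_field(status, dotted_path):
    """Strict lookup: raise if any path component is absent."""
    node = status
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            raise FieldNotFound(
                "field not found, expected type %d" % BSON_TYPE_INT32
            )
        node = node[key]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(node, bool) or not isinstance(node, int):
        raise FieldNotFound(
            "wrong type for field, expected type %d" % BSON_TYPE_INT32
        )
    return node


def dur_commits(status):
    """Guarded lookup: report journal stats only when the section exists."""
    if "dur" not in status:
        return None  # journaling disabled, nothing to report
    return get_int_field(status, "dur.commits")


# Hypothetical serverStatus-like documents (values are made up).
journaled = {"cursors": {"timedOut": 121}, "dur": {"commits": 30}}
nojournal = {"cursors": {"timedOut": 0}}

if __name__ == "__main__":
    print(dur_commits(journaled))
    print(dur_commits(nojournal))
```

With the strict lookup alone, a document without "dur" raises on every request for a journal statistic. The guard skips those statistics instead of asserting.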



 Comments   
Comment by auto [ 29/Oct/13 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: Merge pull request #9 from jameswahlin/SERVER-11304

SERVER-11304 SNMP dur stats only when journaling
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/1ebf74dd145ef2d2b29fcc61e12982530c84a2cc

Comment by auto [ 29/Oct/13 ]

Author: James Wahlin <james.wahlin@10gen.com>

Message: SERVER-11304 SNMP dur stats only when journaling
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/60c63408a1db13019ad8b765d8039388a55e7513
