[SERVER-7106] Invalid Data Inserted into Profiling Collection, Breaking Dex and count function Created: 21/Sep/12  Updated: 15/Feb/13  Resolved: 13/Dec/12

Status: Closed
Project: Core Server
Component/s: Logging
Affects Version/s: 2.2.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kyle Gato Assignee: Tad Marshall
Resolution: Duplicate Votes: 0
Labels: collection, profiling
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux web-stage-1 3.0.0-15-virtual #25-Ubuntu SMP Mon Jan 2 20:45:15 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Running Ubuntu
$ cat /etc/debian_version
wheezy/sid

On Amazon EC2. Condensed MongoDB Cluster w/ 9 Daemons

db version v2.2.0, pdfile version 4.5
Thu Sep 20 12:04:51 git version: f5e83ea

kyle@db-stage-1:~$ pip freeze dex
Warning: cannot find svn location for distribute==0.6.16dev-r0
BeautifulSoup==3.2.1
Cheetah==2.4.4
Dex==0.4.1
GnuPGInterface==0.3.2
Landscape-Client==11.07.1.1
M2Crypto==0.20.1
PAM==0.4.2
PyYAML==3.10
Twisted-Core==11.0.0
Twisted-Web==11.0.0
apt-xapian-index==0.44
argparse==1.2.1
boto==2.0
chardet==2.0.1
cloud-init==0.6.1
command-not-found==0.2.44
configobj==4.7.2
dargparse==0.2.1

FIXME: could not find svn URL in dependency_links for this package:
distribute==0.6.16dev-r0
euca2ools==2.0.0
httplib2==0.7.1
iotop==0.4.3
keyring==0.6.2
launchpadlib==1.9.8
lazr.restfulclient==0.11.2
lazr.uri==1.0.2
oauth==1.0.1
paramiko==1.7.7.1
pexpect==2.3
pyOpenSSL==0.12
pycrypto==2.3
pycurl==7.19.0
pymongo==2.3
pyserial==2.5
python-apt==0.8.0ubuntu9
python-debian==0.1.20ubuntu2
simplejson==2.1.6
smart==1.4
stripogram==1.5
ufw==0.30.1-2ubuntu1
unattended-upgrades==0.1
wadllib==1.2.0
wsgiref==0.1.2
zope.interface==3.6.1
kyle@db-stage-1:~$


Attachments: Text File case3-log.txt     Text File case3-shell.txt    
Operating System: Linux
Participants:

 Description   

I am finding data like the data shown below in my system.profile collection for various databases. I've tried disabling profiling, clearing the collection, and re-enabling, at which point the errors occur again after a few minutes.

This is affecting our ability to analyze the profile collection,, and can't be a good thing.

I created this bug with the creator of dex and he suggested I open a bug with the mongo team: https://github.com/mongolab/dex/issues/4

Thu Sep 20 13:50:45 Assertion: 10320:BSONElement: bad type 121
0x511fa1 0x573f0b 0x4b0cc8 0x7160d6 0x516d05 0x5197fc 0x6af7ca 0x6be534 0x52b51e 0x6bdf6c 0x6af9e5 0x6bdaff 0x584c26 0x584ca2 0x5860ae 0x71e6d4 0x4a46e5 0x4a6726 0x7f569b45030d 0x49f5a9 
mongo(ZN5mongo15printStackTraceERSo+0x21) [0x511fa1]
mongo(_ZN5mongo11msgassertedEiPKc+0x9b) [0x573f0b]
mongo(_ZNK5mongo11BSONElement4sizeEv+0x1d8) [0x4b0cc8]
mongo(_ZN5mongo16resolveBSONFieldEP9JSContextP8JSObjectljPS3+0x146) [0x7160d6]
mongo(jsLookupPropertyWithFlags+0x3f5) [0x516d05]
mongo(js_GetProperty+0x7c) [0x5197fc]
mongo(js_Interpret+0x10ea) [0x6af7ca]
mongo(js_Invoke+0x9d4) [0x6be534]
mongo() [0x52b51e]
mongo(js_Invoke+0x40c) [0x6bdf6c]
mongo(js_Interpret+0x1305) [0x6af9e5]
mongo(js_Execute+0x36f) [0x6bdaff]
mongo(JS_EvaluateUCScriptForPrincipals+0x66) [0x584c26]
mongo(JS_EvaluateUCScript+0x22) [0x584ca2]
mongo(JS_EvaluateScript+0x6e) [0x5860ae]
mongo(_ZN5mongo7SMScope4execERKNS_10StringDataERKSsbbbi+0x144) [0x71e6d4]
mongo(_Z5_mainiPPc+0x2455) [0x4a46e5]
mongo(main+0x26) [0x4a6726]
/lib/x86_64-linux-gnu/libc.so.6(libc_start_main+0xed) [0x7f569b45030d]
mongo(_gxx_personality_v0+0x2a1) [0x49f5a9]
Error: BSONElement: bad type 121

Fri Sep 21 12:02:39 Assertion: 10320:BSONElement: bad type -116
0x511fa1 0x573f0b 0x4b0cc8 0x7160d6 0x516d05 0x5197fc 0x6af7ca 0x6bdaff 0x584c26 0x584ca2 0x5860ae 0x71e6d4 0x4a4954 0x4a6726 0x7f9131be630d 0x49f5a9 
 mongo(_ZN5mongo15printStackTraceERSo+0x21) [0x511fa1]
 mongo(_ZN5mongo11msgassertedEiPKc+0x9b) [0x573f0b]
 mongo(_ZNK5mongo11BSONElement4sizeEv+0x1d8) [0x4b0cc8]
 mongo(_ZN5mongo16resolveBSONFieldEP9JSContextP8JSObjectljPS3_+0x146) [0x7160d6]
 mongo(js_LookupPropertyWithFlags+0x3f5) [0x516d05]
 mongo(js_GetProperty+0x7c) [0x5197fc]
 mongo(js_Interpret+0x10ea) [0x6af7ca]
 mongo(js_Execute+0x36f) [0x6bdaff]
 mongo(JS_EvaluateUCScriptForPrincipals+0x66) [0x584c26]
 mongo(JS_EvaluateUCScript+0x22) [0x584ca2]
 mongo(JS_EvaluateScript+0x6e) [0x5860ae]
 mongo(_ZN5mongo7SMScope4execERKNS_10StringDataERKSsbbbi+0x144) [0x71e6d4]
 mongo(_Z5_mainiPPc+0x26c4) [0x4a4954]
 mongo(main+0x26) [0x4a6726]
 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f9131be630d]
 mongo(__gxx_personality_v0+0x2a1) [0x49f5a9]
Error: BSONElement: bad type -116



 Comments   
Comment by Eric Sedor [ 13/Dec/12 ]

Ah great! Thank you so much for the pointer. Cheers!

Comment by Eliot Horowitz (Inactive) [ 13/Dec/12 ]

Sorry for not marking, but we found the root issue: SERVER-7206
This is fixed in 2.2.2

Comment by Eric Sedor [ 13/Dec/12 ]

Hi Tad,

It's slow-going on this end, too, especially without having reproduced directly. We had a third report of this issue and the user was able to provide the db.stats and validate(true) output.

Hopefully this helps!

Sincerely,
Eric@MongoLab

Comment by Tad Marshall [ 09/Oct/12 ]

Hi Kyle and Eric,

Sorry for the delay in following up with you.

The validate(true) output says that you have 88 invalid documents and that the details can be found in the log. This is a little clumsy, but validate(true) doesn't have an easy way to capture the assertions that are generated by trying to read the invalid documents, so it just counts the events and refers you to the log to get the details.

Can you repeat the validate(true) and post the results along with the fragment of the mongod log file that shows the errors generated?

Could you also post the results of "db.system.profile.stats()"?

It might also be helpful to get a mongodump of the profiles collection that is giving you difficulty or, if that doesn't work, a copy of the database file itself.

We can start with the log of the validate(true) and see if that is enough to identify the problem.

How large is the database that you are profiling? If you can post the output of db.stats(), should contain the relevant information.

Tad

Comment by Eric Sedor [ 09/Oct/12 ]

Hi Tad, I developed Dex so I wanted to pop in and offer a stacktrace that was provided to me for this issue.

Traceback (most recent call last):
File "/usr/local/bin/dex", line 147, in <module>
main(sys.argv[1:])
File "/usr/local/bin/dex", line 127, in main
return md.analyze_profile()
File "/usr/local/lib/python2.7/dist-packages/dex/dex.py", line 150, in analyze_profile
for profile_entry in profile_entries:
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 778, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 729, in _refresh
self.__uuid_subtype))
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 686, in __send_message
self.__uuid_subtype)
File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 111, in _unpack_response
as_class, tz_aware, uuid_subtype)
InvalidDocument: no c decoder for this type yet

The operative code in Dex is very basic, here it is, abbreviated but nearly verbatim:

profile_entries = connection[database]['system.profile'].find()
for profile_entry in profile_entries:
#do read-only stuff

Full code is at https://github.com/mongolab/dex/

Thanks!

Comment by Kyle Gato [ 21/Sep/12 ]

Dex just analyzes the bson data, so when non bson data is returned it does not know how to understand it. No map reduces or eval's were run against the profile collection. This type of data reoccurs after clearing out the profile collection within minutes.

Here is the validate of the profile collection:

db-stage-sh1:PRIMARY> db.system.profile.validate()
{
	"ns" : "staging-kiwi.system.profile",
	"capped" : true,
	"max" : 2147483647,
	"firstExtent" : "0:23a7000 ns:staging-kiwi.system.profile",
	"lastExtent" : "0:23a7000 ns:staging-kiwi.system.profile",
	"extentCount" : 1,
	"datasize" : 1011960,
	"nrecords" : 2510,
	"lastExtentSize" : 1052672,
	"padding" : 1,
	"firstExtentDetails" : {
		"loc" : "0:23a7000",
		"xnext" : "null",
		"xprev" : "null",
		"nsdiag" : "staging-kiwi.system.profile",
		"size" : 1052672,
		"firstRecord" : "0:246b6a4",
		"lastRecord" : "0:246b42c"
	},
	"deletedCount" : 2,
	"deletedSize" : 376,
	"nIndexes" : 0,
	"keysPerIndex" : {
		
	},
	"valid" : true,
	"errors" : [ ],
	"warning" : "Some checks omitted for speed. use {full:true} option to do more thorough scan.",
	"ok" : 1
}

Here is a full validate:

db-stage-sh1:PRIMARY> db.system.profile.validate({full: true})
{
	"ns" : "staging-kiwi.system.profile",
	"capped" : true,
	"max" : 2147483647,
	"firstExtent" : "0:23a7000 ns:staging-kiwi.system.profile",
	"lastExtent" : "0:23a7000 ns:staging-kiwi.system.profile",
	"extentCount" : 1,
	"extents" : [
		{
			"loc" : "0:23a7000",
			"xnext" : "null",
			"xprev" : "null",
			"nsdiag" : "staging-kiwi.system.profile",
			"size" : 1052672,
			"firstRecord" : "0:246b81c",
			"lastRecord" : "0:246b580"
		}
	],
	"datasize" : 1011928,
	"nrecords" : 2510,
	"lastExtentSize" : 1052672,
	"padding" : 1,
	"firstExtentDetails" : {
		"loc" : "0:23a7000",
		"xnext" : "null",
		"xprev" : "null",
		"nsdiag" : "staging-kiwi.system.profile",
		"size" : 1052672,
		"firstRecord" : "0:246b81c",
		"lastRecord" : "0:246b580"
	},
	"objectsFound" : 2510,
	"invalidObjects" : 88,
	"bytesWithHeaders" : 1052088,
	"bytesWithoutHeaders" : 1011928,
	"deletedCount" : 2,
	"deletedSize" : 408,
	"nIndexes" : 0,
	"keysPerIndex" : {
		
	},
	"valid" : false,
	"errors" : [
		"invalid bson object detected (see logs for more info)"
	],
	"advice" : "ns corrupt, requires repair",
	"ok" : 1
}

Comment by Tad Marshall [ 21/Sep/12 ]

Can you post the output of running validate on this collection (in the mongo shell)? See http://docs.mongodb.org/manual/reference/commands/#validate .

Your stack trace shows several SpiderMonkey (JavaScript) functions; were you running map-reduce or eval against the profile collection?

Sorry, I'm not familiar with Dex. Do you know what it is doing when it gets these errors?

Generated at Thu Feb 08 03:13:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.