[SERVER-2536] mongodump causes exception on mongod Created: 11/Feb/11  Updated: 30/Mar/12  Resolved: 14/Jun/11

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 1.7.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Reinaldo Giudici Assignee: Aaron Staple
Resolution: Duplicate Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Operating System: ALL
Participants:

 Description   

When running a mongodump on all members of a replica set, I see this error on the mongod server (all of them).
I run mongodump against the primary and both secondaries and I see the same error on all.

I ran a db.repairDatabase() and still getting the same.

This is new since 1.7.5 upgrade from 1.6.5

Fri Feb 11 14:39:11 [initandlisten] connection accepted from 10.12.57.162:37054 #56322
Fri Feb 11 14:39:11 [conn51176] warning: no id index on $snapshot query, ns:094410ee3c5d4757be2885586235f19d.LEADEBOARD_sp_dev_chkall1_d_1294898400000.$_id
Fri Feb 11 14:39:11 [conn51176] Assertion: 10334:Invalid BSONObj size: -1 (0xFFFFFFFF) first element: EOO
0x560819 0x4ed88e 0x780128 0x65a67b 0x788566 0x791091 0x794815 0x796d44 0x7978e7 0x643c7d 0x74dc0c 0x75399d 0x89a15e 0x8ad550 0x7f6b3fb839ca 0x7f6b3f13270d
/usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x560819]
/usr/bin/mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x46e) [0x4ed88e]
/usr/bin/mongod(_ZN5mongo11BasicCursor7currentEv+0x68) [0x780128]
/usr/bin/mongod(_ZN5mongo11UserQueryOp4nextEv+0x7cb) [0x65a67b]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6nextOpERNS_7QueryOpE+0x56) [0x788566]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x861) [0x791091]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x255) [0x794815]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x796d44]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x7978e7]
/usr/bin/mongod(ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1+0xdfd) [0x643c7d]
/usr/bin/mongod() [0x74dc0c]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5ad) [0x75399d]
/usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x89a15e]
/usr/bin/mongod(thread_proxy+0x80) [0x8ad550]
/lib/libpthread.so.0(+0x69ca) [0x7f6b3fb839ca]
/lib/libc.so.6(clone+0x6d) [0x7f6b3f13270d]
Fri Feb 11 14:39:11 [conn51176] assertion 10334 Invalid BSONObj size: -1 (0xFFFFFFFF) first element: EOO ns:094410ee3c5d4757be2885586235f19d.LEADEBOARD_sp_dev_chkall1_d_1294898400000.$id query:{ query: {}, $snapshot: true }
Fri Feb 11 14:39:11 [conn51176] end connection 10.12.57.162:37054



 Comments   
Comment by Aaron Staple [ 14/Jun/11 ]

sounds like it's probably SERVER-2973

Comment by Bryon Ross [ 02/Jun/11 ]

It seems like this issue may be related to SERVER-2973. Both appear to involve mongodump, the warning "no _id index on $snapshot query", and namespaces greater than 79 characters.

Comment by Emrys Ingersoll [ 14/Apr/11 ]

I think I am also running into this issue. When running mongodump against my database, I get the following error

vimana-cwshelby.utilization.devices.d4d7c5fce8c65cdabd01fbde4.cycletimeline.$id to dump/vimana-cwshelby/utilization.devices.d4d7c5fce8c65cdabd01fbde4.cycletimeline.$id.bson
assertion: 13106 nextSafe():

{ $err: "Invalid BSONObj size: 0 (0x00000000) first element: ;9: ?type=-112", code: 10334 }

here is the mongod log output during this operation

Thu Apr 14 17:28:48 [initandlisten] MongoDB starting : pid=90208 port=27017 dbpath=/usr/local/var/mongodb master=1 64-bit
Thu Apr 14 17:28:48 [initandlisten] db version v1.8.1, pdfile version 4.5
Thu Apr 14 17:28:48 [initandlisten] git version: a429cd4f535b2499cc4130b06ff7c26f41c00f04
Thu Apr 14 17:28:48 [initandlisten] build sys info: Darwin erh2.10gen.cc 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 BOOST_LIB_VERSION=1_40
Thu Apr 14 17:28:48 [initandlisten] journal dir=/usr/local/var/mongodb/journal
Thu Apr 14 17:28:48 [initandlisten] recover : no journal files present, no recovery needed
Thu Apr 14 17:28:48 [initandlisten] waiting for connections on port 27017
Thu Apr 14 17:29:21 [initandlisten] connection accepted from 127.0.0.1:56825 #1
Thu Apr 14 17:29:21 [conn1] warning: no id index on $snapshot query, ns:vimana-cwshelby.utilization.devices.d4d7c5fce8c65cdabd01fbde4.cycletimeline.$_id
Thu Apr 14 17:29:21 [conn1] Assertion: 10334:Invalid BSONObj size: 0 (0x00000000) first element: ;9: ?type=-112
0x10008de9b 0x10000d3f3 0x1002f1788 0x10019c65b 0x1002f9476 0x100300cc8 0x100307964 0x100309ab1 0x10030a99a 0x100182c78 0x1002b9e89 0x1002c3f18 0x100433888 0x100446d74 0x7fff8094a4f6 0x7fff8094a3a9
0 mongod 0x000000010008de9b _ZN5mongo11msgassertedEiPKc + 315
1 mongod 0x000000010000d3f3 _ZNK5mongo7BSONObj14_assertInvalidEv + 1059
2 mongod 0x00000001002f1788 _ZN5mongo11BasicCursor7currentEv + 104
3 mongod 0x000000010019c65b _ZN5mongo11UserQueryOp4nextEv + 5867
4 mongod 0x00000001002f9476 _ZN5mongo12QueryPlanSet6Runner6nextOpERNS_7QueryOpE + 70
5 mongod 0x0000000100300cc8 _ZN5mongo12QueryPlanSet6Runner3runEv + 1720
6 mongod 0x0000000100307964 _ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE + 868
7 mongod 0x0000000100309ab1 _ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE + 113
8 mongod 0x000000010030a99a _ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE + 26
9 mongod 0x0000000100182c78 ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1 + 3976
10 mongod 0x00000001002b9e89 _ZN5mongo13receivedQueryERNS_6ClientERNS_10DbResponseERNS_7MessageE + 569
11 mongod 0x00000001002c3f18 _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE + 1528
12 mongod 0x0000000100433888 _ZN5mongo10connThreadEPNS_13MessagingPortE + 616
13 mongod 0x0000000100446d74 thread_proxy + 132
14 libSystem.B.dylib 0x00007fff8094a4f6 _pthread_start + 331
15 libSystem.B.dylib 0x00007fff8094a3a9 thread_start + 13
Thu Apr 14 17:29:21 [conn1] assertion 10334 Invalid BSONObj size: 0 (0x00000000) first element: ;9: ?type=-112 ns:vimana-cwshelby.utilization.devices.d4d7c5fce8c65cdabd01fbde4.cycletimeline.$id query:{ query: {}, $snapshot: true }
Thu Apr 14 17:29:21 [conn1] end connection 127.0.0.1:56825
Thu Apr 14 17:29:49 [dur] lsn set 59381

These errors were from a Mongo 1.8.1 server, the application server that originally created the database is running 1.8.0, I see the same error when running mongodump on it as well.

The app server is running on Ubuntu 10.04.1 LTS using the following kernel

2.6.33.5-rscloud #2 SMP Thu Jun 10 15:26:23 UTC 2010 x86_64 GNU/Linux

filesystem is ext3

the java application writing to the database is using the mongo java driver, version 2.5

I have successfully ran mongodb with --repair on this database, but it has no effect and the aforementioned error persists.

Comment by Aaron Staple [ 24/Feb/11 ]

Hi Reinaldo -

What driver are you using? (Please include version #)
for the mongodump, no driver, is a command line call. our API servers are using the java driver.
The app servers are what I was wondering about - just to make sure, is it a recent version of the java driver?

Have you tried running with --objcheck mode?
Nope, I'll try that.
This is something you could run to see if any driver code attempts to insert an invalid document. (Normally it is the driver's responsibility to ensure documents are correct and mongod doesn't validate them itself.)

Did you ever have a hard crash on one of these servers? This may have occurred for example before you ran your first repair.
Yes, we had lots of crashes on all of them, seems mostly due to the drop collection
Have you seen crashes from SERVER-2508? I think in the logs that were sent there were a number of assertions but none of them caused the mongod process to stop running suddenly, which is what I'm referring to when I say crash.
It is unexpected that a repair operation resulting in success would leave bad documents behind. Would it be possible to send the data files where this behavior was observed?

Comment by Reinaldo Giudici [ 23/Feb/11 ]

Hi Reinaldo -

What driver are you using? (Please include version #)
for the mongodump, no driver, is a command line call. our API servers are using the java driver.

Have you tried running with --objcheck mode?
Nope, I'll try that.

When you did db.repairDatabase() are you sure you were 'using' the proper database? Have you tried using the --repair command line option instead?
I did, I connected tot he primary that was showing the error, and I ran repairDatabase there, the result was somethinng like

{ok : 1}

I think.

Is this issue occurring in conjunction with SERVER-2508 or independently?
Not 100% sure. We have not gotten SERVER-2508, since we stopped dropping collections, but we still get this error. might be that the db is bad since those were

Did you ever have a hard crash on one of these servers? This may have occurred for example before you ran your first repair.
Yes, we had lots of crashes on all of them, seems mostly due to the drop collection

Comment by Aaron Staple [ 23/Feb/11 ]

Hi Reinaldo -

What driver are you using? (Please include version #)
Have you tried running with --objcheck mode?
When you did db.repairDatabase() are you sure you were 'using' the proper database? Have you tried using the --repair command line option instead?
Is this issue occurring in conjunction with SERVER-2508 or independently?
Did you ever have a hard crash on one of these servers? This may have occurred for example before you ran your first repair.

Comment by Reinaldo Giudici [ 17/Feb/11 ]

tried again on the latest 1.7.5 and after some changes on our app.

I get this on the mongodump side:
094410ee3c5d4757be2885586235f19d.LEADEBOARD_sp_dev_chkall_l to cloud-mongo11/094410ee3c5d4757be2885586235f19d/LEADEBOARD_sp_dev_chkall_l.bson
4 objects
094410ee3c5d4757be2885586235f19d.LEADEBOARD_sp_dev_chkall1_d_1294898400000 to cloud-mongo11/094410ee3c5d4757be2885586235f19d/LEADEBOARD_sp_dev_chkall1_d_1294898400000.bson
2 objects
094410ee3c5d4757be2885586235f19d.LEADEBOARD_sp_dev_chkall1_d_1294898400000.$id to cloud-mongo11/094410ee3c5d4757be2885586235f19d/LEADEBOARD_sp_dev_chkall1_d_1294898400000.$id.bson
assertion: 13106 nextSafe():

{ $err: "Invalid BSONObj size: -1 (0xFFFFFFFF) first element: EOO", code: 10334 }

Wed Feb 16 19:01:36 MessagingPort flush send() errno:9 Bad file descriptor 10.36.58.22:27017
Wed Feb 16 19:01:36 caught exception (socket exception) in destructor (~PiggyBackData)

and this on the mongod side:

3c5d4757be2885586235f19d.LEADEBOARD_sb_local_level_l
Wed Feb 16 19:01:24 [conn66568] done for 86 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_dev2_totalMoney_l
Wed Feb 16 19:01:24 [conn66568] done for 69 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_dev2_level_l
Wed Feb 16 19:01:24 [conn66568] done for 60 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_dev2_picksMade_l
Wed Feb 16 19:01:24 [conn66568] done for 3 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_dev2_buzzPoints_l
Wed Feb 16 19:01:24 [conn66568] done for 34 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_local_redeemedCollections_l
Wed Feb 16 19:01:24 [conn66568] done for 2 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_qa2_overall_l
Wed Feb 16 19:01:24 [conn66568] done for 0 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_qa2_picksWin_l
Wed Feb 16 19:01:24 [conn66568] done for 0 records 0secs
Wed Feb 16 19:01:24 [conn66568] building new index on

{ value: -1 }

for 094410ee3c5d4757be2885586235f19d.LEADEBOARD_sb_qa2_buzzPoints:tennis:atp:241_l
Wed Feb 16 19:01:24 [conn66568] done for 1 records 0secs
Wed Feb 16 19:01:24 [conn66568] query 094410ee3c5d4757be2885586235f19d.$cmd ntoreturn:1 command:

{ repairDatabase: 1.0 }

reslen:53 445ms
Wed Feb 16 19:01:36 [conn66569] warning: no id index on $snapshot query, ns:094410ee3c5d4757be2885586235f19d.LEADEBOARD_sp_dev_chkall1_d_1294898400000.$_id
Wed Feb 16 19:01:36 [conn66569] Assertion: 10334:Invalid BSONObj size: -1 (0xFFFFFFFF) first element: EOO
0x560819 0x4ed88e 0x780128 0x65a67b 0x788566 0x791091 0x794815 0x796d44 0x7978e7 0x643c7d 0x74dc0c 0x75399d 0x89a15e 0x8ad550 0x7f6b3fb839ca 0x7f6b3f13270d
/usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x129) [0x560819]
/usr/bin/mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x46e) [0x4ed88e]
/usr/bin/mongod(_ZN5mongo11BasicCursor7currentEv+0x68) [0x780128]
/usr/bin/mongod(_ZN5mongo11UserQueryOp4nextEv+0x7cb) [0x65a67b]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6nextOpERNS_7QueryOpE+0x56) [0x788566]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x861) [0x791091]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x255) [0x794815]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x64) [0x796d44]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x7978e7]
/usr/bin/mongod(ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1+0xdfd) [0x643c7d]
/usr/bin/mongod() [0x74dc0c]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x5ad) [0x75399d]
/usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x21e) [0x89a15e]
/usr/bin/mongod(thread_proxy+0x80) [0x8ad550]
/lib/libpthread.so.0(+0x69ca) [0x7f6b3fb839ca]
/lib/libc.so.6(clone+0x6d) [0x7f6b3f13270d]
Wed Feb 16 19:01:36 [conn66569] assertion 10334 Invalid BSONObj size: -1 (0xFFFFFFFF) first element: EOO ns:094410ee3c5d4757be2885586235f19d.LEADEBOARD_sp_dev_chkall1_d_1294898400000.$id query:{ query: {}, $snapshot: true }
Wed Feb 16 19:01:49 [conn50720] query admin.$cmd ntoreturn:1 command:

{ writebacklisten: ObjectId('4d556463fc0b4ee1d700ca1f') }

reslen:60 300000ms
Wed Feb 16 19:02:13 [conn50623] query admin.$cmd ntoreturn:1 command: { writebac
Wed Feb 16 19:03:12 [conn50648] query admin.$cmd ntoreturn:1 command:

{ writebacklisten: ObjectId('4d556388afcbab46ad220ad4') }

reslen:60 300000ms

Generated at Thu Feb 08 03:00:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.