[SERVER-6947] db.createCollection creates undefined fields which cause mongorestore to fail Created: 05/Sep/12  Updated: 19/Jan/18  Resolved: 08/Apr/13

Status: Closed
Project: Core Server
Component/s: Admin, Tools
Affects Version/s: 2.2.0, 2.3.1
Fix Version/s: 2.2.5, 2.4.4, 2.5.0

Type: Bug Priority: Major - P3
Reporter: Matt Bailey Assignee: Shaun Verch
Resolution: Done Votes: 7
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 12.04 64bit using 10gen repositories


Attachments: Text File after_fix.txt     Text File before_fix.txt     File create_restore_test    
Issue Links:
Depends
depends on SERVER-7104 jsonString has incorrect output on un... Closed
Duplicate
is duplicated by SERVER-8408 mongodump dumps a collection with cap... Closed
Related
related to SERVER-9182 mongorestore won't restore manually e... Closed
related to SERVER-13737 CollectionOptions parser should skip ... Closed
related to SERVER-13968 Report invalid collection options in ... Closed
is related to TOOLS-1699 3.4.5 Server causing crash in extende... Closed
is related to TOOLS-1718 duplicate top level key: create Closed
is related to TOOLS-1934 duplicate top level key: create Accepted
Operating System: ALL
Steps To Reproduce:

> use test
> db.dropDatabase()
> db.createCollection("coll")

$ mongodump

> db.coll.drop()

$ mongorestore


 Description   

The db.createCollection() helper has some optional collection options which get set to undefined if they are not provided. This causes mongodump to output undefined elements which can cause mongorestore to fail.

The documents have the form:

{ "options" : { "create" : "coll", "capped" : { "$undefined" : true }, "size" : { "$undefined" : true } }, "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.coll", "name" : "_id_" } ] }
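To check whether a given metadata file is affected, the options can be scanned for the $undefined marker. This is a minimal Python sketch, not part of mongodump or mongorestore: the helper name undefined_options is hypothetical, and it assumes the metadata text parses as plain JSON.

```python
import json

def undefined_options(metadata_text):
    """Return the names of collection options whose value is {"$undefined": ...}."""
    doc = json.loads(metadata_text)
    options = doc.get("options", {})
    return [
        name
        for name, value in options.items()
        if isinstance(value, dict) and "$undefined" in value
    ]

# Sample taken from the document form shown in this ticket.
sample = (
    '{ "options" : { "create" : "coll", "capped" : { "$undefined" : true }, '
    '"size" : { "$undefined" : true } }, "indexes" : [ { "v" : 1, '
    '"key" : { "_id" : 1 }, "ns" : "test.coll", "name" : "_id_" } ] }'
)
print(undefined_options(sample))  # ['capped', 'size']
```

An empty list means the file has no undefined options and should not trigger this particular failure.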



 Comments   
Comment by Anton Lopyrev [ 17/Jan/14 ]

If anyone is interested, I was able to fix the files with a few quick commands:

find . -name '*.json' -exec sed -i -e 's/, \"[a-z]*\" : { \"\$undefined\" : [a-z]* }//g' "{}" \;
perl -pi -e 'chomp if eof' *.json

Comment by auto [ 03/Jun/13 ]

Author: Shaun Verch (Zarkantho) <shaun.verch@10gen.com>

Message: SERVER-6947 Do not create undefined fields in db.createCollection
Branch: v2.2
https://github.com/mongodb/mongo/commit/ad2cce872c16a6e1bd540d9570b3f40cb763ee5f

Comment by auto [ 18/May/13 ]

Author: Shaun Verch <shaun.verch@10gen.com>
Date: 2013-04-08T21:21:33Z

Message: SERVER-6947 Do not create undefined fields in db.createCollection
Branch: v2.4
https://github.com/mongodb/mongo/commit/c741fd006ce28c59ee7dd4f32c79672a62cc6a82

Comment by Shaun Verch [ 08/Apr/13 ]

Recommending a backport so that old versions will export files that they can import (see the test description in my comment above).

Comment by auto [ 08/Apr/13 ]

Author: Shaun Verch <shaun.verch@10gen.com>
Date: 2013-04-08T21:21:33Z

Message: SERVER-6947 Do not create undefined fields in db.createCollection
Branch: master
https://github.com/mongodb/mongo/commit/930893b39d0aaf8db227642d78b54eaa6b697320

Comment by Shaun Verch [ 08/Apr/13 ]

Attaching files I used to test the fix.

To summarize:

Before patch that stops db.createCollection from creating undefined fields in 2.5:

dump     restore   result
master   r2.2.3    FAIL
master   master    SUCCEED
r2.2.3   r2.2.3    FAIL
r2.2.3   master    SUCCEED

After patch that stops db.createCollection from creating undefined fields in 2.5:

dump     restore   result
master   r2.2.3    SUCCEED
master   master    SUCCEED
r2.2.3   r2.2.3    FAIL
r2.2.3   master    SUCCEED

See "steps to reproduce" for how I got these tables.

Comment by J Rassi [ 29/Mar/13 ]

Olivier, you're now running into SERVER-9182. After editing, run the following command to remove the trailing newline from metrics_ulta.metadata.json:

perl -pi -e 'chomp if eof' metrics_ulta.metadata.json

Then, retry your restore.
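For anyone without perl at hand, the same trailing-newline removal can be sketched in Python. This is an illustration only: the function name strip_trailing_newline is hypothetical, and the demo runs against a throwaway file rather than a real dump.

```python
import os
import tempfile

def strip_trailing_newline(path):
    """Drop one trailing newline byte from the file; return True if the file changed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.endswith(b"\n"):
        return False
    with open(path, "wb") as f:
        f.write(data[:-1])
    return True

# Demo on a temporary file standing in for a hand-edited metadata file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "coll.metadata.json")
    with open(demo_path, "wb") as f:
        f.write(b'{ "options" : { "create" : "coll" } }\n')
    print(strip_trailing_newline(demo_path))  # True
    print(strip_trailing_newline(demo_path))  # False
```

Working on bytes avoids any newline translation by the platform, so the file size after the call is exactly one byte smaller when a newline was present.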

Comment by Maxence Decrosse [ 21/Mar/13 ]

This issue is well described here: https://groups.google.com/forum/#!msg/mongodb-user/DzxWmfS6oSY/xdCCn8wndqUJ

Comment by Olivier [ 20/Mar/13 ]

My files:

metrics_ulta.metadata.json:

{ "options" : { "create" : "metrics_ulta", "capped" : { "$undefined" : true }, "size" : { "$undefined" : true }, "max" : { "$undefined" : true } }, "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "col_ulta.metrics_ulta", "name" : "_id_" } ] }

metrics_ulta.merged.metadata.json (not sure if needed/used)

{ "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "col_ulta.metrics_ulta.merged", "name" : "_id_" } ] }

I'm trying to restore to a new db:

$ ll
total 3.4M
-rw-r----- 1 mgcrea admin 3.3M Mar 20 15:45 metrics_ulta.bson
-rw-r----- 1 mgcrea admin  15K Mar 20 15:45 metrics_ulta.merged.bson
-rw-r----- 1 mgcrea admin  110 Mar 20 15:45 metrics_ulta.merged.metadata.json
-rw-r----- 1 mgcrea admin  249 Mar 20 15:45 metrics_ulta.metadata.json
$ mongorestore --db col_ulta_legacy .
connected to: 127.0.0.1
Wed Mar 20 15:45:48 ./metrics_ulta.bson
Wed Mar 20 15:45:48 	going into namespace [col_ulta_legacy.metrics_ulta]
assertion: 15936 Creating collection col_ulta_legacy.metrics_ulta failed. Errmsg: exception: specify size:<n> when capped is true

After editing the metadata:

$ cat metrics_ulta.metadata.json 
{ "options" : { "create" : "metrics_ulta" }, "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "col_ulta.metrics_ulta", "name" : "_id_" } ] }
$ mongorestore --db col_ulta_legacy .
connected to: 127.0.0.1
Wed Mar 20 15:48:38 ./metrics_ulta.bson
Wed Mar 20 15:48:38 	going into namespace [col_ulta_legacy.metrics_ulta]
assertion: 15934 JSON object size didn't match file size

Comment by Scott Hernandez (Inactive) [ 20/Mar/13 ]

Olivier, please post your *.metadata.json files so we can see what issue you are having.

Comment by Olivier [ 20/Mar/13 ]

Encountered the same issue ("Errmsg: exception: specify size:<n> when capped is true"), tried to remove the undefined fields as suggested above by Scott, but I then get: "assertion: 15934 JSON object size didn't match file size". Any ideas on how I can restore my backups?

db version v2.2.3, pdfile version 4.5
Wed Mar 20 13:19:34 git version: f570771a5d8a3846eb7586eaffcf4c2f4a96bf08

Comment by Blake Maltby [ 03/Jan/13 ]

Ok thanks Scott, I think I can work around this now and upgrade from the 2.0.x to 2.2.x series which is what I was holding off on.

Comment by Scott Hernandez (Inactive) [ 03/Jan/13 ]

Yes, the shell needs to be fixed as well so that it does not generate the undefined values. Currently there are no other issues related to putting garbage in the collection metadata, but the validation at the server will be fixed as well, as you suggest.

The server code is resilient to getting true/false values from many types, and helpers shield the caller from making this mistake. There is always a chance of bugs, but at the moment this is limited to the tools, where the initial fix will be made.

Please see SERVER-8066 for the shell fix.

Comment by Blake Maltby [ 03/Jan/13 ]

But any collection created with a call to db.createCollection() in the mongo shell causes this failure, whilst any collection created by simply inserting a document doesn't. So there are two paths that create the collection in the database in slightly different ways. This means that, ignoring the tools, the database is different depending on how you create the collections. It's this difference that worries me, and I think the bug is actually in the createCollection function, not the tools (although the tools may need to cope with existing broken databases).

I think the difference is that inserting a document to create the collection creates a collection with capped == false and size == 0 (size being the capped collection size), whilst a call to db.createCollection seems to create a collection with capped and size both undefined.
So I think that if I could set capped to false and size to 0 through some db command, everything would work fine. It's the fact that capped is undefined rather than false that worries me (what if someone does 'if (capped === false)' somewhere else in the mongo code?).

As I say, even if the tools are fixed to deal with the undefined values, it sounds to me like a database with undefined values is probably in an incorrect state. It's createCollection in the mongo shell that needs fixing, and some way of patching existing bad databases needs adding.

Comment by Scott Hernandez (Inactive) [ 03/Jan/13 ]

Blake, the problem is not on the server but in the tools. There is nothing that needs to be done to the running instance or to your production database files. Once the tools (mongodump/restore) are fixed, there will be no problem. You can fix the json files manually by simply removing the fields which contain $undefined values, like so:

{ "options" : { "create" : "coll", "capped" : { "$undefined" : true }, "size" : { "$undefined" : true } }, "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.coll", "name" : "_id_" } ] }
//into
{ "options" : { "create" : "coll" }, "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.coll", "name" : "_id_" } ] }

As Shaun mentioned, another option is to create the collection before restoring, which causes the create step to be skipped and prevents the $undefined values from being used.
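The manual edit described here can be sketched as a small Python helper. This is a sketch under assumptions, not an official tool: the name strip_undefined_options is hypothetical, the metadata is assumed to parse as plain JSON, and it is an untested assumption that mongorestore accepts the re-serialized output. The result is returned without a trailing newline, since SERVER-9182 reports that a trailing newline breaks the restore.

```python
import json

def strip_undefined_options(metadata_text):
    """Return the metadata JSON with every {"$undefined": ...} option removed."""
    doc = json.loads(metadata_text)
    options = doc.get("options")
    if isinstance(options, dict):
        doc["options"] = {
            name: value
            for name, value in options.items()
            if not (isinstance(value, dict) and "$undefined" in value)
        }
    # json.dumps does not append a newline, so the text can be written to disk as-is.
    return json.dumps(doc)

broken = (
    '{ "options" : { "create" : "coll", "capped" : { "$undefined" : true }, '
    '"size" : { "$undefined" : true } }, "indexes" : [ { "v" : 1, '
    '"key" : { "_id" : 1 }, "ns" : "test.coll", "name" : "_id_" } ] }'
)
fixed = strip_undefined_options(broken)
print(json.loads(fixed)["options"])  # {'create': 'coll'}
```

Parsing and re-serializing, rather than pattern-matching on the text, keeps the index definitions untouched and copes with any number of undefined options.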

Comment by Blake Maltby [ 03/Jan/13 ]

Out of interest, are there any server commands that could be run to 'repair' a database collection which has these undefined fields? It'd be quite nice to fix our production database, which has some collections in these invalid states. I'm wary that they could throw up further (and possibly more destructive) bugs. I don't really want to have to take the service down and do a backup/restore to fix it.
Would these incorrect fields also get replicated with the incorrect state? Or would a new replica added to a replica set create a clean db which I could then switch to as primary?

Thanks,
Blake

Comment by Shaun Verch [ 29/Nov/12 ]

Hi Blake,

Thanks again for the feedback. I'm looking into this now and updated the ticket to reflect the underlying issue.

As a workaround, if you create the empty collection before doing the import you don't get this error.

Thanks,
~Shaun Verch

Comment by Blake Maltby [ 29/Nov/12 ]

This still fails on 2.2.2 with the given test case. The failure is now

assertion: 15936 Creating collection test.coll failed. Errmsg: exception: specify size:<n> when capped is true

Where the coll.metadata.json contains

{ "options" : { "create" : "coll", "capped" : { "$undefined" : true }, "size" : { "$undefined" : true } }, "indexes" : [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.coll", "name" : "_id_" } ] }

So whilst the JSON is now valid, the values for capped and size (and maybe others) are being written out incorrectly, and mongorestore probably does an if test on the value of capped, which is true because it's the object { "$undefined" : true } rather than null or false.
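The hypothesis can be illustrated by analogy in Python, where a non-empty object is also truthy. This is not mongorestore's actual code: looks_capped is a hypothetical stand-in for whatever check the tool performs, shown only to make the reasoning concrete.

```python
def looks_capped(options):
    """Naive truthiness test on the 'capped' option (hypothetical check)."""
    return bool(options.get("capped"))

# Options as they would look for a collection with no 'capped' option recorded.
without_capped = {"create": "coll"}

# Options as dumped after db.createCollection("coll") on an affected shell.
with_undefined = {
    "create": "coll",
    "capped": {"$undefined": True},
    "size": {"$undefined": True},
}

print(looks_capped(without_capped))  # False
print(looks_capped(with_undefined))  # True
```

Under that assumption the collection is treated as capped while no usable numeric size is available, which would match the "specify size:<n> when capped is true" error.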

Blake.

Comment by Shaun Verch [ 26/Nov/12 ]

Hi Blake,

Thanks for the very clear and reproducible test case! It looks like the root cause of this issue is SERVER-7104, which has been fixed in 2.2.2 and 2.3.0.

Thanks!
~Shaun Verch

Comment by Blake Maltby [ 22/Nov/12 ]

A simple repro to create this crash is to start the mongo shell and run:

use test
db.dropDatabase()
db.createCollection('coll')

Dump that with mongodump, and in the coll.metadata.json you get:

{options : { "create" : "coll1", undefined, undefined }, indexes:[{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.coll1", "name" : "_id_" }]}

Where clearly that ,undefined, undefined is not valid JSON.
It seems like calling createCollection sets up some invalid options.

You can also trigger this with an index by doing:

db.coll.createIndex({'test':1}, {'op1':1, 'op2':undefined})

This also puts some undefined values on their own in an object.

Calling mongorestore on either of these fails with the b.empty() assertion.

Comment by Scott Hernandez (Inactive) [ 22/Sep/12 ]

Manish, please upload/attach those metadata.json files so we can look at them. Also, please include the output of the failed mongorestore, including the full error.

Comment by Manish Pandit [ 22/Sep/12 ]

Same thing happening to me with 2.2.0 on CentOS 6. Took me 2 hours to upload a dump, and bang, I get a segfault with this exact trace when trying to restore it. I did not find any empty json or bson in there.

-rw-r--r-- 1 mpandit mpandit-pg 853M Sep 21 22:07 content.daily.bson
-rw-r--r-- 1 mpandit mpandit-pg 306 Sep 21 22:07 content.daily.metadata.json
-rw-r--r-- 1 mpandit mpandit-pg 31M Sep 21 22:12 content.monthly.bson
-rw-r--r-- 1 mpandit mpandit-pg 246 Sep 21 22:12 content.monthly.metadata.json
-rw-r--r-- 1 mpandit mpandit-pg 173 Sep 21 22:12 tokens.bson
-rw-r--r-- 1 mpandit mpandit-pg 171 Sep 21 22:12 tokens.metadata.json

I guess I'll have to revert to export/import and try to not forget building indices.

Comment by Shaun Verch [ 21/Sep/12 ]

Hi Aleksey,

Could you provide the following?

1. A list of all the files in your dump directory that you are trying to restore.
2. The contents of any .json files you find in your dump directory.
3. The sequence of commands you ran when this happened.

Thanks!

Comment by Aleksey Mykhailov [ 21/Sep/12 ]

I ran into the same issue: restoring to mongo 2.0.7 works fine, but restoring to 2.2 crashes.

Comment by Matt Bailey [ 13/Sep/12 ]

Unfortunately those data are long gone. Trying to reproduce, and the restore just says file <file>.bson empty, skipping. I assumed the failure was due to the failure of b.empty(), perhaps I was wrong.

Comment by Shaun Verch [ 13/Sep/12 ]

Could you post the full contents of the dump directory after you do the mongodump, including the actual contents of any *.json files you find there? Thanks!

Comment by Matt Bailey [ 12/Sep/12 ]

$ mongodump remote-host:port
$ mongorestore ./dump

That's it.

Comment by Shaun Verch [ 11/Sep/12 ]

The error is coming from the JSON parser. mongodump dumps some json files as well as bson files. I was able to get an error from manipulating the json files, but not the one you are encountering here. Could you post the command you used to run mongodump along with the contents of any json files you had in that directory?

Comment by Matt Bailey [ 05/Sep/12 ]

Yes, from a mongodump.

Comment by Scott Hernandez (Inactive) [ 05/Sep/12 ]

How did you get an empty bson file, mongodump?

Comment by Matt Bailey [ 05/Sep/12 ]

(quickfix is to remove the db/collection from the dump)
