[SERVER-4345] mongorestore has no way to accept input from stdin
Created: 21/Nov/11  Updated: 08/Jul/14  Resolved: 02/Jul/14

Status: Closed
Project: Core Server
Component/s: Tools
Affects Version/s: 2.0.1
Fix Version/s: 2.7.3

Type: Bug
Priority: Major - P3
Reporter: Adam Fields
Assignee: Benety Goh
Resolution: Done
Votes: 33
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by TOOLS-133 Add option to mongodump, mongorestore... Closed
is duplicated by SERVER-3111 Allow mongorestore to read from stdin Closed
Related
is related to TOOLS-23 mongodump to allow compression or dum... Closed
Tested
Operating System: ALL
Participants:

 Description   

mongodump can output to stdout, but mongorestore appears to have no way to accept input from stdin. This makes it very difficult to pipe the output from mongodump to mongorestore to copy a collection from one server to another without writing the entire collection to a temporary file (potentially hundreds of GB).



 Comments   
Comment by Githook User [ 02/Jul/14 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-4345 mongorestore will read from stdin when filename is "-"
check eof only on fifo files. restored original logic of checking bytes read against file length.
mongorestore requires both --db and --collection when using stdin/fifo
Branch: master
https://github.com/mongodb/mongo/commit/8ada5660746aa5aa351cc36e8d793cbc353c4fad
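A minimal sketch of the pipe this commit is meant to enable, assuming a mongorestore build that includes the fix (Fix Version 2.7.3) and a mongodump that writes a single collection to stdout with `--out -`, as the description states. The function name `copy_collection` and the database and collection names are hypothetical; the trailing `-` filename and the need for both `--db` and `--collection` are taken from the commit message above.

```shell
#!/bin/sh
# Sketch only: stream one collection from mongodump straight into
# mongorestore, with no temporary file on disk.
# copy_collection is a hypothetical helper name, not part of the tools.
#   $1 = database, $2 = source collection, $3 = target collection
copy_collection() {
    # "--out -" sends the BSON dump of the collection to stdout.
    # The trailing "-" tells mongorestore to read from stdin; per the
    # commit message, both --db and --collection must be given.
    mongodump --db "$1" --collection "$2" --out - \
        | mongorestore --db "$1" --collection "$3" -
}
```

Usage: `copy_collection test foo foo_dupl`. To copy between servers, each tool's `--host` option can be added to its own side of the pipe.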

Comment by Githook User [ 02/Jul/14 ]

Author: Azat Khuzhin (azat) <a3at.mail@gmail.com>

Message: SERVER-4345 Add support processing fifo files in mongorestore

Example : `./mongorestore -d test -c foo_dupl <(mongodump -d test -c foo -out - 2>/dev/null | tail -n+2)`
Example (after pull #204) : `./mongorestore -d test -c foo_dupl <(mongodump -d test -c foo -out - 2>/dev/null)`

This closes #205

Signed-off-by: Benety Goh <benety@mongodb.com>
Branch: master
https://github.com/mongodb/mongo/commit/962dae30460ca6c2a9115b2d950cc8f2f6120d54

Comment by Mike Erickson [ 31/Jan/14 ]

Additionally, the following doesn't work but should:

mongorestore -h host -d db -c col <(zcat col.bson.gz)

It fails immediately with "don't know what to do with [/dev/fd/XX]"

Comment by Diego Carrión [ 14/Oct/13 ]

I just want to cast my vote for this as well. I am also leveraging mongodump's stdout capabilities to upload the dump to a backup server, and would like to be able to restore with this same workflow so as to avoid doubling the storage requirements.
Since the functionality to dump to stdout exists in mongodump, I believe the reverse should be possible with mongorestore.

tl;dr = +1

Comment by Kjetil Flovild-Midtlie [ 10/Oct/13 ]

vote vote vote

Comment by Nathan Peck [ 02/Aug/13 ]

Adding my vote to this feature request. Like Mike Fellows I am piping the output of mongodump through Gzip. I am using a Node.js environment which is built from the ground up for asynchronous streams so it is a natural process to pipe the output of mongodump for a collection through gzip and client side encryption, then streaming the compressed and encrypted collection dump into a private Amazon S3 backup bucket.

Now I'm working on the reverse flow, with the file data downloading from the backup bucket on S3, decrypting, decompressing, but I can't pipe the dump back into mongorestore, and instead have no other option but to write it to disk and execute mongorestore on the disk file.

This is unacceptable, especially for large dumps, because the restore uses up gigabytes of disk space, causes significant wear and tear on our hard drives, and is very slow.

Comment by Thorn Roby [ 11/Apr/13 ]

I would also find this useful. And copyCollection won't work with mongos.

Comment by David Kellum [ 14/Sep/12 ]

I have the same use case as Mr Fellows above. In our setup the lack of this makes restores much more difficult, since the uncompressed dump would not even fit on a normal single-host partition. Please (re-)consider adding restore from stdin in the single-collection case.

Comment by Mike Fellows [ 14/Sep/12 ]

We have a use case that would be helped by this feature. We are backing up single collections using mongodump and piping the output to gzip. On those occasions where a restore is needed we access the gzipped backup file and are currently required to ungzip the entire file, then use mongorestore to pull the collection in. Our collections are quite large (many GBytes) and handling the uncompressed files can be inconvenient. Streaming the file out of gunzip and directly into mongorestore would be more convenient.

I would envision something like this (the mongodump step currently works fine):

mongodump --db my_database --collection my_collection --out - | gzip > my_collection.bson.gz
gunzip -c my_collection.bson.gz | mongorestore --db my_database --collection my_collection

I understand that the larger issue is more complicated as both dump and restore are designed to work at the database level with multiple collections (and now indexes?) but thought I would mention this use case as well.
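For comparison, here is a sketch of how the restore half of this workflow might look with the fix recorded in this ticket, assuming a mongorestore build that includes it (Fix Version 2.7.3). The helper name `restore_gz` is hypothetical. The only change from the envisioned command is the trailing `-`, which the commit message for this issue gives as the filename that selects stdin.

```shell
#!/bin/sh
# Sketch only: restore a single gzipped collection dump without first
# writing the uncompressed BSON to disk.
# restore_gz is a hypothetical helper name, not part of the tools.
#   $1 = path to .bson.gz file, $2 = database, $3 = collection
restore_gz() {
    # gunzip -c decompresses to stdout; the trailing "-" makes
    # mongorestore read that stream from stdin. Both --db and
    # --collection are required when reading from stdin.
    gunzip -c "$1" | mongorestore --db "$2" --collection "$3" -
}
```

Usage: `restore_gz my_collection.bson.gz my_database my_collection`.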

Comment by Will [ 11/Sep/12 ]

As part of my setup scripts for our mongo powered appliance, I too would like to be able to pipe from an imaging server over ssh to mongorestore

Comment by Tim [ 06/Sep/12 ]

Ran into the same problem.

It's a very powerful thing to be able to sync a collection from production, over ssh, to a development machine (as already pointed out by Sergei Serdyuk).

On a more general note, it would be really nice if the command-line tools adhered to standard Unix philosophies like piping, using stdout and stderr in the right cases, and reading configs (like passwords) from a configuration file.

Currently I miss this power. Mongoexport is another example of a tool that clutters stdout with informational messages (leading to at least one error when piped to mongoimport) and is unable to read authentication credentials from a configuration file.

Comment by Sergei Serdyuk [ 06/Feb/12 ]

While it is true that it is somewhat limited to only one collection per run, using stdin would allow piping through ssh from a remote server. I just found this page because I've tried to pipe data from one server to another without exposing mongodb on a public ip address. Piping over ssh is a very common practice.

Comment by Scott Roberts [ 04/Feb/12 ]

Oops, thanks for pointing that out Scott. With this and copyDatabase being available, I guess I don't see any major additional benefits of being able to pipe mongodump to mongorestore.

Comment by Scott Hernandez (Inactive) [ 04/Feb/12 ]

There is already a cloneCollection command to copy a collection to another server: http://www.mongodb.org/display/DOCS/cloneCollection+Command

Comment by Scott Roberts [ 04/Feb/12 ]

I agree this would be a nice feature, would take out an intermediary step. An alternative that may be more palatable to you guys is adding the ability to specify a single collection on the copyDatabase command. That would essentially allow the same behavior that is being requested here as well.

Comment by Adam Fields [ 22/Nov/11 ]

Also, if there are debug/status messages being output to stdout instead of stderr, doesn't that mean that the mongodump "output to stdout" option isn't working as it should be?

Comment by Adam Fields [ 22/Nov/11 ]

Aren't those just additional aspects to the bug that should be part of the fix? It seems like this is right on the verge of being really useful for archiving/migrating potentially very large collections.

Comment by Scott Hernandez (Inactive) [ 21/Nov/11 ]

This is not all that useful for more than a single collection; there is no binary delimiter yet between collections/dbs. There are also issues with debug/status text being output to stdout.

Generated at Thu Feb 08 03:05:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.