Core Server / SERVER-14881

Ability to easily save cursor contents to file or collection


Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Components: JavaScript, Querying, Shell
    • Query Optimization

    Description

      It's sometimes desirable to take the results of a find() (or anything else that returns a cursor, such as an aggregation) and store the resulting documents somewhere, e.g. in another collection, or in a JSON or BSON file (à la SERVER-12624).

      The idea is that while working interactively in the shell, once you find a query that works well, you can save the results (for use by some other tool) quickly and easily by appending .saveTo({ ns: "db.coll" }) or .saveTo({ file: "output.json" }) to the end of the line, e.g.:

      db.foo.find( { something: "value" }, { something: 1, interesting: 1 } ).limit(5000).saveTo({ db: "some", collection: "where" })
      db.foo.find( { something: "value" }, { something: 1, interesting: 1 } ).limit(5000).saveTo({ file: "sample.json" })
      

      A naive client-side JS implementation might be something like:

      DBQuery.prototype.saveTo = function(target) {
      	if (target.db || target.collection || target.ns) {
      		// Resolve the destination collection: { db, collection }, { collection } or { ns }.
      		var t;
      		if (target.db && target.collection) {
      			t = this._mongo.getDB(target.db).getCollection(target.collection);
      		} else if (target.collection) {
      			t = this._db.getCollection(target.collection);
      		} else if (target.ns) {
      			t = this._mongo.getCollection(target.ns);
      		} else {
      			throw Error("saveTo: 'db' was given without 'collection'");
      		}
      		// Copy the documents one at a time.
      		while (this.hasNext())
      			t.insert(this.next(), target.options, target._allow_dot);
      	} else if (target.file) {
      		// Infer the output type from the file extension unless given explicitly.
      		var type = target.type;
      		if (type === undefined) {
      			if (target.file.endsWith(".json")) {
      				type = "json";
      			} else if (target.file.endsWith(".bson")) {
      				type = "bson";
      			}
      		}
      		if (type == "bson") {
      			// SERVER-12624
      			this.dump(target.file);
      		} else if (type == "json") {
      			// Multi-line output by default; "oneline" takes precedence over "pretty".
      			var oneline = false;
      			if (target.pretty !== undefined) {
      				oneline = !target.pretty;
      			}
      			if (target.oneline !== undefined) {
      				oneline = !!target.oneline;
      			}
      			// needs fprint() (SERVER-14880)
      			while (this.hasNext())
      				fprint(target.file, tojson(this.next(), "", oneline));
      			fclose(target.file);
      		}
      	}
      };
      

      This might be good enough to start with. Doing this server-side (to eliminate the network traffic) would be possible by using db.eval with nolock.

      Further improvements might include:

      • using bulk inserts to insert a full cursor batch at a time
      • server-side support for an $out parameter for finds (like $out for Map-Reduce and Aggregation). Instead of returning the cursor to the client, the server would internally iterate over the cursor, insert the results into the specified collection, and return the status of this procedure. In this case, the shell saveTo() implementation would reduce to calling _addSpecial, and it could appear anywhere in the function chain rather than only at the end. The file-based output would probably remain client-side, though.
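
      The first improvement above (batched inserts) could look roughly like the sketch below. It is not part of the proposed patch: saveInBatches is a hypothetical helper name, the default batch size of 1000 is an arbitrary assumption, and it groups documents into fixed-size batches rather than following the cursor's actual network batches. It only assumes a cursor exposing hasNext()/next() and a collection exposing insertMany().

```javascript
// Hypothetical helper (not from the ticket): copy a cursor into a
// collection using one insertMany() call per batch of documents.
function saveInBatches(cursor, target, batchSize) {
    batchSize = batchSize || 1000; // assumed default, chosen arbitrarily
    var batch = [];
    var total = 0;
    while (cursor.hasNext()) {
        batch.push(cursor.next());
        if (batch.length >= batchSize) {
            // Flush a full batch in a single round trip.
            target.insertMany(batch);
            total += batch.length;
            batch = [];
        }
    }
    // Flush whatever is left over after the cursor is exhausted.
    if (batch.length > 0) {
        target.insertMany(batch);
        total += batch.length;
    }
    return total; // number of documents handed to insertMany()
}
```

      In the shell this might be called as saveInBatches(db.foo.find({ something: "value" }), db.getSiblingDB("some").where), trading one round trip per document for one per batch.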

          People

            backlog-query-optimization Backlog - Query Optimization
            kevin.pulo@mongodb.com Kevin Pulo