[SERVER-458] JavaScript $function in update Created: 04/Dec/09  Updated: 06/Dec/22  Resolved: 29/Jun/19

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Mathias Stearn Assignee: Backlog - Query Team (Inactive)
Resolution: Won't Do Votes: 157
Labels: js-in-agg, udf
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-10213 Allow eval() against a SINGLE documen... Closed
Related
related to SERVER-1765 self referential updates? WAS: allow ... Closed
related to SERVER-40381 Add the ability to specify a pipeline... Closed
related to SERVER-6399 Refactor update() code Closed
related to SERVER-9336 Custom operators as a plug-in into ag... Closed
is related to SERVER-925 support for stored js Closed
is related to SERVER-11345 Allow update to compute expressions u... Closed
Assigned Teams:
Query
Participants:
Case:

 Description   

For maximum flexibility, we should allow running arbitrary JavaScript code on each object during an update. It should probably run after other modifiers.

update({_id: "foo"}, {$inc: {count: 1, total: 5}, $function: function() {this.avg = this.total / this.count}})
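A minimal in-memory sketch of the semantics proposed above, assuming modifiers such as $inc are applied first and the function then runs with "this" bound to the updated document. The helper applyUpdate is hypothetical and exists only for illustration; it is not a MongoDB API.

```javascript
// Hypothetical in-memory model of the proposed semantics; not a MongoDB API.
function applyUpdate(doc, spec) {
  const result = { ...doc };
  // Apply $inc first, since the proposal suggests $function runs after other modifiers.
  for (const [field, delta] of Object.entries(spec.$inc || {})) {
    result[field] = (result[field] || 0) + delta;
  }
  // Then run the function with "this" bound to the updated document.
  if (typeof spec.$function === "function") {
    spec.$function.call(result);
  }
  return result;
}

const updated = applyUpdate(
  { _id: "foo", count: 1, total: 5 },
  {
    $inc: { count: 1, total: 5 },
    $function: function () { this.avg = this.total / this.count; }
  }
);
console.log(updated); // { _id: 'foo', count: 2, total: 10, avg: 5 }
```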



 Comments   
Comment by Asya Kamsky [ 29/Jun/19 ]

SERVER-40381 implemented support for aggregation expressions to specify update.  You can see some examples here.

 

I'm closing this ticket.  Most use cases mentioned can be handled in aggregation using already existing expressions.  In the future we will probably add ability to specify something more customized (see SERVER-9336).
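For the example in the description, an update that uses aggregation expressions might look like the sketch below. It assumes a server version that accepts an aggregation pipeline as the update argument (MongoDB 4.2 or later); the collection name "foo" and the field names are assumptions carried over from the description.

```javascript
// Pipeline-style update: the second argument is an array of aggregation
// stages, and "$total" / "$count" refer to fields of the document itself.
// The collection name "foo" and the field names are assumptions.
const filter = { _id: "foo" };
const pipeline = [{ $set: { avg: { $divide: ["$total", "$count"] } } }];

// Only runs inside the mongo shell, where a global "db" exists.
if (typeof db !== "undefined") {
  db.foo.updateOne(filter, pipeline);
}
```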

 

 

Comment by zhouao [ 15/Sep/18 ]

This feature is great; I hope it can be supported.

Comment by Kevin Day [ 08/Mar/18 ]

Having this ability, inside just the collection in question, would be a huge benefit. Even if it only allowed operations inside the object that is being updated, it would add significant value. Having to perform client-side iteration to do any sort of complex update is a big issue.

Comment by Asya Kamsky [ 06/Dec/16 ]

For returning just keys, please look at and vote up SERVER-18794 - that should allow returning just keys, or some subset of information using aggregate rather than find.
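As a rough sketch of what returning just the keys through aggregate could look like, assuming a server version that provides the $objectToArray expression. The collection name "collection" is taken from the request being answered; the output field name "keys" is an assumption.

```javascript
// Turn each document into an array of {k, v} pairs and keep only the k values.
// "keys" is an illustrative output field name.
const keysOnly = [
  {
    $project: {
      _id: 0,
      keys: {
        $map: {
          input: { $objectToArray: "$$ROOT" },
          as: "kv",
          in: "$$kv.k"
        }
      }
    }
  }
];

// Only runs inside the mongo shell, where a global "db" exists.
if (typeof db !== "undefined") {
  db.collection.aggregate(keysOnly).forEach(printjson);
}
```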

Comment by Eric Reynolds [ 27/Mar/16 ]

SERVER-10213 is not necessarily a duplicate of this feature, if the function is interpreted as the projection.
My use case is that I have some large documents, and I would like my queries to return only a subset of the information. For example, if I wanted only the keys, I'd like to be able to do this:

db.collection.find(
    {...},
    function(d) {
        return Object.keys(d);
    }
)

This is particularly important when the query is being made from another process to avoid sending back lots of unnecessary data.

Comment by ahildoer [ 09/Sep/13 ]

I have to back up Oleg's response to Kevin regarding relational-database-like functionality.

MongoDB's lack of relational database functionality is a technical decision, not a political one against SQL. NoSQL purists don't avoid foreign key constraints (and multi-record atomic operations) just for the sake of being as NoSQL as possible. It is a fundamental shift which achieves performance and horizontal scalability that SQL solutions will never achieve without sacrificing vertical scale.

My vote is -10000 for allowing IO in the $function method. Foreign key enforcement/checking/whatever should be done within the context of the application, not the database server.

Comment by Oleg Rekutin [ 09/Sep/13 ]

Updates only apply to one collection, so you cannot access another collection anyway. There are no joins in MongoDB, so this is definitely out of scope.

The Javascript function should only be updating one record at a time--not only is it a simpler implementation, but it is also cleaner conceptually. It allows the update to continue to be sharded and executed concurrently across shards.

The new V8 engine in 2.4 restricts the ability to run database commands or access other data. See version 2.4 note here: http://docs.mongodb.org/manual/reference/operator/where/#op._S_where . For example, "db" is not accessible. Therefore, there are already single-record access restrictions built in.

Comment by Kevin J. Rice [ 09/Sep/13 ]

Moshe Simantov makes a valid point about access control. Certainly having a query that accesses other collections would require some security consideration.

Perhaps, then, we split this case into:

  • Do no other record accesses (this case)
  • Access other records, assuring proper access control is granted.

I'm thinking of a use case where we verify the validity of a foreign key. TableOne: { ..., 't2fk1' : 12, ...}, TableTwo: { ..., 'fk1' : 12, ...}, etc.

This would replicate the SQL query:

  • select t1.id, t1.fk1 from t1 where t1.fk1 not in (select t2.id from t2);

RECOGNIZED OBJECTIONS: okay, so people will complain this isn't supposed to be a relational database, so we shouldn't have foreign keys. But, sometimes this is unavoidable for application-specific reasons. I need to be able to verify my foreign keys are valid.

Comment by Moshe Simantov [ 08/Sep/13 ]

Please consider security issues.
This function should update only the current document and can't access any other documents or collections.

Comment by ahildoer [ 30/Jul/13 ]

+1 Oleg. This is a great start. Not knowing much of how things are abstracted internally in Mongo, would this also work for findAndModify? If so, this covers our use case entirely.

As for the requirement that the driver support a BSON-encoded function instead of a string: there is an alternative notation that may work for backwards compatibility with drivers that don't support that. Using your example, this alternative syntax would be:

update({}, { $set: { computed: { $func: "return {value: this.a * this.b / this.c}" } } }, { upsert: false, multi: true } );

(Please pardon the edits, I just found the "preview" button)

Comment by Oleg Rekutin [ 30/Jul/13 ]

We have implemented a partial solution of this as a $set that takes a JS function, e.g:

update({}, { $set: { computed: function(){ return {value: this.a * this.b / this.c }} }}, { upsert: false, multi: true});

https://github.com/astral303/mongo/tree/javascript_update (against 2.4.5).

I opened up a discussion on mongodb-dev, it has more details: https://groups.google.com/forum/#!topic/mongodb-dev/xttIIY1kt9k . In short, the syntax is not nailed down, but I'm putting a feeler out for interest. Any feedback welcome!

Comment by ahildoer [ 15/Jul/13 ]

Just adding my vote for this $function supporting JavaScript. We need a Turing-complete language out of the gate on this feature. I seriously doubt that would happen with any aggregation framework using things like $add, $divide, $average, etc.

+1 for SlugFiller's suggestion of the syntax of $func: { $code: function(....

Comment by Ilya Shaisultanov [ 20/Apr/13 ]

Would LOVE to see this. I'm writing a data migration script and this is exactly what I need. For now I'll have to write a lame .next()/.update() loop to go over a set.

Comment by Reuben Garrett [ 14/Jan/13 ]

@Alex, is that https://jira.mongodb.org/browse/SERVER-3253?

Comment by Alex Piggott [ 14/Aug/12 ]

+1 for any of these ideas! I linked this discussion in https://jira.mongodb.org/browse/SERVER-325, the issue for implementing "$out" in the aggregation framework.

Comment by SlugFiller [ 12/Jun/12 ]

As I've suggested in the thread here:
https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/AFhzlDMPfKY
I would like to suggest a simpler solution that wouldn't require any database-breaking changes, but add a great deal of flexibility.

The syntax, which is based on the "group" and "mapReduce" operations is this:
db.mycoll.update({mykey: "key_value"}, {$func: {code: function(obj) {/* Manipulate object */ return obj; }, scope: {myglobal: "global_value"}}});

Similar syntax would be possible for "findAndModify". In languages without code-gen capabilities, "$func.code" would be a string containing the JavaScript code, or whatever appropriate method is used to pass functions into "group" or "mapReduce" (e.g. MongoCode class in PHP).
The "scope" parameter is used for injection-safe constant addition, similar to the "scope" parameter in "mapReduce". Drivers which support it may also allow specifying "code" and "scope" using a single driver-specific object for a function with a scope, e.g.:
array('$func' => new MongoCode('function(obj) {/* Manipulate object */ return obj; }', array('myglobal' => 'global_value')))

This would be an atomic equivalent to (Removing the use of "scope" for brevity):
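var obj = db.mycoll.findOne({mykey: "key_value"});
// No match and no upsert: nothing to do.
if (obj == null && !upsert) return;
var newobj = code(obj);
if (obj == null) {
    // Upsert case: insert whatever the function returned, if anything.
    if (newobj != null)
        db.mycoll.insert(newobj);
} else {
    if (newobj != null)
        db.mycoll.update({_id: obj._id}, newobj);
    else
        // Returning null removes the matched document.
        db.mycoll.remove({_id: obj._id});
}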

Notice from above that "upsert" is supported, in which case "null" is passed to the function. The function may return a different object from the one passed, and may also return "null" to perform a remove operation.
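The semantics described above can be modelled in memory as a rough sketch. A Map stands in for the collection, and funcUpdate is a hypothetical name used only for illustration; nothing here is a MongoDB API, and the real proposal would additionally run atomically on the server.

```javascript
// Hypothetical in-memory stand-in for a collection, keyed by "mykey".
// funcUpdate is an illustrative name, not a MongoDB API.
function funcUpdate(store, key, code, upsert) {
  const obj = store.has(key) ? store.get(key) : null;
  if (obj === null && !upsert) return;   // nothing matched and no upsert
  const newobj = code(obj);              // null is passed in the upsert case
  if (newobj === null || newobj === undefined) {
    store.delete(key);                   // returning null removes the document
  } else {
    store.set(key, newobj);              // insert or replace
  }
}

const store = new Map();
funcUpdate(store, "a", (doc) => doc || { mykey: "a", n: 0 }, true);  // upsert inserts
funcUpdate(store, "a", (doc) => ({ ...doc, n: doc.n + 1 }), false);  // replace
console.log(store.get("a")); // { mykey: 'a', n: 1 }
funcUpdate(store, "a", () => null, false);                           // remove
console.log(store.has("a")); // false
```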

The operation should have similar atomic properties as "$inc". In fact, it should be to "$inc" what "$where" is to "$gt" - a generalization (Except, unlike "$where", it doesn't carry an optimization penalty).
For this to be possible, however, the function may not call any database operations (contrary to some comments above), since the document must be locked while the function is running.
This is more useful than the alternative (Allow functions, but no atomic guarantees), since said alternative can easily be done using client-side code ("find", processing, and "update"). An atomic non-trivial update operation is currently not possible without using some simulated locking mechanism, making it far more useful as an additional operation.

And finally, here's a use-case example which cannot be done AT ALL using current methods:
var newvalue = db.mycoll.findAndModify({
    query: {key: "mykey"},
    update: {$func: {code: function(doc) {
        if (doc == null) return {key: "mykey", value: 1};
        doc.value += ((doc.value % 5) == 4) ? 2 : 1;
        return doc;
    }}},
    new: true,
    upsert: true
});

The above implements an auto-increment value that "skips" any values that are evenly divisible by 5. (In other words, it returns "1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 16...")
Note that it is as atomic as an auto-increment value implemented using "$inc", but with the added functionality that is not possible with "$inc".
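The sequence can be checked locally by applying the same function repeatedly to its own result, as successive findAndModify calls would. This is only a sketch of the function's logic; nextDoc is an illustrative name, and the atomicity that the proposal is about is not modelled here.

```javascript
// The update function from the example above, unchanged in logic.
function nextDoc(doc) {
  if (doc == null) return { key: "mykey", value: 1 };
  doc.value += (doc.value % 5) == 4 ? 2 : 1;
  return doc;
}

// Apply it repeatedly, as successive findAndModify calls would.
let current = null;
const seen = [];
for (let i = 0; i < 10; i++) {
  current = nextDoc(current);
  seen.push(current.value);
}
console.log(seen.join(", ")); // 1, 2, 3, 4, 6, 7, 8, 9, 11, 12
```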

Comment by Tamas Nagy [ 03/Jun/12 ]

This would be extremely useful for updating statistics that shouldn't have to be calculated on the fly with either the aggregation framework or map/reduce. As mentioned above, along with this, some type of context or passable parameters would be nice. This would be extremely useful if combined with the speed and flexibility of the aggregation framework expressions as mentioned by Colin.

Comment by Simeon Simeonov [ 16/Apr/12 ]

Just wanted to add a +1 to specifying functions for update operations. I needed to build utilities in both JS and Ruby for this.

In addition to this, which provides access to the document, to make this work well for a broad range of uses without resorting to function codegen, the function should take an optional context argument that can be passed as part of the query.

Comment by Colin Mollenhour [ 14/Feb/12 ]

The aggregation framework introduced lots of nice non-JS operators (http://www.mongodb.org/display/DOCS/Aggregation+Framework+-+Expression+Reference) and field referencing (by prefixing field name with $). If these same techniques could be used for update operations that would be a huge improvement. (Is there a ticket for this already?) E.g.:

db.foo.update({},{$update:{avg:{$divide:["$total","$count"]}}},false,true);

I imagine this would be a lot faster than JS and have potentially much better concurrency. Also of note is that using the $out pipeline in the aggregation command might provide a faster method of accomplishing the same as the M/R hack.
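To illustrate how "$"-prefixed field references in such an expression resolve against the document being updated, here is a toy evaluator. The function evalExpr and the two operators it supports are assumptions for illustration; it is not how the server evaluates expressions.

```javascript
// Toy evaluator for a small subset of aggregation-style expressions.
// evalExpr and its supported operators are illustrative assumptions.
function evalExpr(expr, doc) {
  if (typeof expr === "string" && expr.startsWith("$")) {
    return doc[expr.slice(1)];            // "$total" -> doc.total
  }
  if (expr !== null && typeof expr === "object" && !Array.isArray(expr)) {
    const [op] = Object.keys(expr);
    const args = expr[op].map((e) => evalExpr(e, doc));
    if (op === "$divide") return args[0] / args[1];
    if (op === "$add") return args.reduce((a, b) => a + b, 0);
    throw new Error("unsupported operator: " + op);
  }
  return expr;                            // literal value
}

const sample = { total: 10, count: 4 };
const avg = evalExpr({ $divide: ["$total", "$count"] }, sample);
console.log(avg); // 2.5
```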

Comment by Alex Piggott [ 10/Jan/12 ]

We do lots of server-side batch scripts of varying complexity (using a completely unsupported/bug feature in map/reduce, oops: http://groups.google.com/group/mongodb-user/browse_thread/thread/315d8eebea11dcf/d72a56a04d7e94e7#d72a56a04d7e94e7)

Just a vote for being able to pass in functions declared earlier in javascript, eg something like

px = function() { /* processing on "this" */ }

db.coll.update({}, {$function: px})

(which I think is what is currently proposed - but others seemed to be suggesting passing strings around)

Since many of our batch scripts are long and complicated, using "proper looking" javascript is already at the edge of maintainability

(Obviously happy with strings when using the Java driver since there's no alternative!)

Comment by Rudie Dirkx [ 23/Nov/11 ]

If it's server-side – I believe you – how is

$function: function() {this.avg = this.total / this.count}

being sent to the server? Converted to string? So no closures? I don't get it. But don't let that stop anyone! =)

I don't know what you mean by "and it loses any point due to lack of atomicity"...

Comment by Gleb [ 23/Nov/11 ]

I'm sure this is about server-side execution of a function in updates

Client-side you can do it easily anyway, and it loses any point due to lack of atomicity.

Comment by Rudie Dirkx [ 23/Nov/11 ]

Who says the anonymous function is in Javascript? Pass the document as first argument so practically all languages can use it. No need to pass a Javascript function as string.

If you're going with solely Javascript, you're disadvantaging all other drivers. No fair!

Comment by Gleb [ 23/Nov/11 ]

Because the only driver that supports anonymous functions written in javascript would be the javascript one.

Comment by Rudie Dirkx [ 23/Nov/11 ]

Why so difficult? Why not:

collection.update({"_id": "foo"}, function(doc) { return {"avg": doc.total/doc.count} })

and

collection.update({"_id": "foo"}, {"$set": function(doc) { return {"avg": doc.total/doc.count} }})

(The drivers that don't support anonymous functions just shouldn't have this feature IMO.)

Comment by Thilo Planz [ 15/Sep/11 ]

For some simple scenarios, there may not even be a need for JavaScript at all, for example just copying values from other fields. There, an option to refer to a current field value in $set would be sufficient. Or a $copy similar to $rename, except that it does not remove the old field. Is there already a JIRA for this?

Comment by Seth LaForge [ 14/Jul/11 ]

Something to think about: presumably the js will actually be passed as a string, at least for drivers which aren't the JS shell. It'd be nice to be able to pass parameters to the JS, in order to avoid SQL-injection type attacks. Perhaps something like this:

myCollection.update(query, { $eval: { $js: "function(newVal) { if (newVal > this.maxVal) this.maxVal = newVal; }", $args: [myNewVal] } });

I don't know what the options are with the current JS engine, but it would be very nice if it were possible to execute these functions in a manner that doesn't require locking, presumably by restricting their context to only the object being updated. (Of course, this would break Milan's clever pseudo-join - perhaps a variant could allow more global operation, at the expense of more locking.)
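The point about passing parameters separately can be sketched locally as follows. The helper runWithArgs is hypothetical: it compiles the function source once and hands the arguments over as ordinary values, so caller-supplied data is never spliced into the code text. How a server would actually implement $js and $args is not specified in this ticket.

```javascript
// Hypothetical helper: compiles the function source once and passes the
// arguments as ordinary values, so they are never spliced into the code text.
function runWithArgs(doc, source, args) {
  const fn = new Function("return (" + source + ");")();
  fn.apply(doc, args);   // "this" is the document being updated
  return doc;
}

const source = "function(newVal) { if (newVal > this.maxVal) this.maxVal = newVal; }";
const target = { maxVal: 3 };
runWithArgs(target, source, [7]);
console.log(target.maxVal); // 7

// A hostile string stays a plain value; it is compared, not executed.
const other = runWithArgs({ maxVal: 3 }, source, ["1; this.maxVal = 999"]);
console.log(other.maxVal); // 3
```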

Comment by Christian Hubert [ 20/May/11 ]

I imagine the JS-enabled update operation like this:

  • Updating a field:

update( query, { ..., {$js : { fieldName : func( this ) -> value } }, ... )

  • Update the entire object :

update( query, {$js-all : func( this ) -> newThis }, .... )

The idea is to have the choice to use javascript to calculate a single object field or to calculate the entire new object.

Anyway, these kinds of possibilities ARE very important (especially for bulk ops), since the only alternative now is to download objects to the client side to do this... (I don't consider mapreduce well suited to this kind of work).

Hope this (or an equivalent alternative) will come soon!!

Thanks in advance, and thanks for the great work you did with Mongo!

Christian

Comment by Andrei [ 17/May/11 ]

Thank you both for explaining, I understand now.

Comment by Eliot Horowitz (Inactive) [ 17/May/11 ]

The estimate is how much work is involved in implementing, not when it will be implemented.

Comment by Andrei [ 17/May/11 ]

Pardon my ignorance, but what does "Estimate < 1 week" mean? Does it mean that this feature will be implemented in an unstable branch very soon?
Currently I'm employing some ugly hacks to get certain things done in mongodb, and this feature would solve my current bunch of problems.

Comment by Scott Hernandez (Inactive) [ 20/Nov/10 ]

It would be nice if this worked in a language/syntax which wasn't javascript so it could be faster and non-blocking.

Comment by Karoly Negyesi [ 05/Apr/10 ]

Don't forget to allow for xmlhttprequest <evil grin>

Edit: you know, that'd allow (later) for manual conflict resolution when sharding is coming out of a split-brain situation.

Edit2: this is just a crazy idea and not important.

Comment by Milan Iliev [ 17/Mar/10 ]

This will obviate the need for many an eval(), and allow for pseudo-join updates by doing something like

name: function(){ return db.users.findOne({_id: this.user_id}).name }

In short, very much looking forward to it

Generated at Thu Feb 08 02:54:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.