[SERVER-124] triggers Created: 02/Jul/09  Updated: 06/Apr/23

Status: Backlog
Project: Core Server
Component/s: Usability
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Eliot Horowitz (Inactive) Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 249
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-30252 Write oplog operations to kafka Closed
is duplicated by SERVER-41735 Auditing Functionality for Mongo DB a... Closed
Related
related to SERVER-6895 TTL collections backups/archival Closed
related to SERVER-13755 Support for tailable cursors over non... Closed
related to SERVER-8777 update the data based on old values a... Closed
Assigned Teams:
Query Execution
Participants:
Case:

 Description   

insert/update/remove/missing object



 Comments   
Comment by Dissatisfied Former User [ 23/Feb/16 ]

Speaking as a user, not a developer, triggers are a point of concern. Obviously not using them should incur no overhead, but they're very non-trivial to implement.

  • Pre- and post- hook points. (With pre- hooks potentially allowing the operation to be invalidated, i.e. a pre-update hook indicating the update should not proceed.)
  • For all hooks, the document must be passed. It's not enough to say "ID X is about to be deleted"; the trigger logic will need to be able to inspect the document.
  • Many triggers may want to perform larger-scale operations, such as delivering e-mail in response to a record change. Clearly, mongod isn't the right place to be doing that, so you might inject a "task" record of some kind which an external process watches for… at which point you might as well be doing the "trigger" monitoring there anyway. (Sketched after this list.)
  • The last point also leads to the potential for database-level amplification attacks. (Requiring careful coding.)
  • Using the new document validation mechanism as a method of filtering which operations execute a given trigger would require iterating over candidates and repeatedly evaluating them; not very efficient, with that overhead added to every call of every operation that is hooked.
  • Like validation, hooks would require some method of being bypassed, plus assorted tool changes to control trigger execution during import/restore.
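
A minimal sketch of the "task record" pattern from the third bullet, borrowing the hypothetical AddTrigger() syntax proposed elsewhere in this thread (nothing here is a real MongoDB API):

// hypothetical: instead of sending e-mail from mongod, enqueue a task
// document that an external worker polls and executes
db.users.AddTrigger({
  type: "update",
  trigger: function (ev) {
    db.tasks.insert({ kind: "sendEmail", userId: ev.recordId, queuedAt: new Date() });
  }
});

At which point, as the bullet notes, the external worker could just as well watch db.users directly and skip the trigger entirely.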

There are existing ODM/DAO layers which provide signal/trigger/callback functionality, such as MongoEngine (Python) or, for the oplog approach, libraries such as mongo-oplog, Coccyx, or moplog (JavaScript examples).
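
For illustration, the mongo-oplog approach looks roughly like this (API per that library's README; treat it as a sketch, not a reference):

// tail the local oplog and react to changes on one namespace
const MongoOplog = require('mongo-oplog');
const oplog = MongoOplog('mongodb://127.0.0.1:27017/local', { ns: 'test.posts' });

oplog.tail();
oplog.on('insert', doc => console.log('inserted:', doc));
oplog.on('update', doc => console.log('updated:', doc));
oplog.on('delete', doc => console.log('deleted id:', doc.o._id));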

Pre-aggregation via upsert operations is effectively the "view" process in MongoDB. Pairing standard inserts with their pre-aggregate updates typically works well for that, without the need to asynchronously divorce the pre-aggregate update from the insert, no? This very much sounds like a problem solved at the client driver (application) level and not server-side.
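
The client-side pairing described above is just two statements per write (collection and field names illustrative):

// raw event insert, paired with an upserted pre-aggregate ("view") update
db.pageviews.insert({ page: "/home", ts: new Date() });
db.pageviews_daily.update(
  { page: "/home", day: "2016-02-23" },
  { $inc: { views: 1 } },
  { upsert: true }
);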

Comment by Vicary Archangel [ 12/Jun/14 ]

It's been around for a few years, any thoughts from the developers?

Comment by Vicary Archangel [ 16/Aug/13 ]

+1 Server scripts (Node.js, Java, and others) would definitely benefit from this, saving enormous resources otherwise spent on polling, while page scripts (PHP, Ruby, etc.) don't really care, as they make queries on every single HTTP request.

Comment by Joel Sanderson [ 30/Jul/13 ]

+1. I would like this feature to be able to maintain consistency of duplicate data across collections. Another use would be executing some server-side JavaScript that updates a "view" as new data gets inserted/updated/deleted, similar to Couchbase views, I believe.

Comment by Flavius Aspra [ 06/Mar/13 ]

This feature would be extremely useful, particularly if there were a trigger fired when the daemon starts.

Comment by Mark Waschkowski [ 08/May/12 ]

Our company is soon needing this kind of functionality.

Issues seen from previous comments, but not yet addressed:
-deletes only show the OID (not anything else about the deleted document), so how can you create triggers around deletes? You really need to know more about a delete to trigger from it, and doing logical deletes is a time-wasting workaround (see the example after this list)
-sharding, and having to worry about multiple oplogs
-optimizations not available (search for 'interesting optimizations' from John Crenshaw)
-the oplog format isn't a portable solution (the format may change in the future), so it's really more of a hack than a real solution
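
For reference on the first point: an oplog delete entry carries nothing but the _id, roughly like this (field values illustrative):

// a remove, as recorded in local.oplog.rs: only the _id survives
{
  "ts" : Timestamp(1336464000, 1),
  "op" : "d",                 // d = delete
  "ns" : "mydb.users",
  "o"  : { "_id" : ObjectId("4fa9d0f37a3c120000000001") }
}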

Eliot, do you really want all the developers out there who need trigger functionality parsing the oplog? If so, fine, I'll write my own version of code to do so. If not, can you let me know what would be required to move this forward? My company wouldn't be able to afford funding this alone, but would have funds to put towards it.

Comment by OptimusPrime [ 08/May/12 ]

+1. Imagine having to check an XSD before inserting a JSON'ed XML document! I know you will say that I can do it in my application. But remember that I may have only one application server but several MongoDB servers.
And here is another reason:
An application is constantly inserting data into MongoDB, and a separate application is watching it live. How can the second application know which rows were inserted? Please don't tell me to use a timestamp or something like that and make periodic queries.
There is a whole bunch of examples out there which actually "need" triggers.

Comment by Yuri Finkelstein [ 02/May/12 ]

This JIRA:
https://jira.mongodb.org/browse/SERVER-5042
would provide everything ordinary triggers can provide and more, since Mongo would also do trigger filtering.

Eliot, polling the oplog is not a portable solution (the oplog format, table names, etc. can change). Instead, enhance your client APIs to receive change notifications in a standard, type-safe manner, with custom filtering on the server (if required).

Comment by Tuner [ 16/Apr/12 ]

+1

Comment by Eliot Horowitz (Inactive) [ 16/Apr/12 ]

A lot of people use the oplog for getting notified of changes.

Comment by Ben McCann [ 16/Apr/12 ]

CouchDB has _changes, which allows it to have very good elasticsearch support. I'd love to see a MongoDB Elasticsearch River, but some sort of post-commit hook would be needed first.

Comment by Eric Mill [ 07/Mar/12 ]

To back up Magnus's comment from 2009, and Roly's more recent comment –

This would be perfect for supporting automatic denormalization with much less code complexity and maintenance. It would also make syncing data with other databases (like ElasticSearch, which I also use) so much easier. Supporting this at the database level, rather than in an ODM or other client library, is by far the most flexible and efficient, and would make a new class of ODM features and plugins possible.

Comment by Mike Giardinelli [ 04/Mar/12 ]

+1, please add. This would be a very beneficial feature and, as noted many times before, competing products already offer it.

Comment by Yosef Dinerstein [ 20/Feb/12 ]

+1 Very important feature.

Comment by Jeff Barczewski [ 16/Feb/12 ]

+1 Adding this feature to MongoDB would be game over for most of the other noSQL engines. This is the feature that pushes people to other engines, once it is here too, then no reason to go elsewhere IMHO.

Comment by Roly Vicaria [ 30/Dec/11 ]

It seems like this is the biggest obstacle to integration with ElasticSearch. Even something like CouchDB's Changes API would be useful.

Comment by Nick [ 29/Nov/11 ]

+1 for this feature, much easier than using Tailable Cursors

Comment by Cem Karan [ 14/Oct/11 ]

I agree with Michael Robinson. This will also make different types of replication/synchronization easier (think optimistic replication or BASE). Right now, the only solution is to create a fake server that communicates with the real server, and offers trigger-like capabilities. NOT optimal!

Comment by Michael Robinson [ 11/Oct/11 ]

There should be pre and post operation triggers. To make the most out of triggers, they need to be able to abort insert and update operations. This would make triggers a flexible solution for apps that require some kind of schema validation.
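
A sketch of what an abortable pre-trigger could look like, borrowing the hypothetical AddTrigger() syntax proposed further down this page (the pre flag and the abort-by-returning-false semantics are equally hypothetical):

// hypothetical pre-insert trigger doing schema validation
db.users.AddTrigger({
  type: "insert",
  pre: true,                                      // run before the write
  trigger: function (ev) {
    if (typeof ev.newRecord.email !== "string") {
      return false;                               // abort the insert
    }
  }
});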

Comment by Steven Ceuppens [ 04/Oct/11 ]

Hi All,

mongoDB looks like an awesome DB; I want to use it more in the future!

And... triggers would make life easier for a lot of developers (I think).

At this moment I'm designing a new application that could use (again) this kind of feature. For a lot of functions like logging, notifying agents, DB checks, ... it would be very handy.

Hopefully there will be an integration soon!

In the meantime, are there some alternatives with mongoDB? I would like to integrate mongoDB in a project with a development time of 1 year, and we start developing in about 1 month... so I can't just wait until it's supported, and urgently need to make some decisions...

Rgds Steven

Comment by drapeko [ 08/Jul/11 ]

It would be a great help for rollup collection implementations (e.g. real-time stats).

Right now it's really a trick, as the implementation has to be transactional.

Comment by John Crenshaw [ 07/Jul/11 ]

After Paris's comments I just realized that nothing above defines what the parameters to the trigger function would be. I think the best option would be to pass an event object, with a structure similar to the following:

{
    collection: ...name of collection, or alternately the collection object...,
    type: "insert" | "update" | "delete",
    recordId: ObjectId(...),
    oldRecord: {...document...},
    newRecord: {...document...},
    update: {...$set/$inc/etc. operations...},
    ...Any other properties that make sense (query or whatever)...,
    ...Any methods that make sense (preventDefault() or whatever)...
}

Using getters to expose the oldRecord, newRecord, and/or update fields would allow the implementation to avoid unnecessary overhead when the trigger doesn't actually use them.
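
For instance, a sketch in plain JavaScript (loadBefore/loadAfter are hypothetical internal loaders): getters defer the fetch until a trigger actually touches the field.

var event = {
  collection: "users",
  type: "update",
  recordId: id,
  get oldRecord() { return loadBefore(this.recordId); },  // fetched only on access
  get newRecord() { return loadAfter(this.recordId); }
};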

"this" should probably be set to the collection object.

I think this is a rough minimum. Without old/new (and/or update) many triggers will be crippled. Triggers to maintain a tombstone collection will probably be common, and will only need the collection name and id (so optimizations to avoid needless overhead on the larger values make sense.) Type is useful for keeping code DRY.

Thoughts?

Comment by John Crenshaw [ 07/Jul/11 ]

@Paris, presumably you should still be able to do that capped collection + MQ thing with triggers. Just set up a simple trigger to push data to the queue collection. Triggers and subscribing to events are similar enough at the lowest levels that either one would allow a user mode implementation of the other.
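
Concretely, something like this sketch (again using the hypothetical AddTrigger() syntax proposed further down this page):

// feed a capped collection that consumers tail like a message queue
db.createCollection("queue", { capped: true, size: 1048576 });
db.orders.AddTrigger({
  type: "insert",
  trigger: function (ev) {
    db.queue.insert({ ns: "orders", id: ev.recordId, at: new Date() });
  }
});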

Comment by Paris Stamatopoulos [ 07/Jul/11 ]

@Keith Fair enough! I rest my case... I saw it as an opportunity to suggest an alternative and I don't want to impose. Triggers are definitely important and my +1 definitely stands!

Comment by Keith Branton [ 07/Jul/11 ]

@Paris, Let's not lose sight of the fact that this ticket is for triggers. Triggers have very specific meaning in the database world: http://en.wikipedia.org/wiki/Database_trigger. I suspect most (if not all) of the votes on this ticket are for database triggers.

If a capped collection would be sufficient for a message queue, then presumably a trigger could add records to such a capped collection and so accomplish this goal too.

One of the things I would want to do with triggers is to make the database responsible for computing and storing values that break 1nf - such as n/sum/max/min/mean of the values in an array and store these so they can be queried on. As application complexity increases, and more pieces of code change collections, (not always being written in the same language) it can be increasingly difficult to enforce the integrity of trivial-to-compute data fields like this throughout the application. Materializing these computable (duplicate) values in the database is often simply an optimization for most DBMS (though it is necessary to get certain queries to work in Mongo). I don't want to set up a message queue, and set up yet another mongo instance because I use sharding, and run a background task polling it so I can keep the number of items in a given array up to date whenever a document changes - that's way too much work.
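
Today that integrity burden falls on every writer: each code path that touches the array must also remember to maintain the materialized value, e.g. (illustrative names):

// without triggers, every writer maintains the computed field by hand
db.posts.update(
  { _id: postId },
  { $push: { comments: newComment }, $inc: { commentCount: 1 } }
);
db.posts.find({ commentCount: { $gte: 10 } });   // the queryable payoff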

Since a reasonably complex object graph can easily be represented by a single document in mongo the "we should not see mongodb as a RDBMS replacement" argument is rather moot. I for one use mongo for operational data, cache, queues, logging and image storage. It results in a super-simple stack that is easy to replicate on developer machines and easy to manage in production. It works very well in applications that do not require transactions (though I occasionally wish it had them) and where joins are seldom required (I tended to design for minimal joins on Oracle too - in heavy use it scales much better without them)

My only real concern is with using javascript functions - javascript generally performs very poorly.

Comment by Paris Stamatopoulos [ 07/Jul/11 ]

You are right for the most part, especially regarding a trigger in a sharded environment. But the concept of having a capped collection containing the changes in your entire dataset opens the possibility of a publish/subscribe interface that could be extended to a more asynchronous environment. Nowadays the amount of data is getting rather huge, and we usually want to push as little information as possible down to the application and eventually to the end user.

A stricter RDBMS trigger, from what I understand, would confine the event to being raised inside Mongo, while it could instead be propagated further down to an application.

Furthermore, what could also work in a sharded environment would be to have a separate mongod instance, like the configuration servers, to optionally propagate changes from triggers. All replicas would propagate their changes to those. (So that you wouldn't have to watch one separate oplog per replica set.)

Finally, from my perspective, we should not see MongoDB as an RDBMS replacement; we should choose MongoDB or any other document database for the unique features it has, and not for its similarities to a system that we already know.

Comment by John Crenshaw [ 07/Jul/11 ]

I'm not sure what would actually be gained by only implementing an oplog. Isn't this basically just the same as triggers, but without support for conditions?

As far as allowing a map action, the standard query syntax is cleaner and more familiar. If you need the level of power available from a map-like filter, a $where clause in the query should give you that. If you want an oplog, just exclude the query parameter (it is optional in my proposed syntax).

Also, using the typical query syntax keeps the door open to interesting optimizations. For example, if the shard key is included in the query, the trigger could be limited to only those mongod instances that may contain the given key, or the query could be optimized to do the least expensive comparisons (boolean, integer) before more expensive ones (strings, $where, regex) when attempting to determine whether a trigger applies to a record. An oplog monitor closes this door forever and guarantees that triggers always have the maximum possible overhead (this doesn't seem ideal).
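
For example (hypothetical syntax, and assuming region is the shard key), a trigger like this could be evaluated only on the shards owning matching chunks, with the cheap equality check run before the expensive range comparison:

db.orders.AddTrigger({
  type: "update",
  query: { region: "eu", total: { $gt: 1000 } },
  trigger: function (ev) {
    db.audit.insert({ id: ev.recordId, at: new Date() });
  }
});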

Finally something more trigger-like and less oplog-like eases the transition from traditional RDBMSs, which is largely in keeping with what Mongo has done so far.

Comment by Paris Stamatopoulos [ 07/Jul/11 ]

+1 However, I don't think there should be a traditional RDBMS trigger in the form John Crenshaw mentions above. I think it would be best to have a mechanism to subscribe to an oplog-like system, thus receiving all events occurring on a collection. Filters could be created via JavaScript to get only the data you need, in case you want to build a listening server (imagine a node.js server emitting events on MongoDB collection changes), or, if you actually want trigger-like functionality, to have, in the spirit of MapReduce, a MapAction facility (map the changes, and perform actions triggered by them).

One could finally apply the filter/MapAction 'before' or 'after' the event.

Comment by Adam Walczak [ 05/Jul/11 ]

+1 this would give some basic messaging capabilities like in Redis

Comment by Kaspar Fischer [ 26/Jun/11 ]

+1 for this feature (reason: ability to index content in Solr or Elastic Search, or other such technology)

Comment by Sean Malloy [ 22/Jun/11 ]

+1 for a feature

Implementation similar to the change notifications in couch

Comment by Aditya Kulkarni [ 14/Jun/11 ]

+1 for this feature.

This helps me easy integration with our other systems which can be indexing, analytic on data.
http://guide.couchdb.org/draft/notifications.html

For now, since feature is not available, any hacks?

Comment by Harald Lapp [ 08/Jun/11 ]

We too moved away from tailing the oplog, because we found it too error-prone to rely on a mechanism that wasn't intended for such things. A trigger would indeed be highly appreciated.

Comment by Colin Mollenhour [ 08/Jun/11 ]

Another reason the "why don't you tail the oplog" solution is not a very good one: when you are sharding, your application now has to be fully sharding-aware, and you'll have to have separate threads watching the oplog of each shard, or copies of your application installed on each shard to watch the local oplog. A trigger mechanism would make it easy again by providing one high-level interface, even if it just amounted to filtering ops and aggregating them into one tailable collection.

Comment by Dan Terrill [ 12/Apr/11 ]

+1 for Magnus's syntax

Comment by John Crenshaw [ 11/Apr/11 ]

Transactions (and/or loose "write batches") only have a small amount of overlap with this. They don't really solve the same set of problems that triggers solve. Don't get me wrong, any sort of transactional support in Mongo would be welcome, but I don't really see this as the same issue.

The main reason I want triggers is to move special indexing (including multi-sharding, field computation, and normalization) logic to the database layer, where it belongs. Right now any type of complex indexing process requires some heavy code in the application. Extreme levels of paranoia are involved in making sure that the code to maintain the index never gets circumvented. Whether or not the index updates are guaranteed eventually consistent is the smallest part of the problem.

Comment by Andrew Armstrong [ 11/Apr/11 ]

I just realised that this ticket is pretty similar to my suggestion of 'transactional write batching' at SERVER-2804

Personally I'd lean towards write batching instead (I'd rather logic stay in the application than in the database - triggers are the devil) but I have voted for this anyway, as it would definitely be useful if write batching can't be done.

Comment by Raviv Pavel [ 16/Mar/11 ]

+1

Comment by Barry Kaplan [ 08/Mar/11 ]

One thing to be aware of when using the oplog to send events: For deletes you will only have the OID available.

Comment by Zach Smith [ 08/Mar/11 ]

+1

I would love to see this implemented.

Comment by Jonas Lindholm [ 11/Jan/11 ]

+1

This is a much wanted feature and I hope it will be in the next major release.

Comment by Barry Kaplan [ 19/Dec/10 ]

For certain applications I agree with Tim. I created a simple oplog observer that sends out CRUD events. Quite trivial, actually.

Comment by Tim Hawkins [ 19/Dec/10 ]

I'm not sure triggers are the correct solution to this issue. I personally believe that a slave framework that can act as an oplog reader, but with pluggable actions attached to it, would be a better way of handling this. That would allow all sorts of the integrations outlined above to be performed without having to add any specific support to the core system.

1) Mirroring systems, like logs, fulltext indexes, even mirroring changes to other databases.
2) Converting events in MongoDB to message bus events, etc.

Something simple like a daemon that can spawn other processes, passing JSON/BSON changes, similar to FastCGI, would allow folks to attach anything they like to the oplog.
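
A minimal shell-JavaScript sketch of that reader (the pluggable handlers map is hypothetical; the tailable-cursor options are real):

// tail the oplog and dispatch each entry to pluggable handlers
var handlers = { i: onInsert, u: onUpdate, d: onDelete };   // user-supplied
var cursor = db.getSiblingDB("local").oplog.rs
  .find({ ns: "mydb.mycoll" })
  .addOption(DBQuery.Option.tailable)
  .addOption(DBQuery.Option.awaitData);
while (cursor.hasNext()) {
  var entry = cursor.next();
  if (handlers[entry.op]) handlers[entry.op](entry);
}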

Comment by Jeremy [ 08/Dec/10 ]

I agree. This would help.

To Matt's and others' point, there were times when I thought I was using the wrong DB because I was missing SQL features like this. But I like the Mongo approach, which seems to be: if it is possible, why not do it, as long as it doesn't compromise anything.

Here is what I would like to use it for:

http://jira.mongodb.org/browse/SERVER-1650
http://groups.google.com/group/mongodb-user/browse_thread/thread/9276087d9cfc4741/e03ad48cbbaf5b5d?lnk=gst&q=polling#e03ad48cbbaf5b5d

Comment by Barry Kaplan [ 15/Oct/10 ]

John Crenshaw is right on the mark. This would be a wonderful API/syntax.

Comment by John Crenshaw [ 14/Oct/10 ]

+1

Shouldn't a basic version of this feature be trivial to implement?

@Matt, This is more important because Mongo isn't an RDBMS. Document databases like Mongo use data duplication to solve problems that relational databases solve with joins. Triggers would allow this to be handled trivially, with greater robustness. I can't imagine this would slow down your database. It should only require one extra op when no triggers are defined (you won't notice a single op).

@sandstorm, Triggers in the application layer look like a good idea...until you start needing to do batch updates and/or complex atomic operations. I can't imagine an application of any reasonable size that doesn't eventually need to update a number of records at once. If triggers are handled at the application level, the application has to fully read in every record it modifies, process any triggers, and write it out. Aside from being more work to code, this causes a ton of needless IO.

My vote is for the following syntax (similar to Magnus's, but simpler in the common cases):
db.MyCollection.AddTrigger({
    trigger: <triggerfunction>
    [, type: <'insert' | 'update' | 'delete' | 'all' (default)>]
    [, query: <query filter object>]
    [, async: <true if the trigger can be completed asynchronously,
               false (default) if the trigger must be completed as part of the update/insert/delete>]
});

An optional "fields" parameter (that works like the fields in find()) could be added iff it would reduce resource use (Memory, CPU, IO).
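
Hypothetical usage of this proposal, for the tombstone case mentioned in the earlier comment:

// maintain a tombstone collection asynchronously on every delete
db.users.AddTrigger({
  type: "delete",
  async: true,
  trigger: function (ev) {
    db.users_tombstones.insert({ _id: ev.recordId, deletedAt: new Date() });
  }
});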

Comment by Kostyantin Voz [ 30/May/10 ]

I've understood your "humble opinion" )) In large projects there are message buses or event-oriented architectures. What if notification were integrated in such a way that it could be disabled and wouldn't affect performance, or could be enabled and would use some resources - but a lower amount of them, compared to creating a fully functional message bus in every project which needs it.

Comment by Mark Waschkowski [ 29/May/10 ]

-1 to Matt Insler's comment

Comment by Matt Insler [ 29/May/10 ]

In my humble opinion, I think you guys might be using the wrong database for your problem set, or maybe you're looking at your problem set in too much of an RDBMS light. Audit tables can be created by things other than triggers. In Mongo, I would just tail the replication log and persist that to a non-capped collection. Or, you can use your slow triggers in your MySQL database to write audits into a Mongo collection, which will then be many many many times faster to query.

A big thing to realize is that Mongo is eventually consistent. If you're using triggers to create audit tables for financial transaction information or audits that really really need to be there, then password-protect your database and write the audits in your application layer. I disagree with any trigger mechanism that will slow down my database. If there must be a notification layer, then I think it should be a capped collection that any client can use a tailable cursor on. This would be the fastest from a database perspective and just as flexible as any suggestion here.

I moved away from Oracle and MySQL because I didn't like the monolithic approach anymore, where I was pinging web services and sending emails and using wget from within PL/SQL. This just doesn't scale. You can easily use the replication log or something akin to Amazon SQS to distribute the "trigger" operations horizontally across your entire infrastructure. Just think of how much nicer that would be for scalability!

Comment by Ted X Toth [ 11/May/10 ]

I'd like to be able to update a user interface when another app updates a db, +1.

Comment by Kostyantin Voz [ 14/Mar/10 ]

But as there are no C++ stored procedures, and I'm sure they're not easy to implement, an external message bus seems the only way...

Comment by Kostyantin Voz [ 14/Mar/10 ]

I like this feature, but I would like to have the ability to use C++ for it (to use 3rd-party DLLs) and be sure that the code will be executed on the same machine as the corresponding mongod process. I don't like the idea of client modification; that's not enough. I like the Postgres trigger feature with calling stored procedures written in C++, but an external message bus looks more flexible (maybe I'm wrong and there is no difference in flexibility, but I'm sure an external message bus will be slower).

Comment by Mark Waschkowski [ 10/Mar/10 ]

"My desire for this feature is to publish changes to user interface clients. For simple entity updates its easy enough to handle in the application. But when criteria based updates are used there's not much the application can do to notify clients of the changes."

Good point. This feature would be ideal for pushing changes through to clients rather than some ad hoc polling mechanism.

Another use case: when doing data imports. The import routine really shouldn't need to know about the business rules of a system, and yet if triggers are attached to the database itself, business operations (say, an email) can still be carried out without the import worrying about a thing.

I do have need of triggers as well, voting for this.

Comment by Barry Kaplan [ 10/Mar/10 ]

My desire for this feature is to publish changes to user interface clients. For simple entity updates it's easy enough to handle in the application. But when criteria-based updates are used, there's not much the application can do to notify clients of the changes.

Comment by Jonathan Moss [ 23/Feb/10 ]

+1 on Magnus Persson's suggestion.

I really like the idea of notifications based on selection criteria, then using stored JavaScript functions to listen for notifications and act accordingly. It allows for some very flexible arrangements.

Comment by Mark Waschkowski [ 19/Feb/10 ]

Audit logs are really easy with this kind of feature, but there are a number of other use cases mentioned in this JIRA that need a trigger mechanism.

Comment by Valentin [ 11/Dec/09 ]

It would be really helpful.

Because of limited queries, we sometimes need collections to have some denormalized data, just to help searching. For example, we can store an array's size in a separate field to allow querying with size greater or less than a constant, using indexes. Without triggers, maintaining these fields can be a headache.
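
Concretely (illustrative names), the workaround exists because $size supports neither range comparisons nor indexes:

db.carts.find({ items: { $size: 3 } });       // exact size only, no range, no index
db.carts.ensureIndex({ items_count: 1 });
db.carts.find({ items_count: { $gt: 3 } });   // denormalized field: indexable range query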

Comment by Magnus Persson [ 25/Sep/09 ]

+1 on this feature. I for one would find it very beneficial if events from Mongo could be aggregated out onto a message-bus-like thing, but that might be further down the line... braindump follows...

The simplest form of these triggers could be something like PostgreSQL's notifications: db.myCollection.addNotification("insert", prototype, "notificationName"), where prototype is a query-like object which, when matched against the operation object(s), leads to Mongo notifying listeners. These could register either globally or per collection: db.myCollection.addListener("notificationName", function(notificationData) { ... }). Other semantics could allow these events to be triggered pre and post the operation being processed, additionally allowing listeners to modify the objects before they are stored.

Notifications could also be prioritized to have some control over the order in which listeners are called. This could be an extra parameter to the addNotification function, or included as metadata on the prototype object?

Triggers in the application layer are imho "okay", but consider the situation where you do not have access to it. Here, without triggers, you need to poll the database for changes (performance concerns) and also add infrastructure to determine when and how the database has changed.

Comment by sandstrom [ 14/Jul/09 ]

Sounds like a relational database to me. I think that query/insert (i.e. the storage part) is the important thing. Triggers are often happier in the application layer.
