Parent: Document-level locking (SERVER-1241)

[SERVER-1240] Collection-level locking Created: 15/Jun/10  Updated: 25/Oct/14  Resolved: 22/Oct/14

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: None
Fix Version/s: 2.7.8

Type: Sub-task Priority: Major - P3
Reporter: Eliot Horowitz Assignee: Eliot Horowitz
Resolution: Fixed Votes: 387
Labels: None

Issue Links:
Depends
depends on SERVER-4328 db level locking Closed
is depended on by SERVER-679 Authentication should be non-blocking Closed
Related
related to SERVER-1241 Document-level locking Closed
related to SERVER-2563 When hitting disk, yield lock - phase 1 Closed
Tested
Participants:

 Comments   
Comment by Eliot Horowitz [ 12/Nov/10 ]

Just a note to check that auth is non-blocking when this is done.

Comment by Nathan D Acuff [ 06/Dec/10 ]

We have some collections that are very write-heavy and others that are almost all reads - this would be very, very good for us. We are currently considering a complicated solution that exploits sharding; if this one makes it into 1.8, we won't have to.

Comment by Eliot Horowitz [ 11/Dec/10 ]

This interacts with durability too much, so we need to do them in serial.
Durability will be in 1.8, this in 2.0.

Comment by Remon van Vliet [ 15/Feb/11 ]

This is a major issue for us as well. We're at a point where the write lock % is impractically high in production environments, even though writes go to quite a few different collections.

Also, where is the JIRA issue concerning durability?

Comment by Valery Khamenya [ 15/Feb/11 ]

@Remon, "collection level locking" could at least be resolved in an ugly way: you create a separate db and server for such a collection.

What is not solved in any way is record-level (row-level) locking, and that becomes pretty unpleasant.

see closed ticket http://jira.mongodb.org/browse/SERVER-1169

Comment by Remon van Vliet [ 15/Feb/11 ]

Valery, that's actually the solution we're chasing now but it has a lot of serious drawbacks.

Document level write locks for updates and deletes should be possible but I'd be happy with collection write locks for inserts, updates and deletes.

Comment by Valery Khamenya [ 15/Feb/11 ]

@Remon, I see.
However, who knows: if the locking were record-level, maybe you would never see your lock ratio reported at "100%"?
Maybe never even 1%?

Comment by Eliot Horowitz [ 16/Feb/11 ]

Pushed out in favor of SERVER-2563 which will be more beneficial in more situations.

Comment by Remon van Vliet [ 18/Feb/11 ]

Agreed, SERVER-2563 will remove the high lock contention we're seeing and is a less specific solution to the problem than collection level locking

Comment by Sandeep Mukhopadhyay [ 26/May/11 ]

Read performance is affected by the same issue, which blocks us from using MongoDB in production.

Thanks
Sandy

Comment by Colin Mollenhour [ 12/Jun/11 ]

Was kinda shocked to see that Mongo didn't already have collection-level locking... Using a single Mongo instance as a jack of all trades (app data, logging, job queue, analytics, etc.) maybe isn't such a great idea at the moment...

Comment by Travis Whitton [ 02/Sep/11 ]

Wondering if this is still planned now that yielding on disk writes has been implemented? We have a lot of use cases where collection-level granularity would almost certainly improve performance, as we're currently addressing lock contention by sharding across multiple mongod instances on the same server.

Comment by Eliot Horowitz [ 02/Sep/11 ]

It is still planned.

Comment by Carl Mercier [ 30/Sep/11 ]

+1

Comment by Michael Tucker [ 08/Nov/11 ]

I have also been bitten by using Mongo as a jack of all trades, so +1 for me too.

Comment by Fredrik Björk [ 05/Apr/12 ]

Any progress on adding this?

Comment by Dwight Merriman [ 05/Apr/12 ]

see SERVER-4328 for starters

Comment by Valery Khamenya [ 25/Jul/12 ]

Guys, honestly, I'm doing my best to stay positive. But an issue that has been open for two years like this one is a killer for thousands of success stories. I don't know who sets priorities on your side, but this issue is a huge failure. Pass my comment along to this "Steve Jobs" of yours, please.

Come on, guys, you created a great product, so why ruin its success this way? Frankly, was it that hard?

I have had one hour for meditation, but am still waiting:
db.currentOp(); ==>
{
    ...
    "lockType" : "read",
    "waitingForLock" : true,
    "secs_running" : 4128,
    "ns" : "mydb.collection_A",
    "query" : {
        "count" : "collection_A",
        ...
    }
}
{
    "lockType" : "write",
    "waitingForLock" : false,
    "secs_running" : 6724,
    "op" : "remove",
    "ns" : "mydb.collection_B",
    ...
}

Guys, what you hear is a totally unhappy voice. I do understand that being totally unhappy is a clear sign of my personal failure in my own wrong expectations. However, don't miss your part in my failure:
1. MongoDB never sent signals like "we are building academic, experimental products, please don't use them in real-life applications!"
2. "Collection level locking" was a very basic, high-impact expectation.
3. The MongoDB guys are too good to have built an architecture so broken that "collection level locking" is hard to implement in two years.

Looks right?

And don't tell me about "we are open source", etc. If there was any lack of resources and/or other problems, why didn't you inform us over the last two years? I'm quite sure people (poor me included) would have proposed solutions. Also, don't tell me about things that have to fit in RAM; I have terabytes of data and I do fight for RAM.

Last but not least, the nice Google trend for MongoDB should be a signal to you that great success comes with great responsibility, no matter what your business model is and how steep your growth is right now.

Please, release this feature ASAP.

P.S. This is not addressed to the devs who work hard on closing the prioritized issues (thank you, guys!). It is addressed only to those who set the priorities.

Thanks.

Comment by Matt Parlane [ 25/Jul/12 ]

Valery, I'm getting sick of your trolling – shut up.

The 10gen guys have been very clear about their plans, db-level locking in 2.2 and collection-level locking in 2.4.

The pace of development is hugely impressive here, and you're also getting it for free so you have no right to complain.

Comment by Dwight Merriman [ 25/Jul/12 ]

@Valery consider using 2.2 and a db per collection (if you have a few collections) or bucketing your collections into N dbs to get some more concurrency if you have, say, 5000 collections. 2.2rc0 is out, so it's almost there.

From the snippet of output I can't tell if the problem has anything to do with concurrency. If the remove is (by intent) doing a large table scan, it will yield as it goes along, albeit it will take a long time to run if there are millions or billions of documents. I would suggest discussing tunings in the support forums.
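
For anyone trying the bucketing approach above, a minimal shell sketch (the app_bucket naming, bucket count, and sample collections are illustrative, not from this ticket):

  // spread collections across N databases so each 2.2 per-database write lock
  // covers fewer collections
  var NUM_BUCKETS = 8;
  function bucketFor(collName) {
      // deterministic hash of the collection name -> bucket index
      var h = 0;
      for (var i = 0; i < collName.length; i++) {
          h = (h * 31 + collName.charCodeAt(i)) % NUM_BUCKETS;
      }
      return h;
  }
  function bucketedCollection(collName) {
      // e.g. "events" might land in database "app_bucket_3"
      return db.getSiblingDB("app_bucket_" + bucketFor(collName)).getCollection(collName);
  }
  // writes to different collections now contend on different per-database locks
  bucketedCollection("events").insert({ts: new Date(), msg: "hello"});
  bucketedCollection("users").update({_id: 1}, {$set: {lastSeen: new Date()}});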

Comment by Valery Khamenya [ 25/Jul/12 ]

@Dwight, thanks for reply and for your great contributions to MongoDB project.

Well, I have 50 collections "only", but also several DBs. Multiplying the DB count will involve a lot of (unneeded?) re-engineering.

Regarding yielding: I've terminated the 2-hour long remove and then the count-query immediately succeeded.

TBH, it is still quite a pity that this issue is scheduled for 2.4 only. Dwight, I'm sorry for my long comment and the low amount of positivity, but I still wouldn't change much in that comment.

Comment by Philip Gatt [ 02/Oct/12 ]

@Valery makes some good points that I can relate to. I run a large site (alexa rank 800). We started using mongodb about 5 months ago for modeling follower lists. We're now considering migrating away from mongo because of the locking issues we've been facing.

With a standard SQL solution we don't get all of the JSON goodness that I love about mongo. What we do get is a server that can easily utilize multiple CPU cores without us needing to be concerned with sharding our data set.

It's very unfortunate that our mongo experiment may end as a failure, and when Valery offers you guys some necessary suggestions he gets called a troll. I hope mongo has a more mature community than Matt would make it appear.

Comment by Tuner [ 18/Oct/12 ]

Crossing my fingers to see this feature, I would want to use MongoDB more!

Comment by Eliot Horowitz [ 18/Oct/12 ]

With all of the changes in 2.2, it would be great to see examples of any issues in 2.2.
Not sure collection level locking really is the best next step: with the yielding, db-level locks, and other changes, at a disk level there is often only locking around single records.

So if anyone watching this ticket has any locking issues in 2.2, would be great to see them.

Comment by Vadim Voitsiakh [ 05/Dec/12 ]

Sorry for the absence of statistics and precise numbers, but I can surely say the following.
I'm using sharding, with one big collection (>350m docs) divided across two servers and a few non-sharded collections left on the primary shard.
There are many inserts/updates/reads against the big collection, and due to this workload the other non-sharded collections suffer badly on simple reads (it was annoying to see that querying a document from a non-sharded collection by _id took more than 100ms). This problem was solved by moving the primary to a third server.

Comment by Kevin J. Rice [ 26/Feb/13 ]

Echoing Vadim, almost:

  • lots of 4k docs in one collection;
  • locking is a huge slowdown (it induces delay in updates of single documents);
  • we have 2 primary servers plus 2 servers for replica sets;
  • achieved a MASSIVE performance improvement (Mongo 2.2.1) using 24 shards (12 each on 2 servers);
  • improvement on the scale of: max 8 writers with 12 shards; max 28 writers with 24 shards;
  • now trying 48 shards, but running into max connections problems, so we may go back down to 24;
  • note, you MUST adjust oplogSize, since 5% each * 24 per server > 100% of disk space for oplogs alone (yuck, but easy to fix); see the sketch after this comment.

UPDATE: 48 shards works, but we're having to add virtual IP addresses to split the number of connections to any one IP, to ensure we stay under 64K of them.
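
A minimal sketch of capping the oplog per shard (the 512 MB value, paths, and names are illustrative, not from this setup); without an explicit --oplogSize, each mongod defaults to roughly 5% of free disk space, which multiplied by 24 shards per machine exceeds the disk:

  mongod --shardsvr --replSet shard03 --dbpath /data/shard03 --port 27018 --oplogSize 512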

Comment by Oleksii Samodid [ 28/Mar/13 ]

Hello guys!
We are using MongoDB, and we have now hit this issue with database locks. Can you give us a current estimate for the collection level locking implementation?
I'm asking because we need to decide whether we should move each collection into a different database, or just wait a while for collection level locking to be done.

Thanks in advance.

Comment by Michael Catanzariti [ 30/Mar/13 ]

Hi Kevin J. Rice,

I was wondering, with a setup of multiple shards on the same machine, whether the mongod processes would fight for memory, especially if the data working set is larger than the amount of available RAM on the server.

Comment by Ben McCann [ 19/Apr/13 ]

This is a really frustrating issue. Any update on updating from "Planning Bucket A" to a scheduled release?

Comment by Fred Rogers [ 13/May/13 ]

Why is this not being worked on? We are constantly constrained by locking. mongostat is always reporting lock percentages between 50-100%. This design is super, super frustrating. This is the 2nd most requested feature/bug fix. We don't need new features if basic DB functionality won't work.

Comment by Fernando [ 15/May/13 ]

I've been following this specific ticket for a while now. It's the ONLY reason we backed out from using MongoDB in a production environment. I'm sure it's not an easy fix, but we moved to Couchbase 2.0 because of it.

Comment by Eliot Horowitz [ 15/May/13 ]

We're definitely working on this, it's just in pieces.

Additionally, the cases we're generally seeing now (after the changes from 2.0 -> 2.4) wouldn't be helped by this as much as by other changes related to concurrency around journalling, replication and sub-collection locking.

Can you describe the use case and what you're seeing? Want to make sure collection level locking is the right solution.

Fred: for example, what version are you on and can you send the logs for a sample?
Maybe a subticket of this ticket would be good.

Comment by Kurt Radwanski [ 02/Jul/13 ]

Given the default behavior of client libraries to return from a write as fast as possible and let the write happen in the background, I'm semi-concerned this might eventually get exploited by attackers in certain situations.

An example:

1. Your database is `mydb` (all write locks hit this).
2. Your user collection is `User` within `mydb`.
3. `User` has a field for "signature" or something else easily/readily changeable by a user.
4. Attacker repeatedly changes the field over and over again, as fast as possible. Multithread or otherwise distribute the requests. The web server still returns very quickly, masking the actual overhead of the write.
5. Reads on `mydb` by other clients increasingly become blocked by the resultant write locks, bringing the site to a standstill with relative ease.

Obviously you could mitigate the issue at the application level (e.g., rate-limiting, captchas, setting the flag to force journal flushing/write acks, etc.) or by restructuring to use multiple databases (which still won't remove the same single point of failure if someone is using pseudo-joins, or if another collection within the database needs frequent access).

It might be an idea to allow "unsafe reads" of some sort that can read through a write lock, but this would, at the very minimum, require updates to client libraries to support find() query flags (not to mention probably have implications in sharding).

Obviously this isn't as much an issue for infrequently updated or non-public-servicing databases, but at the very least, it makes sense that collection-level (and ideally, document-level) locking should be considered increasingly important--that is, if I'm understanding the current state of them correctly.

Comment by vivek bajpai [ 24/Jul/13 ]

I also have the same issue with collection level locking. It takes a long time to read and write on the same collection. In the log it can be easily traced:
nscanned:4 nupdated:2 keyUpdates:4 numYields: 1 locks(micros) w:2461970 10275ms
nscanned:4 nupdated:2 keyUpdates:3 numYields: 1 locks(micros) w:2475463 8847ms
The collection has only 70000 documents, but concurrent reads are high, so updating a single document sometimes takes more than 10 sec.
Any idea how to resolve it?
What I have already done:
1. Replication with sharding using a 3-member replica set; collection level sharding is also enabled.
2. Each member is placed on an EC2 Medium instance.
3. Queries are also bounded by indexes.

Comment by Thomas [ 18/Aug/13 ]

I would very much appreciate this.
A possible use case for this is given in the docs:
http://docs.mongodb.org/manual/use-cases/metadata-and-asset-management/
Think of a multi-tenant CMS with this design and a collection (did I hear "bin"?!) for each tenant.
In such a scenario, the collections are completely isolated and consistency must only be assured at the collection level.

Comment by Geert Pante [ 21/Nov/13 ]

I'm not sure if this is the correct issue.

My use case is the following:
We have two collections, tweet_share and access_log, where we log filtered tweets and access logs for a big online newspaper. We have two aggregators that read from these collections, summarize per article by 10-minute interval, hour, day and month, and store the results in a third collection. If these collections are in the same database, writing access log entries is slowed down by writing aggregate results from the twitter collection to the summary collection, and vice versa.

If we put each collection in a separate database, throughput is much higher. This is harder to manage, but it works...

Please make the write lock hold for only one collection at a time.
Is this the correct issue for this use case?
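
For reference, a minimal shell sketch of splitting one collection out into its own database on an unsharded mongod (the database names here are hypothetical); the renameCollection admin command can move a collection across databases:

  // move access_log out of the shared "newsdb" database into its own database,
  // so its writes take a different per-database lock than tweet_share
  db.adminCommand({
      renameCollection: "newsdb.access_log",
      to: "access_log_db.access_log"
  });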

Comment by Vincent [ 22/Jan/14 ]

Could we have some sort of ETA for this? It's particularly problematic when you have a small multicore CPU (say 2 cores / 4 threads with hyperthreading, like an Atom N2800), because the workload tends to be CPU bound. In my case secondaries can't keep up with updates because of this: MongoDB is using only one of the 4 apparent cores (i.e. 1/2 to 1/4 of the total CPU power), leading to incredibly poor performance. However, primaries are fine with lower (but still high) CPU usage; I don't really understand why.
I'm considering moving collections to different databases, but that's a real pain to manage (not a big deal to do at the beginning, but managing the change is the real big deal).

Comment by Roman Sh [ 03/Feb/14 ]

Please do not keep silent; when will you tell us about your plans? Like many people here I have been waiting a long time, but I shall switch to TokuMX in the next few months if MongoDB does not get document-level locking the way TokuMX has.

Comment by msalvado [ 03/Feb/14 ]

For what it's worth, from a project that recently quit MongoDB for a competitor, my advice would be to quickly change the way you address your community's needs and the way you expose your technical vision and strategy to your community. I can't believe that this particular issue has not been addressed yet. We're in 2014; people no longer stick to one technology for a lifetime. We now choose the best tool for the usage at a particular time, and we don't hesitate to migrate to another technology in the middle of a product's life when it better addresses our business requirements and operational needs. It's too bad, because I still find MongoDB has great features; it's just too limited right now when it comes to scaling writes.

Comment by Eliot Horowitz [ 03/Feb/14 ]

Hi,
Speaking to the higher level issue of write scalability, there are a number of things going on right now.

While 2.6 doesn't address lock granularity specifically, it has a lot of work for write scalability and prep work that sets the stage for the lock granularity work.

A couple of things in 2.6 impact write scalability:
  • The new query execution engine can use multiple indexes for a single query, which in many cases greatly reduces the number of indexes needed (the number of indexes is the largest impediment to write scaling).
  • Much of the work we used to do inside locks, such as query and update parsing and _id generation, is now pulled outside of them.
  • We've done a lot of work on improving oplog write concurrency, both by making each write faster and by changing the locking around how it works. This improves concurrency in 2.6, but it was also required before more granular locking would be beneficial; otherwise the oplog would immediately bottleneck everything.
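
For illustration, a minimal shell sketch of the index intersection behaviour mentioned above (the orders collection and its fields are hypothetical, not from this ticket):

  // two single-field indexes; in 2.6 the planner can intersect them for the
  // query below instead of requiring a compound { status: 1, customerId: 1 } index
  db.orders.ensureIndex({ status: 1 });
  db.orders.ensureIndex({ customerId: 1 });
  // explain() can be used to check whether the planner chose an intersection plan
  db.orders.find({ status: "open", customerId: 42 }).explain();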

Of course lock granularity is also critical. 2.8 will definitely have finer grained locking, the specifics of which we'll lock down over the next couple of months.

In the meantime, there are lots of methods for increasing write scalability, so if anyone opens a new ticket, happy to help in a specific case.

-Eliot

Comment by Vincent [ 03/Feb/14 ]

I had never heard about TokuMX, but now I simply hope MongoDB, Inc. will use their $150M to buy that little company or integrate their changes (why not, since they are open source!).
This request (collection level locking) was created almost 4 years ago! And it is still not even planned yet...

Comment by Roger Binns [ 04/Feb/14 ]

We are also going to have to change database engines because of MongoDB's pitiful locking contention and performance. I'd love to see some hints as to what these "lots of methods" are since I can only see three:

  • Make each document bigger instead of several smaller ones
  • Use more databases
  • Use more machines (shards)

I did try TokuMX about a month ago. It had a drastically smaller database size (the mongodump BSON was ~360GB; Toku only needed 53GB, including indexes, to store it). The import and index generation were a lot faster too. However, it fell over with an internal locking error when running an overnight job (about 100:1 reads versus writes).

Comment by Pieter Willem Jordaan [ 17/Mar/14 ]

I made the mistake of assuming that locking was the bottleneck. I tried TokuMX and found no improvement, perhaps only slightly better concurrency and a smaller footprint. Upon further benchmarking, profiling and testing I found that it was my own bad design which led to the bad performance. Furthermore, I would not recommend TokuMX, as it is currently prone to crashes on a "Stable" release.

I've been getting much (10-20%) improved performance with the new 2.6.0-rc0 branch.

Comment by Ben Brockway [ 17/Mar/14 ]

@Eliot Horowitz - Is there any updated documentation to support the multi-index queries enhancement that you mention? I would quite like to understand how it works as it may be incredibly beneficial for me. I can't see any information on this in the 2.6 docs (yet). Thanks

Comment by Stephen Steneker [ 17/Mar/14 ]

Ben Brockway: the new index intersection feature is mentioned in the 2.6 release notes under Query Engine Improvements.

If anyone has further questions on MongoDB 2.6 or other topics, can you please post to the mongodb-user discussion group? Comments on this Jira issue should be kept relevant to the "collection level locking" feature request, and your questions will also have more visibility on the community forum.

Thanks,
Stephen
