[SERVER-10723] Bulk insert is slow in sharded environment Created: 10/Sep/13 Updated: 09/Jul/16 Resolved: 06/Nov/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 2.4.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Anton Kozak | Assignee: | Asya Kamsky |
| Resolution: | Duplicate | Votes: | 9 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Linux CentOS 2.3, sharded environment (12 shards); sharding was done based on a hashed key with pre-splitting. |
| Attachments: |
|
| Issue Links: |
|
| Operating System: | ALL |
| Participants: | |
| Description |
|
We've recently run into a performance issue: bulk insert operations are slow in a sharded environment: |
| Comments |
| Comment by Anton Kozak [ 04/Mar/14 ] | |||
|
That's very good news, thank you very much! | |||
| Comment by Daniel Pasette (Inactive) [ 04/Mar/14 ] | |||
|
For those watching this ticket, we just committed a very large perf improvement for ordered bulk writes which will be in 2.6.0-rc1: | |||
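A minimal sketch of driving the new bulk API from mongo-java-driver 2.12.x, assuming a mongos reachable at a placeholder address and illustrative database/collection/field names; the unordered form is the one that lets mongos group and dispatch per-shard batches more freely:

    import com.mongodb.BasicDBObject;
    import com.mongodb.BulkWriteOperation;
    import com.mongodb.BulkWriteResult;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;

    public class BulkInsertSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder mongos address; adjust for your cluster.
            MongoClient client = new MongoClient("mongos-host", 27017);
            DBCollection coll = client.getDB("test").getCollection("events");

            // Ordered bulk: operations are applied in the order they were added.
            BulkWriteOperation ordered = coll.initializeOrderedBulkOperation();
            for (int i = 0; i < 10000; i++) {
                ordered.insert(new BasicDBObject("k", i).append("payload", "x"));
            }
            BulkWriteResult orderedResult = ordered.execute();
            System.out.println("ordered inserted: " + orderedResult.getInsertedCount());

            // Unordered bulk: the server may apply per-shard batches in any order.
            BulkWriteOperation unordered = coll.initializeUnorderedBulkOperation();
            for (int i = 0; i < 10000; i++) {
                unordered.insert(new BasicDBObject("k", i).append("payload", "x"));
            }
            BulkWriteResult unorderedResult = unordered.execute();
            System.out.println("unordered inserted: " + unorderedResult.getInsertedCount());

            client.close();
        }
    }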
| Comment by Anton Kozak [ 24/Feb/14 ] | |||
|
Hi Asya, Yes, all shards were started on the same machine; the idea was to understand what performance the new feature actually gives and to provide early feedback, since my current project really needs fast bulk inserts/updates. Today I reran the tool, trying to avoid external impact (antivirus, updates, other processes, etc.); the results are better but not excellent: single inserts Unfortunately I cannot deploy rc0 to any of the existing environments, but I'm going to create a simplified environment on AWS for this testing. We haven't tested ACKNOWLEDGED on 2.6.0, but on 2.4.8 this level causes serious performance degradation when there are many shards (12 shards in our case): UNACKNOWLEDGED/NORMAL It would be great to find the results of the performance testing done by MongoDB (synthetic testing on an optimized environment). Are they available somewhere? Thank you, | |||
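A small sketch of the kind of write-concern comparison described here, assuming the 2.x Java driver and placeholder host/collection names; the timings are whatever your own environment produces, nothing below reproduces the ticket's numbers:

    import java.util.ArrayList;
    import java.util.List;

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.DBObject;
    import com.mongodb.MongoClient;
    import com.mongodb.WriteConcern;

    public class WriteConcernTimingSketch {
        public static void main(String[] args) throws Exception {
            MongoClient client = new MongoClient("mongos-host", 27017); // placeholder address
            DBCollection coll = client.getDB("test").getCollection("bulk_test");

            List<DBObject> docs = new ArrayList<DBObject>();
            for (int i = 0; i < 10000; i++) {
                docs.add(new BasicDBObject("k", i));
            }

            // Fire-and-forget: the driver does not wait for a server response.
            long t0 = System.currentTimeMillis();
            coll.insert(docs, WriteConcern.UNACKNOWLEDGED);
            System.out.println("UNACKNOWLEDGED: " + (System.currentTimeMillis() - t0) + " ms");

            // Waits for the server to acknowledge each batch it sends.
            List<DBObject> docs2 = new ArrayList<DBObject>();
            for (int i = 10000; i < 20000; i++) {
                docs2.add(new BasicDBObject("k", i));
            }
            long t1 = System.currentTimeMillis();
            coll.insert(docs2, WriteConcern.ACKNOWLEDGED);
            System.out.println("ACKNOWLEDGED: " + (System.currentTimeMillis() - t1) + " ms");

            client.close();
        }
    }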
| Comment by D.H.J. Takken [ 24/Feb/14 ] | |||
|
Could anyone please add a remark to the db.collection.insert() documentation about this issue? That would explain the performance problems for quite a few people, I guess. | |||
| Comment by Asya Kamsky [ 24/Feb/14 ] | |||
|
It looks like you are running all the servers (shards, config and mongos and your Java process) on the same machine (localhost). Can you try it on a regular set-up? A realistic test should probably insert with more than a single thread, use separate machines, and not use the smallfiles option on the shards, since that will slow things down significantly when inserting a lot of new data. "Acknowledged" having higher latency than "unacknowledged" is to be expected - is it significantly different from previous versions? Do you have a separate test for that? | |||
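A rough sketch of the multi-threaded insert test being suggested, assuming the 2.x Java driver; the mongos address, thread count and document shape are placeholders:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.MongoClient;
    import com.mongodb.MongoClientURI;

    public class ParallelInsertSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder URI pointing at a mongos on a separate machine.
            final MongoClient client = new MongoClient(new MongoClientURI("mongodb://mongos-host:27017"));
            final DBCollection coll = client.getDB("test").getCollection("load_test");

            int threads = 8; // illustrative; tune to the hardware under test
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            for (int t = 0; t < threads; t++) {
                final int threadId = t;
                pool.submit(new Runnable() {
                    public void run() {
                        // MongoClient is thread-safe; each worker inserts its own range.
                        for (int i = 0; i < 50000; i++) {
                            coll.insert(new BasicDBObject("thread", threadId).append("seq", i));
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            client.close();
        }
    }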
| Comment by Anton Kozak [ 23/Feb/14 ] | |||
|
Please reopen this issue; unfortunately 2.6.0-rc0/mongo-java-driver:2.12.0-rc0 does not resolve it. 2.6.0 - old bulk (collection.insert(values)) 2.6.0 - unordered bulk 2.6.0 - single inserts There is another issue with the ACKNOWLEDGED write concern: it is extremely slow in a sharded environment (8-9 times slower than UNACKNOWLEDGED/Normal). Please share the results of the performance testing you have done (different modes and write concerns). Thanks, | |||
| Comment by David Storch [ 06/Nov/13 ] | |||
|
Resolving as duplicate. If you feel that the issues brought up in this ticket have not been adequately addressed, feel free to re-open. | |||
| Comment by David Storch [ 05/Nov/13 ] | |||
|
shadowman131/avish: order can indeed matter, and preserving the current behavior is important to make this as safe as possible. For example, a single operation in a batch of updates can depend on previous operations. In the insert scenario, whether a duplicate key error occurs depends on the order in which documents are inserted. The pre-2.4 behavior did not allow a way to express that order matters; however, the default behavior was that it does matter (the behavior of a standalone vs. sharded collection should be the same). In 2.6, unordered execution is being made opt-in, as it is less restrictive and allows opportunities for parallelization, which is exactly what we take advantage of when the user indicates "ordered: false". We will make sure that this is well documented, so that the expected behavior is clear in both the "ordered: true" and "ordered: false" cases. Strict ordering has always been the default for batch inserts, so unfortunately changing this default would be backwards breaking. Some applications rely on an ordering guarantee. Consider the example of an event logging application where successive documents contain monotonically increasing timestamps. In order to maintain consistency of the data, the documents must be inserted in order. Imagine taking a snapshot of the data during an unordered batch insert: the snapshot will not necessarily represent a point in time on the event timeline that the application intends to record. I definitely understand the point that an unordered (and therefore as fast as possible) batch operation could be a better default. However, it is certainly safer to have ordered as the default, which is why the system currently works as it does. We are hearing your feedback, and I will continue to discuss this with the feature's developers in order to ensure that we take the best approach moving forward. | |||
| Comment by Walt Woods [ 05/Nov/13 ] | |||
|
If batches aren't sent in parallel in an unordered implementation, that's a pretty big oversight. | |||
| Comment by Avishay Lavie [ 05/Nov/13 ] | |||
|
I agree with the previous comments that the requirement for preserving document order during batch insert was not really there in the first place (i.e. it wasn't clearly documented or communicated); in fact this issue exists only because users implicitly expected batch inserts to be unordered and to happen as fast as possible, and were surprised to discover this wasn't the case. So +1 from me for having the default be "make this batched insert as fast as you can regardless of order", especially if the write concern is able to report exactly which inserts worked and which didn't, which avoids any use case for preserving order. I realize that this might be considered a breaking change, but I think it's worthwhile to try to understand how many users actually rely on this behavior; my guess would be that most users of batch inserts in a sharded environment either don't care about this issue or would be surprised to discover this isn't already the default behavior. A more important point IMO is that in unordered mode (regardless of whether it is the default), batches sent to different shards should probably be sent in parallel, to gain a significant performance boost similar to that gained when making a non-targeted query in a sharded cluster. | |||
| Comment by Thanos Angelatos [ 05/Nov/13 ] | |||
|
Hello David, I appreciate you taking the time to respond, but unfortunately your answer focuses on the new write commands, which are internal to the way the drivers work, and not on the existing user code and patterns that rely on the defined behaviour of your drivers. Since "forever" (is that too big a word? maybe) the purpose of a batch insert has been to fire a barrage of writes against MongoDB as fast as possible. Is ordering necessary? Maybe yes, on specific occasions, but usually no. On this I agree with the previous comment, and I do not understand why "MongoDB must honor the requirement that the inserts happen in the order that they are specified" should apply in this case - it was never there in the first place for bulk insertions, so why introduce it now? | |||
| Comment by Walt Woods [ 05/Nov/13 ] | |||
|
Hi David, I can't think of any situation in which the ordering of a batch insert would matter. Could you please enlighten us as to how a system would rely on batch inserts being applied in the order they are specified? What I find particularly weird about this (seeming double standard) is that when reading from shards (without a sort specified), the data can come back in any order. I'd go so far as to say that is the expected behavior in a sharded environment, for speed reasons. It seems very unreasonable to assume (or care) that a batch insert happens exactly in the order specified. The only situation I can think of where this would matter is one where a single bulk insert call contains the same _id twice. That seems very easy to rectify - just have the client drivers reject the second, duplicate _id insert. | |||
| Comment by David Storch [ 05/Nov/13 ] | |||
|
Hi Thanos, In 2.6, all drivers will use the new write commands for their existing batch insert API. You will not need to change your application's batch inserts in order to make use of the new write commands. The "ordered" field will be optional with a default of "true", so it does mean that applications will have to specify "ordered: false" in order to take advantage of unordered fast batch inserts. Unfortunately we must keep the default as "ordered: true" - we do not want the behavior of existing applications that rely on strict ordering of batch inserts to change in unexpected ways. | |||
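For reference, the 2.6 insert write command that the drivers wrap can also be sent by hand; the sketch below assumes the 2.x Java driver and placeholder collection/documents, and shows where the optional ordered flag sits (normally the driver builds this command for you):

    import java.util.Arrays;

    import com.mongodb.BasicDBObject;
    import com.mongodb.CommandResult;
    import com.mongodb.DB;
    import com.mongodb.MongoClient;

    public class InsertCommandSketch {
        public static void main(String[] args) throws Exception {
            MongoClient client = new MongoClient("mongos-host", 27017); // placeholder address
            DB db = client.getDB("test");

            // The 2.6 "insert" write command: documents plus an optional ordered flag.
            BasicDBObject cmd = new BasicDBObject("insert", "events")
                    .append("documents", Arrays.asList(
                            new BasicDBObject("k", 1),
                            new BasicDBObject("k", 2)))
                    .append("ordered", false); // opt in to unordered execution

            CommandResult result = db.command(cmd);
            System.out.println(result);

            client.close();
        }
    }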
| Comment by Thanos Angelatos [ 05/Nov/13 ] | |||
|
Hello David, Thanks for the detailed response. We do face this problem, and what I do not understand, and you have not explained, is why you're introducing functionality (ordered inserts, getLastError) that did not exist in the past - I am referring to batch updates pre-2.4.5 - as the default for the new fix version rather than as an option, which means existing systems that expect unordered but fast batch inserts need to be changed. Thanks for your take on this. | |||
| Comment by David Storch [ 04/Nov/13 ] | |||
|
Hi all, Apologies for the delayed response. The short answer is that this will be fixed in version 2.6 with For batch inserts, MongoDB must honor the requirement that the inserts happen in the order in which they are specified. The consequence is that only consecutive inserts can be batched together, and only if they target the same shard. First consider the case where "k" is the shard key. There are two shards, corresponding to ranges
In the case of hashed keys, inserted documents are distributed more or less randomly among the shards, which means that very little batching can take place. The new bulk write commands added for | |||
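To make the batching constraint concrete, here is a small stand-alone illustration (plain Java, not mongos code; the shard-assignment function is a stand-in for real routing): when only consecutive documents targeting the same shard may share a batch, a randomly ordered (hashed-key) stream degenerates into thousands of tiny batches, while pre-sorting by target shard collapses it to one batch per shard.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Random;

    public class ConsecutiveGroupingSketch {

        static final int NUM_SHARDS = 12;

        // Stand-in for mongos routing of a shard key value to a shard.
        static int targetShard(int shardKeyValue) {
            return shardKeyValue % NUM_SHARDS;
        }

        // Only consecutive documents that target the same shard can share a batch.
        static int countBatches(List<Integer> keys) {
            int batches = 0;
            int previousShard = -1;
            for (int key : keys) {
                int shard = targetShard(key);
                if (shard != previousShard) {
                    batches++;
                    previousShard = shard;
                }
            }
            return batches;
        }

        public static void main(String[] args) {
            // Random key order approximates what a hashed shard key produces.
            List<Integer> keys = new ArrayList<Integer>();
            Random random = new Random(42);
            for (int i = 0; i < 10000; i++) {
                keys.add(random.nextInt(1000000));
            }
            System.out.println("batches, random order: " + countBatches(keys));   // roughly 11/12 of 10000

            // Pre-sorting by target shard makes same-shard documents adjacent.
            List<Integer> sorted = new ArrayList<Integer>(keys);
            Collections.sort(sorted, new Comparator<Integer>() {
                public int compare(Integer a, Integer b) {
                    return targetShard(a) - targetShard(b);
                }
            });
            System.out.println("batches, pre-sorted: " + countBatches(sorted));   // one per shard: 12
        }
    }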
| Comment by Christos Soulios [ 15/Oct/13 ] | |||
|
We have seen exactly the same performance degradation when we upgraded our database from 2.2 to 2.4 while changing our sharding strategy from range to hash. Is there any plan to solve this issue soon? | |||
| Comment by Walt Woods [ 03/Oct/13 ] | |||
|
For what it's worth, I ran some tests on our system and verified this incredibly bizarre behavior. I'm running a larger sample right now, but for a small set of documents on a 3-shard cluster, I got a 10x speedup just by sorting the documents before passing them to PyMongo's insert() function. Never thought I'd spend part of a day coding up a custom look-at-what-chunk-each-document-will-be-in-and-sort function for my database driver... | |||
| Comment by Avishay Lavie [ 03/Oct/13 ] | |||
|
Looking at strategy_shard.cpp, I see that mongos groups together inserts to the same shard as long as they're consecutive (in the original list of documents given to the batch insert command), i.e. it maintains the original document order. I would expect a sharded batch insert operation to do a parallel scatter-gather operation, so order shouldn't matter anyway. So, the question is: is there a reason to maintain the original ordering of documents in the batch, at the expense of performance? I assume it wouldn't be too hard to pre-sort the documents in the batch according to their assigned shards so as to minimize the number of insert groups (ideally having no more than one group per shard). Would you be willing to look at a pull request implementing such pre-ordering, or is the current implementation considered the desired behavior? | |||
| Comment by Avishay Lavie [ 03/Oct/13 ] | |||
|
I'm suffering from this as well. Whenever a batch insert targets more than a single shard, I see about 10x drop in performance. | |||
| Comment by D.H.J. Takken [ 19/Sep/13 ] | |||
|
Just checked the effect of sorting on bulk insert performance. I can confirm that sorting the documents on the sharding key in the application improves bulk insert speed quite a bit. Note that my sharding key is a compound key, which has a hash as its rightmost component. So, the sharding keys of the documents generated by the application have a high degree of randomness. | |||
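A minimal sketch of that application-side workaround with the 2.x Java driver; the mongos address, collection name and shard key field ("userId") are placeholders, and for a hashed shard key the sort would have to be on the hashed value rather than the raw key:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    import com.mongodb.BasicDBObject;
    import com.mongodb.DBCollection;
    import com.mongodb.DBObject;
    import com.mongodb.MongoClient;

    public class PreSortedBatchInsertSketch {
        public static void main(String[] args) throws Exception {
            MongoClient client = new MongoClient("mongos-host", 27017); // placeholder address
            DBCollection coll = client.getDB("test").getCollection("events");

            List<DBObject> batch = new ArrayList<DBObject>();
            for (int i = 0; i < 10000; i++) {
                // "userId" stands in for the shard key field of the collection.
                batch.add(new BasicDBObject("userId", (i * 7919) % 5000).append("payload", "x"));
            }

            // Sort the batch on the shard key so that documents destined for the same
            // chunk (and therefore the same shard) end up adjacent in the batch.
            Collections.sort(batch, new Comparator<DBObject>() {
                public int compare(DBObject a, DBObject b) {
                    Integer ka = (Integer) a.get("userId");
                    Integer kb = (Integer) b.get("userId");
                    return ka.compareTo(kb);
                }
            });

            coll.insert(batch);
            client.close();
        }
    }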
| Comment by D.H.J. Takken [ 19/Sep/13 ] | |||
|
I am seeing the exact same thing on a cluster with four shards, running MongoDB 2.4.6 on Ubuntu Linux 12.10. Each shard runs on its own physical machine, as does the mongos instance. I use pymongo to insert documents. Simply replacing batch insert commands with for loops doubles insert speed. Bulk inserting on an empty cluster is fast at first. As chunks get distributed across the shards, insert speed drops rapidly. The more shards that mongos needs to distribute the data to, the slower the batch insert operations get. Inserting documents one by one shows a higher insert speed, which remains the same during the run of the application. I also verified that the effect is not caused by a stressed out balancer, by manually splitting the chunks, distributing them before inserting the documents and disabling the balancer. This yields a low insert speed right from the start. In my application, the documents are small, about 1 kb. The mongos instance and the shards show only little CPU load while inserting documents. Running multiple instances of the application in parallel allows the total insert speed to grow significantly, which suggests that the mongos threads spend most of their time waiting for something. |