[SERVER-19436] Batch writes when building an index Created: 16/Jul/15 Updated: 06/Dec/22 Resolved: 22/May/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance, Storage |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Igor Canadi | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Storage Execution |
| Participants: |
| Description |
|
We're seeing some performance issues with MongoDB when many indexes are being built in the background. Currently, there is an index commit after every insert: https://github.com/mongodb/mongo/blob/master/src/mongo/db/catalog/index_create.cpp#L266 Performance would likely be much better if we could batch several writes together. Is this something you would be interested in? I might be able to draft a patch, too. |
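To make the proposal concrete, here is a minimal sketch of the difference between committing after every insert and amortizing commits over a batch. This is illustrative C++ only, not the server's actual index-build code; `IndexBuilder`, `insertKey`, and `commit` are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the index build machinery; not the
// actual MongoDB internals.
struct IndexBuilder {
    std::size_t inserts = 0;
    std::size_t commits = 0;
    void insertKey(long long /*recordId*/) { ++inserts; }
    void commit() { ++commits; }  // the expensive durable step
};

// Current behavior: one commit per inserted key.
void buildUnbatched(IndexBuilder& ix, long long first, long long last) {
    for (long long id = first; id <= last; ++id) {
        ix.insertKey(id);
        ix.commit();  // pays the full commit cost on every insert
    }
}

// Proposed behavior: amortize the commit cost over a batch of inserts.
void buildBatched(IndexBuilder& ix, long long first, long long last,
                  std::size_t batchSize = 1000) {
    std::size_t pending = 0;
    for (long long id = first; id <= last; ++id) {
        ix.insertKey(id);
        if (++pending == batchSize) {
            ix.commit();  // one commit per batchSize inserts
            pending = 0;
        }
    }
    if (pending > 0)
        ix.commit();  // flush the final partial batch
}

int main() {
    IndexBuilder a, b;
    buildUnbatched(a, 1, 10000);
    buildBatched(b, 1, 10000);
    std::printf("unbatched: %zu commits, batched: %zu commits\n",
                a.commits, b.commits);  // 10000 vs 10
}
```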
| Comments |
| Comment by Eric Milkie [ 22/May/18 ] |
|
We support bulk loading index data now. |
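For context, bulk loading generally works by collecting keys from a collection scan, sorting them once, and then inserting them into the index in sorted order, so the tree can be filled in a single left-to-right pass. The sketch below illustrates only that general idea; the server uses an external (disk-spilling) sorter for large builds, and none of these names come from the actual implementation.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    // Keys as a table scan might yield them, in record order.
    std::vector<long long> keys = {42, 7, 19, 3, 88};

    // Phase 1: sort all keys up front (the server uses an external
    // sorter that spills to disk; std::sort stands in here).
    std::sort(keys.begin(), keys.end());

    // Phase 2: feed the sorted stream to the index in one pass;
    // sorted input avoids random B-tree inserts and rebalancing.
    for (long long k : keys)
        std::printf("bulk insert key %lld\n", k);
}
```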
| Comment by Igor Canadi [ 17/Jul/15 ] |
|
Here's something I hacked up quickly: https://github.com/mongodb-partners/mongo/tree/hack-batched-index-writes (check out the latest commit). It passes the jsCore tests. It does not handle write conflict exceptions (WCEs) correctly; with batched builds, we would need some way to rewind the table scan. However, it should be good enough to measure the throughput gains. |
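The rewind problem can be sketched as follows: each batch must remember where the table scan stood when the batch began, so that on a write conflict the uncommitted work can be thrown away and the scan replayed from that point. This is a self-contained illustration under those assumptions, not the patch itself; `WriteConflictException` is real MongoDB terminology, but all other names here are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for the server's WriteConflictException.
struct WriteConflictException : std::runtime_error {
    WriteConflictException() : std::runtime_error("write conflict") {}
};

// Minimal rewindable table scan over record ids.
struct Scan {
    const std::vector<long long>& records;
    std::size_t pos = 0;
    bool next(long long& out) {
        if (pos >= records.size()) return false;
        out = records[pos++];
        return true;
    }
};

// Hypothetical batch-aware index builder.
struct IndexBuilder {
    std::size_t committed = 0, staged = 0;
    void insertKey(long long /*id*/) { ++staged; }  // could conflict in reality
    void commitBatch() { committed += staged; staged = 0; }
    void abortBatch() { staged = 0; }  // discard uncommitted work
};

void buildWithRetry(Scan& scan, IndexBuilder& ix, std::size_t batchSize) {
    for (;;) {
        std::size_t batchStart = scan.pos;  // remember where this batch began
        try {
            long long id;
            std::size_t n = 0;
            while (n < batchSize && scan.next(id)) {
                ix.insertKey(id);  // a conflict would throw here
                ++n;
            }
            ix.commitBatch();
            if (n < batchSize) return;  // scan exhausted
        } catch (const WriteConflictException&) {
            ix.abortBatch();        // throw away the partial batch
            scan.pos = batchStart;  // rewind the table scan and retry
        }
    }
}

int main() {
    std::vector<long long> recs = {1, 2, 3, 4, 5, 6, 7};
    Scan scan{recs};
    IndexBuilder ix;
    buildWithRetry(scan, ix, /*batchSize=*/3);
}
```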
| Comment by Geert Bosch [ 16/Jul/15 ] |
|
Hi Igor, yes, we are coming to the same conclusion and are planning to spend time in this area early next week. We'd be interested in your approach. Even a very rough concept patch would be welcome, as our first step will be validating the approach by testing throughput. |