[SERVER-19436] Batch writes when building an index
Created: 16/Jul/15  Updated: 06/Dec/22  Resolved: 22/May/18

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Minor - P4
Reporter: Igor Canadi
Assignee: Backlog - Storage Execution Team
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Storage Execution
Participants:

Description

We're seeing some performance issues with MongoDB when there are a lot of indexes being built in the background. Currently, there is an index commit after every insert: https://github.com/mongodb/mongo/blob/master/src/mongo/db/catalog/index_create.cpp#L266

Performance would likely be much better if we batched several writes into each commit.
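To illustrate the idea, here is a minimal sketch of commit-per-insert versus batched commits. It is a toy model, not MongoDB code: `ToyIndex`, `buildIndex`, and the `batchSize` parameter are hypothetical names made up for this example, and the "commit" is just a counter standing in for a storage-engine transaction commit.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an index backed by a transactional store.
// This is NOT MongoDB's API; it only counts how often we commit.
struct ToyIndex {
    std::multimap<std::string, std::size_t> committed;          // key -> record id
    std::vector<std::pair<std::string, std::size_t>> pending;   // uncommitted writes
    std::size_t commits = 0;

    void insert(const std::string& key, std::size_t recordId) {
        pending.emplace_back(key, recordId);
    }

    // Make all pending writes visible and count one commit.
    void commit() {
        for (const auto& kv : pending) {
            committed.emplace(kv.first, kv.second);
        }
        pending.clear();
        ++commits;
    }
};

// Builds the index over `docs`, committing after every `batchSize` inserts.
// batchSize == 1 models the commit-after-every-insert behaviour described above.
// Returns the total number of commits performed on `index`.
inline std::size_t buildIndex(const std::vector<std::string>& docs,
                              std::size_t batchSize,
                              ToyIndex& index) {
    assert(batchSize > 0);
    std::size_t inBatch = 0;
    for (std::size_t id = 0; id < docs.size(); ++id) {
        index.insert(docs[id], id);
        if (++inBatch == batchSize) {
            index.commit();
            inBatch = 0;
        }
    }
    if (inBatch > 0) {
        index.commit();  // flush the final partial batch
    }
    return index.commits;
}
```

In this model the number of commits drops from one per document to roughly `docs.size() / batchSize`. Whether that translates into a throughput gain depends on the storage engine's per-commit cost, which the sketch does not measure.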

Is this something you would be interested in? I might be able to draft the patch, too.



Comments
Comment by Eric Milkie [ 22/May/18 ]

We support bulk loading index data now.

Comment by Igor Canadi [ 17/Jul/15 ]

Here's something I hacked up quickly: https://github.com/mongodb-partners/mongo/tree/hack-batched-index-writes (check out the latest commit). It passes jsCore tests.

It does not handle WCEs (WriteConflictExceptions) correctly: with batched builds, we need to rewind the table scan somehow. However, it should be good enough to test the throughput gains.
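One possible way to handle the rewind, sketched here under stated assumptions rather than taken from the patch: remember where each batch started and, on a write conflict, restart the scan from that position. `ToyWriteConflict`, `buildWithRetry`, and the `commitBatch` callback are hypothetical names for this example, and the callback is assumed to either apply the whole range or throw without applying any of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>

// Hypothetical exception standing in for a write conflict; not MongoDB's type.
struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("write conflict") {}
};

// Scans record ids [0, numDocs) in batches of `batchSize`.
// `commitBatch(begin, end)` is assumed to index and commit the half-open range
// atomically, or throw ToyWriteConflict having applied nothing.
// Returns the number of batches that had to be retried.
inline std::size_t buildWithRetry(
    std::size_t numDocs,
    std::size_t batchSize,
    const std::function<void(std::size_t, std::size_t)>& commitBatch) {
    assert(batchSize > 0);
    std::size_t retries = 0;
    std::size_t batchStart = 0;  // scan position at the start of the current batch
    while (batchStart < numDocs) {
        const std::size_t batchEnd = std::min(batchStart + batchSize, numDocs);
        try {
            commitBatch(batchStart, batchEnd);
            batchStart = batchEnd;  // advance only after a successful commit
        } catch (const ToyWriteConflict&) {
            // Rewind: batchStart is unchanged, so the next iteration rescans
            // and re-inserts the same documents.
            ++retries;
        }
    }
    return retries;
}
```

The sketch retries indefinitely and assumes the scan can be repositioned by record id; a real implementation would also have to restore the cursor state and decide on a backoff or retry limit.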

Comment by Geert Bosch [ 16/Jul/15 ]

Hi Igor, yes, we are coming to the same conclusion and are planning to spend time in this area early next week. We'd be interested in your approach. Even a very rough concept patch would be welcome, as our first step will be validating the approach by testing throughput.

Generated at Thu Feb 08 03:50:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.