[SERVER-81568] Batch Bulk inserts during the indexing collection scan Created: 29/Sep/23  Updated: 06/Feb/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Jordi Olivares Provencio Assignee: Backlog - Storage Execution Team
Resolution: Unresolved Votes: 0
Labels: former-storex-namer, storex-ranked
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: batching.patch
Issue Links:
Related
is related to SERVER-676 use multiple cores for index sort-phase Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

While investigating SERVER-676, we found that simply batching the bulk inserts at the end yields a significant performance increase during index builds.

Right now inserts are performed one by one, each incurring the overhead of an entire WriteUnitOfWork. As we saw with BatchedDeletes, batching would alleviate that overhead.

The very rough patch attached here yielded an improvement of approximately 10% in throughput.
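
To illustrate where the saving would come from, here is a toy Python model, not server code and not the attached patch. It assumes each commit carries a fixed cost comparable to one WriteUnitOfWork and simply counts commits. `ToyIndex`, `insert_one_by_one`, `insert_batched`, and the batch size are all hypothetical names chosen for this sketch.

```python
from itertools import islice


class ToyIndex:
    """Hypothetical stand-in for an index; counts committed units of work."""

    def __init__(self):
        self.keys = []
        self.commits = 0

    def commit(self, pending):
        # One call models the fixed overhead of one WriteUnitOfWork.
        self.keys.extend(pending)
        self.commits += 1


def insert_one_by_one(index, keys):
    # Current behaviour as described above: one unit of work per key.
    for key in keys:
        index.commit([key])


def chunks(iterable, size):
    # Yield consecutive lists of at most `size` items.
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


def insert_batched(index, keys, batch_size):
    # Proposed behaviour: one unit of work per batch of keys.
    for chunk in chunks(keys, batch_size):
        index.commit(chunk)
```

With 10 keys and a batch size of 4, the one-by-one path commits 10 times and the batched path commits 3 times, while both end up with the same keys in the same order. The real gain depends on how much of the per-insert cost is fixed overhead.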



 Comments   
Comment by Louis Williams [ 20/Nov/23 ]

I haven't looked at the code thoroughly, but we should just make sure this is resilient to partial batch failures.
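
One possible reading of "resilient to partial batch failures" is sketched below as a toy Python model, not server code. It assumes a batch commit is all-or-nothing and that a failed batch can be retried key by key so that one bad key does not discard the rest. `ToyUniqueIndex`, `DuplicateKey`, and `insert_batch_with_fallback` are hypothetical names, and a duplicate key is used only as an example failure.

```python
class DuplicateKey(Exception):
    """Example failure: the key is already present in the index."""


class ToyUniqueIndex:
    """Hypothetical unique index with all-or-nothing commits."""

    def __init__(self):
        self.keys = set()

    def commit(self, pending):
        # Stage on a copy so a failure leaves the index untouched.
        staged = set(self.keys)
        for key in pending:
            if key in staged:
                raise DuplicateKey(key)
            staged.add(key)
        self.keys = staged


def insert_batch_with_fallback(index, batch):
    """Try the whole batch; on failure, retry each key in its own commit.

    Returns the keys that could not be inserted.
    """
    try:
        index.commit(batch)
        return []
    except DuplicateKey:
        pass  # Nothing was applied; fall back to per-key commits.

    failed = []
    for key in batch:
        try:
            index.commit([key])
        except DuplicateKey:
            failed.append(key)
    return failed
```

In this model the common case still pays for a single commit per batch, and only a failing batch falls back to the per-key cost.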

Generated at Thu Feb 08 06:46:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.