[CDRIVER-1213] Pre-split bulk ops every 1000 documents Created: 27/Apr/16  Updated: 29/Aug/17  Resolved: 02/May/16

Status: Closed
Project: C Driver
Component/s: Bulk API, Performance
Affects Version/s: None
Fix Version/s: 1.4.0

Type: New Feature
Priority: Major - P3
Reporter: A. Jesse Jiryu Davis
Assignee: A. Jesse Jiryu Davis
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to CDRIVER-2259 Respect maxWriteBatchSize Closed
is related to CDRIVER-1212 C Driver Performance Benchmarking Closed

 Description   

The TestSmallDocBulkInsert task of the driver benchmarks reveals an inefficiency: as you call mongoc_bulk_operation_insert, all documents are appended to the same buffer. In mongoc_bulk_operation_execute, those concatenated documents are then split into batches of 1000, and each batch has to be copied into an "insert" command document before it is sent.

The simplest fix is to pre-split batches: when the current batch has 1000 documents, mongoc_bulk_operation_insert should start a new one.

Same for updates and deletes.
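
The pre-splitting idea can be illustrated with a minimal, self-contained sketch. This is not the driver's code: the sketch_* types and functions are hypothetical names invented for illustration, documents are treated as opaque byte strings rather than BSON, and the limit of 1000 is the assumed default described in this ticket. It only shows the bookkeeping: each insert appends to the current batch, and a new batch is started once the current one holds 1000 documents, so no splitting or copying is needed at execute time.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed default batch limit, per this ticket; not read from any server. */
#define SKETCH_MAX_BATCH 1000

/* Hypothetical batch: a growable byte buffer plus a document count. */
typedef struct {
   size_t n_docs;
   size_t len;
   size_t cap;
   unsigned char *data;
} sketch_batch_t;

/* Hypothetical bulk operation: a growable array of batches. */
typedef struct {
   sketch_batch_t *batches;
   size_t n_batches;
   size_t cap;
} sketch_bulk_t;

/* Append an empty batch to the bulk operation. Returns 0 on success. */
static int
sketch_bulk_new_batch (sketch_bulk_t *bulk)
{
   if (bulk->n_batches == bulk->cap) {
      size_t new_cap = bulk->cap ? bulk->cap * 2 : 4;
      sketch_batch_t *p = realloc (bulk->batches, new_cap * sizeof *p);
      if (!p) {
         return -1;
      }
      bulk->batches = p;
      bulk->cap = new_cap;
   }
   memset (&bulk->batches[bulk->n_batches], 0, sizeof (sketch_batch_t));
   bulk->n_batches++;
   return 0;
}

/* Append one document, starting a new batch when the current one is full. */
static int
sketch_bulk_insert (sketch_bulk_t *bulk, const void *doc, size_t doc_len)
{
   sketch_batch_t *b;
   unsigned char *p;
   size_t new_cap;

   if (bulk->n_batches == 0 ||
       bulk->batches[bulk->n_batches - 1].n_docs == SKETCH_MAX_BATCH) {
      if (sketch_bulk_new_batch (bulk) != 0) {
         return -1;
      }
   }
   b = &bulk->batches[bulk->n_batches - 1];
   if (b->len + doc_len > b->cap) {
      new_cap = b->cap ? b->cap * 2 : 256;
      while (new_cap < b->len + doc_len) {
         new_cap *= 2;
      }
      p = realloc (b->data, new_cap);
      if (!p) {
         return -1;
      }
      b->data = p;
      b->cap = new_cap;
   }
   memcpy (b->data + b->len, doc, doc_len);
   b->len += doc_len;
   b->n_docs++;
   return 0;
}

/* Release all batch buffers and the batch array. */
static void
sketch_bulk_destroy (sketch_bulk_t *bulk)
{
   size_t i;
   for (i = 0; i < bulk->n_batches; i++) {
      free (bulk->batches[i].data);
   }
   free (bulk->batches);
   memset (bulk, 0, sizeof *bulk);
}
```

Under these assumptions, inserting 2500 small documents into a zero-initialized sketch_bulk_t yields three batches of 1000, 1000, and 500 documents, and each batch buffer could be handed to a command builder as-is.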

A bulk op can be constructed before the driver has selected a server to send it to, or before the driver has connected at all, so the server's maxWriteBatchSize isn't known. Assume 1000 for now.



 Comments   
Comment by Githook User [ 02/May/16 ]

Author: A. Jesse Jiryu Davis (ajdavis) <jesse@mongodb.com>

Message: CDRIVER-1213 pre-split bulk ops every 1000 docs

Speeds up mongoc_bulk_operation_execute when it has large numbers of
small documents, by avoiding the need to split and copy batches of
documents.
Branch: master
https://github.com/mongodb/mongo-c-driver/commit/51e966afe10d54318ba727359db100f78cca98c5
