[CDRIVER-1213] Pre-split bulk ops every 1000 documents Created: 27/Apr/16 Updated: 29/Aug/17 Resolved: 02/May/16 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | Bulk API, Performance |
| Affects Version/s: | None |
| Fix Version/s: | 1.4.0 |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Description |
|
The TestSmallDocBulkInsert task of the driver benchmarks reveals an inefficiency: each call to mongoc_bulk_operation_insert appends the document to a single shared buffer. In mongoc_bulk_operation_execute, those concatenated documents are then split into batches of 1000, and each batch must be copied into an "insert" command document before it is sent. The simplest fix is to pre-split the batches: when the current batch reaches 1000 documents, mongoc_bulk_operation_insert should start a new one. The same applies to updates and deletes. Since a bulk op can be constructed before the driver has selected a server to send it to, or before the driver has connected at all, the server's maxWriteBatchSize isn't known; assume 1000 for now. |
| Comments |
| Comment by Githook User [ 02/May/16 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis) <jesse@mongodb.com>
Message: Speeds up mongoc_bulk_operation_execute when it has large numbers of |