MongoDB Database Tools / TOOLS-2875

Limit the BufferedBulkInserter's batch size by bytes

    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 100.5.2
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      As part of TOOLS-1956 we removed the byte limit on batch sizes in the BufferedBulkInserter. (See mtc and tools.)

      This means each batch of the BufferedBulkInserter can hold up to ~16 GB of data before it gets flushed. The theoretical maximum amount of data that can be held in BufferedBulkInserters in mongorestore is ~16 GB * NumParallelCollections * NumInsertionWorkers, which is ~64 GB by default.
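The worst-case arithmetic above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the default values used (a batch size of 1000 documents, 4 parallel collections, 1 insertion worker per collection, and a 16 MiB maximum BSON document size) are assumptions not stated in this ticket.

```go
package main

import "fmt"

// maxBSONDocBytes is the assumed maximum size of a single BSON document (16 MiB).
const maxBSONDocBytes int64 = 16 * 1024 * 1024

// worstCaseBufferedBytes is a hypothetical helper. It returns the upper bound on
// bytes buffered across all inserters when batches are limited only by document
// count: every batch is full and every document has the maximum size.
func worstCaseBufferedBytes(batchSize, numParallelCollections, numInsertionWorkers int) int64 {
	perBatch := int64(batchSize) * maxBSONDocBytes
	return perBatch * int64(numParallelCollections) * int64(numInsertionWorkers)
}

func main() {
	// Assumed mongorestore defaults: batchSize=1000, 4 parallel collections,
	// 1 insertion worker per collection.
	total := worstCaseBufferedBytes(1000, 4, 1)
	fmt.Printf("%.1f GiB\n", float64(total)/float64(1<<30)) // prints "62.5 GiB"
}
```

Under these assumptions the bound comes out at roughly 62.5 GiB, in line with the ~64 GB estimate above.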

      This can have a severe impact on performance, even for average document sizes of 1-2 MB.

      We should limit batches to 48 MB. The BufferedBulkInserter will flush its batch whenever the document count reaches the batchSize OR the total size of documents in the batch reaches 48 MB.
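A minimal sketch of the proposed flush rule follows. It is not the actual BufferedBulkInserter code: the batchBuffer type, its fields, and the flush callback are hypothetical stand-ins, and treating 48 MB as 48,000,000 bytes is an assumption.

```go
package main

import "fmt"

// maxBatchBytes is the proposed byte limit (assumed here to mean 48,000,000 bytes).
const maxBatchBytes = 48 * 1000 * 1000

// batchBuffer is a hypothetical stand-in for the BufferedBulkInserter.
// It buffers raw documents and hands full batches to the flush callback.
type batchBuffer struct {
	docLimit  int                       // flush when this many documents are buffered
	byteLimit int                       // flush when this many bytes are buffered
	docs      [][]byte                  // documents in the current batch
	byteCount int                       // total size of the current batch
	flush     func(docs [][]byte) error // stand-in for the bulk write
}

// Insert buffers doc and flushes as soon as either limit is reached.
func (b *batchBuffer) Insert(doc []byte) error {
	b.docs = append(b.docs, doc)
	b.byteCount += len(doc)
	if len(b.docs) >= b.docLimit || b.byteCount >= b.byteLimit {
		return b.Flush()
	}
	return nil
}

// Flush writes out any buffered documents and resets both counters.
func (b *batchBuffer) Flush() error {
	if len(b.docs) == 0 {
		return nil
	}
	err := b.flush(b.docs)
	b.docs = nil // start a fresh slice so the flushed batch is not overwritten
	b.byteCount = 0
	return err
}

func main() {
	buf := &batchBuffer{
		docLimit:  1000,
		byteLimit: maxBatchBytes,
		flush: func(docs [][]byte) error {
			fmt.Println("flushed", len(docs), "documents")
			return nil
		},
	}
	doc := make([]byte, 1000*1000) // one 1 MB document, reused for every insert
	for i := 0; i < 100; i++ {
		if err := buf.Insert(doc); err != nil {
			panic(err)
		}
	}
	if err := buf.Flush(); err != nil { // flush the final partial batch
		panic(err)
	}
}
```

With 1 MB documents the byte limit is reached long before the 1000-document limit, so the 100 inserts in this sketch are flushed as batches of 48, 48, and 4 documents.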

      The Go driver splits batches over 48 MB, so there is no benefit to having batches larger than this.

      Additionally, this will provide a limit for the size of the sync.Pool in TOOLS-1856.


            tim.fogarty@mongodb.com Tim Fogarty
            tim.fogarty@mongodb.com Tim Fogarty
            Evgeni Dobranov