Core Server / SERVER-78301

Consider making bulkWrite base command size estimation on mongos more efficient


Details

    • Type: Task
    • Resolution: Won't Do
    • Priority: Major - P3
    • Component: Replication

    Description

      In SERVER-73536, we went with a naive implementation for estimating the size of a bulkWrite command (excluding its ops): we serialize a command object with the top-level fields copied over and placeholders added as needed, and take the size of the result.
      Our rationale for this was that:

      • we only do it once, up-front, per bulkWrite command mongos receives;
      • for most bulkWrite commands, we expect the ops field (which we skip serializing here) to account for the bulk of the command's size;
      • it is strictly less expensive than serializing an actual sub-batch command, which we often do numerous times for a single incoming request on mongos that targets multiple shards.

      That said, for certain workloads (e.g. all writes go to a single shard, so we rarely split batches, and/or the command carries large top-level fields), this estimation could prove costly.

      When we do performance testing, it may be worth reevaluating this. A smarter implementation could compute the size arithmetically, without actually serializing any data, similar to what we do when estimating the sizes of individual ops.


          People

            Assignee: Backlog - Replication Team (backlog-server-repl)
            Reporter: Kaitlin Mahar (kaitlin.mahar@mongodb.com)
            Votes: 0
            Watchers: 3
