Type: Task
Resolution: Won't Do
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- milestone-2

Assigned Teams: Replication
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

In ~~SERVER-73536~~, we went with a naive implementation for estimating the size of a bulkWrite command (excluding its ops): we serialize a command object with the fields copied over and placeholders added as needed, and take the size of the result.
Our rationale for this was that:

- we only do it once up-front per bulkWrite command mongos receives
- for most bulkWrite commands, we expect the ops field (which we skip serializing here) to take up the bulk of the command
- this is strictly less expensive than serializing an actual sub-batch command, which is something we often do numerous times for a single incoming request on mongos that targets multiple shards

That said, for certain workloads (e.g. all writes are to a single shard so we won't split batches often, and/or there are large top-level fields on the command) this could prove costly.

When we do performance testing, it may be worth reevaluating this. A smarter implementation could compute the estimate arithmetically, without actually serializing the data, similar to what we do for estimating the sizes of individual ops.
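To illustrate what "doing the math" could look like, here is a minimal sketch in Python. It is not the mongos implementation, and the function names `estimate_value_size` and `estimate_document_size` are hypothetical. It only assumes the BSON layout rules (a 4-byte length prefix and 1-byte terminator per document; a type byte plus a NUL-terminated field name per element) and covers a small subset of value types.

```python
from typing import Any

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def estimate_value_size(value: Any) -> int:
    """Bytes taken by a BSON value payload (excluding type byte and field name)."""
    if value is None:
        return 0  # null carries no payload
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return 1
    if isinstance(value, int):
        # Assumption for this sketch: small ints are stored as int32, others as int64.
        return 4 if INT32_MIN <= value <= INT32_MAX else 8
    if isinstance(value, float):
        return 8  # IEEE 754 double
    if isinstance(value, str):
        # int32 length prefix + UTF-8 bytes + NUL terminator
        return 4 + len(value.encode("utf-8")) + 1
    if isinstance(value, dict):
        return estimate_document_size(value)
    if isinstance(value, list):
        # A BSON array is a document keyed by "0", "1", ...
        return estimate_document_size({str(i): v for i, v in enumerate(value)})
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def estimate_document_size(doc: dict) -> int:
    """Bytes a BSON document would occupy, computed without serializing it."""
    size = 4 + 1  # int32 total-length prefix + trailing NUL
    for name, value in doc.items():
        # type byte + field name as a C string + value payload
        size += 1 + len(name.encode("utf-8")) + 1 + estimate_value_size(value)
    return size


# Illustrative command skeleton with the ops field left out; the field
# values here are made up for the example.
command_without_ops = {"bulkWrite": 1, "ordered": True, "comment": "nightly import"}
print(estimate_document_size(command_without_ops))
```

The estimate still walks every top-level field, so it is linear in the number of fields, but it avoids allocating and filling a buffer. Whether that is a meaningful saving for the workloads described above is exactly what performance testing would need to show.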

related to

SERVER-81086 Complete TODO listed in SERVER-78301

Closed

Assignee: [DO NOT USE] Backlog - Replication Team
Reporter: Kaitlin Mahar
Participants: [DO NOT USE] Backlog - Replication Team, Kaitlin Mahar, Sean Zimmerman
Votes: 0
Watchers: 3

Created: Jun 21 2023 04:15:21 PM UTC
Updated: Sep 14 2023 07:34:31 PM UTC
Resolved: Sep 14 2023 05:39:25 PM UTC
