Core Server / SERVER-37810

Optimise balancer performance with zone sharding

      Description

      Reproduced in MongoDB 3.4.16 and 4.0.3.

      With a considerable number of chunks (1+ million), the balancer is observed to spend a large amount of time checking which tag (zone) each chunk belongs to. As a result, a balancer round can spend most of its time finding a candidate chunk (e.g. one minute) rather than migrating it, which can significantly reduce overall cluster balancing performance.

      Below is a repro where the balancer spends 90% of its time finding a candidate chunk and only 10% of its time moving the chunk.

      Off-CPU profiling suggests that the balancer thread is CPU-bound. Attached is a 60-second flame graph of the 3.4.16 CSRS primary process, captured while the CSRS primary was doing nothing other than balancing the cluster.

      Most CPU time is consumed in BSONObj::woCompare().
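
      To illustrate why per-chunk tag checks can end up dominated by key comparisons, here is a toy model in Python. It is not the server's code and makes no claim about how the balancer is actually implemented: `wo_compare` is a hypothetical stand-in for a three-way key comparison, the chunk and zone layouts are made up, and zone ranges are assumed to be sorted and non-overlapping. It only counts how many comparisons a naive scan over every zone range needs for each chunk, next to a binary search over the same ranges.

```python
comparisons = 0  # number of key comparisons performed so far


def wo_compare(a, b):
    """Hypothetical stand-in for a three-way key comparison; counts every call."""
    global comparisons
    comparisons += 1
    return (a > b) - (a < b)


def zone_for_chunk_linear(chunk, zones):
    """Scan the zone ranges in order until one fully contains the chunk."""
    cmin, cmax = chunk
    for zmin, zmax, tag in zones:
        if wo_compare(zmin, cmin) <= 0 and wo_compare(cmax, zmax) <= 0:
            return tag
    return None


def zone_for_chunk_sorted(chunk, zones):
    """Binary-search the sorted zone ranges for the one containing the chunk."""
    cmin, cmax = chunk
    lo, hi = 0, len(zones)
    while lo < hi:  # find the first zone whose min key is greater than cmin
        mid = (lo + hi) // 2
        if wo_compare(zones[mid][0], cmin) <= 0:
            lo = mid + 1
        else:
            hi = mid
    if lo == 0:
        return None  # chunk starts before the first zone range
    zmin, zmax, tag = zones[lo - 1]
    return tag if wo_compare(cmax, zmax) <= 0 else None


def count_comparisons(lookup, chunks, zones):
    """Run lookup over all chunks; return (tags, number of comparisons used)."""
    global comparisons
    comparisons = 0
    tags = [lookup(chunk, zones) for chunk in chunks]
    return tags, comparisons


# Made-up layout: 64 zone ranges of width 100, each covering 10 chunks of width 10.
zones = [(i * 100, (i + 1) * 100, f"zone{i}") for i in range(64)]
chunks = [(k * 10, (k + 1) * 10) for k in range(640)]

linear_tags, linear_count = count_comparisons(zone_for_chunk_linear, chunks, zones)
sorted_tags, sorted_count = count_comparisons(zone_for_chunk_sorted, chunks, zones)

print("linear scan comparisons:  ", linear_count)
print("binary search comparisons:", sorted_count)
```

      In this model the linear scan's comparison count grows with the number of chunks multiplied by the number of zone ranges, while the binary search grows with the number of chunks multiplied by the logarithm of the number of ranges. Whether a comparable change applies to the balancer itself would need to be confirmed against the flame graph.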

              People

              Assignee:
              backlog-server-sharding Backlog - Sharding Team
              Reporter:
              josef.ahmad Josef Ahmad
              Votes:
              4
              Watchers:
              30
