DOCS-16159

Investigate changes in SERVER-72146: Make chunk migrations metrics more accessible from Atlas

      Original Downstream Change Summary

      With these commits, the following sharding statistics are available through serverStatus in all LTS versions of MongoDB:

      • countDocsClonedOnCatchUpOnRecipient: the number of documents cloned during the catch up phase of the migration
      • countBytesClonedOnCatchUpOnRecipient: the number of bytes cloned during the catch up phase of the migration
      • countBytesClonedOnRecipient: the number of bytes cloned by the recipient of a migration
      • countDonorMoveChunkCommitted: the total number of migrations committed by the node
      • countDonorMoveChunkAborted: the number of migrations aborted in the node
      • totalDonorMoveChunkTimeMillis: the total amount of time a migration took from beginning to end
      • totalRecipientCriticalSectionTimeMillis: the amount of time in milliseconds the recipient of a migration spent holding the critical section
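
      As an illustration, the fields listed above could be pulled out of a serverStatus result roughly as follows. This is a minimal sketch, not the server team's tooling: the helper name migration_stats is hypothetical, and it assumes the fields sit under the shardingStatistics section as the linked ticket describes. It takes an already fetched serverStatus document as a plain mapping, so it runs without a live deployment.

```python
from typing import Any, Dict, Mapping

# Field names copied from the change summary above.
MIGRATION_FIELDS = (
    "countDocsClonedOnCatchUpOnRecipient",
    "countBytesClonedOnCatchUpOnRecipient",
    "countBytesClonedOnRecipient",
    "countDonorMoveChunkCommitted",
    "countDonorMoveChunkAborted",
    "totalDonorMoveChunkTimeMillis",
    "totalRecipientCriticalSectionTimeMillis",
)


def migration_stats(server_status: Mapping[str, Any]) -> Dict[str, Any]:
    """Return the chunk migration fields found in a serverStatus document.

    Hypothetical helper. Fields that are absent (for example on a server
    version without this change) are skipped rather than reported as zero.
    """
    section = server_status.get("shardingStatistics", {})
    return {name: section[name] for name in MIGRATION_FIELDS if name in section}


# With PyMongo, the input document could be fetched like this
# (the connection string is an assumption):
#   from pymongo import MongoClient
#   status = MongoClient("mongodb://localhost:27017").admin.command("serverStatus")
#   print(migration_stats(status))
```

      Skipping missing fields keeps the helper usable against servers that do not yet report them.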

        Description of Linked Ticket

        Often, when investigating HELP tickets related to balancing, we need to access and combine data from FTDC, logs, and configdump to figure out some basic metrics, such as:

      • Migration throughput (how fast is this shard cloning data)
      • Range deleter throughput (how fast is this shard executing its range deletions)
      • Number of orphan documents (how many orphan documents are waiting to be deleted)

      The following statistics should be available on serverStatus under the shardingStatistics group:

      • countDocsClonedOnCatchUpOnRecipient: the number of documents cloned during the catch up phase of the migration
      • countBytesClonedOnCatchUpOnRecipient: the number of bytes cloned during the catch up phase of the migration
      • countBytesClonedOnRecipient: the number of bytes cloned by the recipient of a migration
      • countDonorMoveChunkCommitted: the total number of migrations committed by the node
      • countDonorMoveChunkAborted: the number of migrations aborted in the node
      • totalDonorMoveChunkTimeMillis: the total amount of time a migration took from beginning to end
      • totalRecipientCriticalSectionTimeMillis: the amount of time in milliseconds the recipient of a migration spent holding the critical section
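
      A metric such as migration throughput could then be derived from two serverStatus samples, as sketched below. The function names are hypothetical. The sketch assumes the counters are cumulative (so a difference between two samples is meaningful) and that the fields live under shardingStatistics. It is one possible calculation, not an official definition of these metrics.

```python
from typing import Any, Mapping


def clone_throughput_bytes_per_sec(
    earlier: Mapping[str, Any],
    later: Mapping[str, Any],
    elapsed_seconds: float,
) -> float:
    """Bytes cloned per second by a recipient shard between two samples.

    Hypothetical helper; assumes countBytesClonedOnRecipient is a
    cumulative counter in the shardingStatistics section.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    before = earlier["shardingStatistics"]["countBytesClonedOnRecipient"]
    after = later["shardingStatistics"]["countBytesClonedOnRecipient"]
    delta = after - before
    if delta < 0:
        # A smaller later value suggests the counter was reset between samples.
        raise ValueError("counter decreased between samples")
    return delta / elapsed_seconds


def donor_abort_ratio(server_status: Mapping[str, Any]) -> float:
    """Fraction of finished donor migrations that were aborted.

    Hypothetical helper; treats committed plus aborted as the total.
    """
    stats = server_status["shardingStatistics"]
    committed = stats["countDonorMoveChunkCommitted"]
    aborted = stats["countDonorMoveChunkAborted"]
    total = committed + aborted
    return aborted / total if total else 0.0
```

      For example, two samples taken 4 seconds apart with countBytesClonedOnRecipient at 1000 and then 5000 give 1000 bytes per second.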

            Assignee:
            jason.price@mongodb.com Jason Price
            Reporter:
            backlog-server-pm Backlog - Core Eng Program Management Team

              Created:
              Updated:
              Resolved:
              47 weeks, 1 day ago