[DOCS-13591] Investigate changes in SERVER-46396: Add metrics to track number of operations blocked behind a catalog cache refresh Created: 14/Apr/20  Updated: 13/Nov/23  Due: 17/Apr/20  Resolved: 15/May/20

Status: Closed
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: 4.4.0-rc2, 4.2.7, 4.7.0, Server_Docs_20231030, Server_Docs_20231106, Server_Docs_20231105, Server_Docs_20231113

Type: Task Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team Assignee: Andrew Feierabend (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-46396 Add metrics to track number of operat... Closed
Participants:
Days since reply: 3 years, 37 weeks, 1 day ago
Epic Link: DOCS: 4.4 Server Release Work

 Description   

Downstream Change Summary

Added a new metric under shardingStatistics.catalogCache that will be incremented only on mongos:

https://docs.mongodb.com/manual/reference/command/serverStatus/#shardingstatistics

It will be the field operationsBlockedByRefresh, a container with six counters:

operationsBlockedByRefresh.countAllOperations
operationsBlockedByRefresh.countInserts
operationsBlockedByRefresh.countQueries
operationsBlockedByRefresh.countUpdates
operationsBlockedByRefresh.countDeletes
operationsBlockedByRefresh.countCommands

This metric will track every operation that flows through the mongos (router) and has to block on a catalog cache refresh while it runs.
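
Because the counters are only ever incremented, one way to see how many operations blocked during a given window is to compare two snapshots. The following is a minimal sketch in Python, assuming each snapshot is the operationsBlockedByRefresh sub-document already loaded as a plain dict. The helper name blocked_ops_delta and the sample values are hypothetical and do not come from the server or any driver.

```python
# Field names are taken from the ticket; everything else is illustrative.
COUNTER_FIELDS = (
    "countAllOperations",
    "countInserts",
    "countQueries",
    "countUpdates",
    "countDeletes",
    "countCommands",
)


def blocked_ops_delta(earlier, later):
    """Return the per-counter increase between two snapshots.

    Missing counters are treated as 0. That default is an assumption
    made for convenience here, not documented server behavior.
    """
    return {
        field: later.get(field, 0) - earlier.get(field, 0)
        for field in COUNTER_FIELDS
    }


# Hypothetical sample values, not real server output.
before = {"countAllOperations": 10, "countInserts": 4, "countQueries": 6}
after = {"countAllOperations": 15, "countInserts": 5, "countQueries": 10}

delta = blocked_ops_delta(before, after)
print(delta)
```

A negative delta would suggest the counters were reset between the two snapshots (for example, by a process restart), so a caller may want to handle that case explicitly.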

This metric will be backported to 4.2.

Top-level view:

"shardingStatistics" : {
   "catalogCache" : {
      "numDatabaseEntries" : NumberLong(<num>),
      "numCollectionEntries" : NumberLong(<num>),
      "countStaleConfigErrors" : NumberLong(<num>),
      "totalRefreshWaitTimeMicros" : NumberLong(<num>),
      "numActiveIncrementalRefreshes" : NumberLong(<num>),
      "countIncrementalRefreshesStarted" : NumberLong(<num>),
      "numActiveFullRefreshes" : NumberLong(<num>),
      "countFullRefreshesStarted" : NumberLong(<num>),
      "countFailedRefreshes" : NumberLong(<num>),
      "operationsBlockedByRefresh" : {
         "countAllOperations" : NumberLong(<num>),
         "countInserts" : NumberLong(<num>),
         "countQueries" : NumberLong(<num>),
         "countUpdates" : NumberLong(<num>),
         "countDeletes" : NumberLong(<num>),
         "countCommands" : NumberLong(<num>)
      }
   }
},
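
The nesting above can be walked defensively, which may be useful because the field is expected to be populated only on mongos. The following is a rough sketch in Python that operates on a serverStatus-shaped dict. The helper name get_blocked_by_refresh and the sample values are hypothetical. With PyMongo, the dict would typically come from client.admin.command("serverStatus"), but no connection is made here.

```python
def get_blocked_by_refresh(server_status):
    """Return the operationsBlockedByRefresh sub-document, or {} if absent.

    Each level falls back to an empty dict so that output lacking
    shardingStatistics or catalogCache does not raise a KeyError.
    """
    return (
        server_status.get("shardingStatistics", {})
        .get("catalogCache", {})
        .get("operationsBlockedByRefresh", {})
    )


# Hypothetical, trimmed serverStatus output with made-up values.
sample_status = {
    "shardingStatistics": {
        "catalogCache": {
            "countStaleConfigErrors": 2,
            "operationsBlockedByRefresh": {
                "countAllOperations": 7,
                "countInserts": 1,
                "countQueries": 3,
                "countUpdates": 1,
                "countDeletes": 0,
                "countCommands": 2,
            },
        }
    }
}

blocked = get_blocked_by_refresh(sample_status)
for name, value in blocked.items():
    print(f"{name}: {value}")
```

Returning an empty dict rather than raising keeps the same call usable against output that lacks the field.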

Description of Linked Ticket

Text from scope document:

"Count and percentage of number of queries that are no longer blocked, but would have been blocked on a refresh"

Scope of changes

Impact to Other Docs

MVP (Work and Date)

Resources (Scope or Design Docs, Invision, etc.)



 Comments   
Comment by Githook User [ 26/May/20 ]

Author:

{'name': 'Andrew Feierabend', 'email': 'andrew.feierabend@mongodb.com', 'username': 'andf-mongodb'}

Message: DOCS-13591 add new operationsBlockedByRefresh metric
Branch: v4.2
https://github.com/mongodb/docs/commit/44276814d083b82be961097bd05881b8c75a3263

Comment by Githook User [ 15/May/20 ]

Author:

{'name': 'Andrew Feierabend', 'email': 'andrew.feierabend@mongodb.com', 'username': 'andf-mongodb'}

Message: DOCS-13591 add new operationsBlockedByRefresh metric
Branch: v4.2.7
https://github.com/mongodb/docs/commit/7fc2607d2449f98ae6f7306a92393636cbd3dba9

Comment by Githook User [ 15/May/20 ]

Author:

{'name': 'Andrew Feierabend', 'email': 'andrew.feierabend@mongodb.com', 'username': 'andf-mongodb'}

Message: DOCS-13591 add new operationsBlockedByRefresh metric
Branch: master
https://github.com/mongodb/docs/commit/30938cda5110ddaf5c97addbbc5caa955321370a

Comment by Blake Oler [ 27/Apr/20 ]

Unblocked, as the above ticket is complete.

Comment by Blake Oler [ 23/Apr/20 ]

I'd like to hold off on pushing this change through until I complete SERVER-47738. Currently, the metric exists on both mongos and mongod, but I'd like to remove it from mongod completely. I talked to andrew.feierabend about this offline, and he's okay with waiting until that ticket goes through so that no documentation work is duplicated.

Generated at Thu Feb 08 08:08:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.