[SERVER-41215] Gather and expose "chunk-level" statistics Created: 17/May/19 Updated: 05/May/23 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Shakir Sadikali | Assignee: | Matt Panton |
| Resolution: | Unresolved | Votes: | 4 |
| Labels: | sharding-common-backlog | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Sharding EMEA |
| Participants: | |||||||||
| Case: | (copied to CRM) | ||||||||
| Description |
|
I'd like to propose what I believe is a lightweight solution for tracking chunk utilization. The mongos already tracks where queries should be routed for DML and find commands. Could we count inserts, updates, deletes, finds, commands, and no-ops against the chunk ranges they are sent to, and possibly also the aggregate number of documents returned from those chunks? If we tracked these numbers in memory on the mongos, we could get a sense of which chunks are "hot". In a system that is balanced by data size and by number of chunks, if we observe an imbalance in load as measured by query volume, we could then easily identify the hot chunks and take surgical action to move some or all of them manually. |
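For illustration only, here is a rough sketch of the kind of in-memory bookkeeping the description has in mind, written in Python rather than the server's C++. Everything in it is hypothetical: `ChunkStatsTracker`, its methods, and the idea of describing chunks by a sorted list of split points are assumptions made for the example, not an existing MongoDB API. It also assumes every operation targets a single shard-key value; broadcast operations that touch several chunks would need separate handling.

```python
from bisect import bisect_right
from collections import Counter

# Operation types named in the proposal.
OP_TYPES = ("insert", "update", "delete", "find", "command", "noop")


class ChunkStatsTracker:
    """Hypothetical per-chunk operation counters (not a MongoDB API)."""

    def __init__(self, split_points):
        # N sorted split points define N + 1 chunks:
        # (-inf, s0), [s0, s1), ..., [s(N-1), +inf)
        self._splits = sorted(split_points)
        chunk_count = len(self._splits) + 1
        self._ops = [Counter() for _ in range(chunk_count)]
        self._docs = [0] * chunk_count

    def chunk_index(self, shard_key):
        # Lower bounds are inclusive, so a key equal to a split point
        # belongs to the chunk that starts at that split point.
        return bisect_right(self._splits, shard_key)

    def chunk_range(self, index):
        # None stands for an unbounded (min/max) end of the key space.
        low = self._splits[index - 1] if index > 0 else None
        high = self._splits[index] if index < len(self._splits) else None
        return (low, high)

    def record(self, op, shard_key, docs_returned=0):
        if op not in OP_TYPES:
            raise ValueError(f"unknown operation type: {op}")
        index = self.chunk_index(shard_key)
        self._ops[index][op] += 1
        self._docs[index] += docs_returned

    def stats(self, index):
        # Per-operation counts and total documents returned for one chunk.
        return dict(self._ops[index]), self._docs[index]

    def hottest(self, n=1):
        # Rank chunks by total operation count, busiest first;
        # ties are broken by chunk position.
        totals = [(sum(c.values()), i) for i, c in enumerate(self._ops)]
        totals.sort(key=lambda t: (-t[0], t[1]))
        return [(self.chunk_range(i), total) for total, i in totals[:n]]


tracker = ChunkStatsTracker([100, 200])
tracker.record("find", 150, docs_returned=3)
tracker.record("insert", 150)
tracker.record("update", 50)
print(tracker.hottest(1))
```

With counters like these, the "hot" chunks are simply the top entries returned by `hottest`, which could then be compared against the data-size and chunk-count balance to pick candidates for a manual move.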
| Comments |
| Comment by Connie Chen [ 18/Jan/23 ] |
|
Taking out of PM-631 and placing in "Needs Scheduling", since we have closed PM-631 as Won't Do.
| Comment by Kaloian Manassiev [ 17/May/19 ] |
|
Starting with MongoDB 4.2, these statistics are tracked on the mongod, which gives a more accurate picture of utilization. We are trying to move away from having mongos instances track anything because of their ephemeral nature (they are stateless, so they can come and go). |