[SERVER-47641] Limit size of serverStatus metrics for the range deleter Created: 17/Apr/20 Updated: 29/Oct/23 Resolved: 22/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.0-rc3, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Saltz (Inactive) | Assignee: | Gregory Noma |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.4 |
| Sprint: | Sharding 2020-05-04 |
| Participants: | |
| Linked BF Score: | 95 |
| Description |
|
|
| Comments |
| Comment by Githook User [ 22/Apr/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'} Message: This reverts commit 62d9485657717bf61fbb870cb3d09b52b1a614dd. |
| Comment by Githook User [ 22/Apr/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'} Message: (cherry picked from commit fa945325938ada67a088e7dbe951404d092e8771) |
| Comment by Githook User [ 22/Apr/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'} Message: |
| Comment by Githook User [ 22/Apr/20 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 21/Apr/20 ] |
|
Author: {'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'} Message: |
| Comment by Bruce Lucas (Inactive) [ 21/Apr/20 ] |
|
Thanks, that definitely sounds useful. |
| Comment by Gregory Noma [ 21/Apr/20 ] |
|
bruce.lucas Per your insight, we've decided to go with reporting a single number representing the total number of range deletion tasks, rather than per collection. |
| Comment by Bruce Lucas (Inactive) [ 20/Apr/20 ] |
|
Specifically, it means the addition or deletion of a key, where a key is a path through the document from the root that leads to a numeric value. A schema change is very expensive because it requires starting a new chunk; each chunk is a reference document (big, somewhat compressible) with a bunch of delta-coded arrays of values, one for each key in the reference document (very highly compressible). The number of keys is also a consideration; thousands or tens of thousands of them would inflate FTDC data and also reduce retention. Isn't this information also obtainable from the logs, with a little analysis? From an FTDC perspective I think it would be best to omit this information; that could be done with a parameter to serverStatus. I think it's also iffy to have this in serverStatus in general: how, for example, does a huge serverStatus impact Cloud monitoring? |
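The definition of a schema change given above can be illustrated with a small sketch. This is not FTDC's actual implementation; the helper names `numeric_key_paths` and `schema_changed` and the sample field name `rangeDeleterTasks` are hypothetical, and the sketch assumes that only `int` and `float` leaves count as numeric values (a simplification).

```python
from typing import Any, Iterator, Tuple

Path = Tuple[str, ...]


def numeric_key_paths(doc: Any, prefix: Path = ()) -> Iterator[Path]:
    """Yield every root-to-leaf path that ends in a numeric value."""
    if isinstance(doc, dict):
        for name, value in doc.items():
            yield from numeric_key_paths(value, prefix + (name,))
    elif isinstance(doc, list):
        # Array elements are addressed by their index, as a string.
        for index, value in enumerate(doc):
            yield from numeric_key_paths(value, prefix + (str(index),))
    elif isinstance(doc, (int, float)) and not isinstance(doc, bool):
        # Assumption: bool is skipped even though it subclasses int.
        yield prefix


def schema_changed(previous: Any, current: Any) -> bool:
    """True when a key was added or removed between two samples."""
    return set(numeric_key_paths(previous)) != set(numeric_key_paths(current))


# Hypothetical samples: a per-namespace metric gains a key when a new
# namespace gets its first pending range deletion task.
sample_1 = {"rangeDeleterTasks": {"db.a": 3}}
sample_2 = {"rangeDeleterTasks": {"db.a": 2}}
sample_3 = {"rangeDeleterTasks": {"db.a": 2, "db.b": 1}}

value_only_change = schema_changed(sample_1, sample_2)   # False
new_namespace_change = schema_changed(sample_2, sample_3)  # True
```

Under these assumptions, a change in a counter's value leaves the key set intact, while a namespace appearing in or disappearing from a per-namespace field counts as a schema change every time it happens.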
| Comment by Matthew Saltz (Inactive) [ 17/Apr/20 ] |
|
Sure, it's a BSONArray of the form [ "mynamespace" : <number of range deletion tasks pending for mynamespace>, ..., "mylastnamespace" : <number of range deletion tasks ...> ]. Also, I'm not sure what you mean by "it's also a question of schema changes in serverStatus which are very expensive from an ftdc perspective". By schema change, do you mean the addition of a new field and/or a modification of the format of a given field? In what way is it expensive? (I'm not super familiar with the process for obtaining FTDC data, so forgive me if this is a basic question.) |
| Comment by Bruce Lucas (Inactive) [ 17/Apr/20 ] |
|
It's not just a question of the number of keys; it's also a question of schema changes in serverStatus which are very expensive from an ftdc perspective, depending on the rate at which they occur. Can you describe or point to a description of the content of this field? As a general rule, it's not a good idea to put per-namespace info in serverStatus. |
| Comment by Matthew Saltz (Inactive) [ 17/Apr/20 ] |
|
Two possible options would be to limit it to a certain BSONObj size, or to limit it to a maximum number of namespaces. When the limit is exceeded, we could instead report the total number of range deletion tasks across all namespaces rather than reporting that number per namespace. bruce.lucas, would you have a preference? Do you think there's some number of namespaces after which the information simply becomes unwieldy? Note that the same (and in fact, more detailed) information will also be visible in the config.rangeDeletions collection on each shard. |
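The second option proposed here (a cap on the number of namespaces, with a fallback to a single total) could look roughly like the sketch below. It is illustrative only and is not the server's code: the function name `range_deleter_metric`, the parameter `max_namespaces`, and the sample namespaces are all hypothetical. Per the later comment in this ticket, the approach finally chosen was to always report a single total.

```python
from typing import Dict, Union


def range_deleter_metric(
    pending_by_namespace: Dict[str, int], max_namespaces: int
) -> Union[Dict[str, int], int]:
    """Report per-namespace counts, or one total once the cap is exceeded."""
    if len(pending_by_namespace) <= max_namespaces:
        # Small enough: keep the per-namespace breakdown.
        return dict(pending_by_namespace)
    # Too many namespaces: collapse to the total across all of them.
    return sum(pending_by_namespace.values())


# Hypothetical pending range deletion task counts.
pending = {"db.a": 3, "db.b": 1, "db.c": 4}

within_cap = range_deleter_metric(pending, max_namespaces=5)
over_cap = range_deleter_metric(pending, max_namespaces=2)
```

In this sketch the reported value switches between a per-namespace mapping and a plain number at the threshold, so the set of reported keys changes whenever the cap is crossed.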