[SERVER-44839] Frequent schema changes in mongos ftdc metrics limits retention period Created: 26/Nov/19 Updated: 08/Jan/24 Resolved: 25/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Diagnostics, Sharding |
| Affects Version/s: | 4.2.0 |
| Fix Version/s: | 4.2.4, 4.3.3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Benjamin Caimano (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Sprint: | Service Arch 2020-01-13, Service Arch 2020-01-27 |
| Participants: |
| Description |
|
The schema for metrics under connPoolStats.connectionsInUsePerPool.NetworkInterfaceTL-TaskExecutorPool-0 changes frequently: hosts in that subtree appear to be emitted in an order that changes every couple of samples, and hosts frequently come and go from the metrics. These frequent schema changes greatly reduce compression efficiency and limit the retention period, to as short as 15 hours in one case. The schema for this subtree should be monotonic and consistent from one sample to the next. |
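
To illustrate the requested property, here is a minimal Python sketch, not the server's actual code. It treats a sample as a flat mapping from host name to a connection count and treats the "schema" as the ordered tuple of field names. The host names, the `Sample` shape, and the helpers `schema`, `count_schema_changes`, and `stabilize` are all hypothetical. The stabilizing step emits hosts in sorted order and keeps reporting every host seen so far (with 0 when absent), so the field set only grows and never reorders.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical sample shape: host name -> connections in use.
Sample = Dict[str, int]


def schema(sample: Sample) -> Tuple[str, ...]:
    """Treat the ordered tuple of field names as the sample's schema."""
    return tuple(sample.keys())


def count_schema_changes(samples: List[Sample]) -> int:
    """Count adjacent sample pairs whose schemas differ."""
    return sum(
        1 for prev, cur in zip(samples, samples[1:]) if schema(prev) != schema(cur)
    )


def stabilize(samples: List[Sample]) -> List[Sample]:
    """Emit hosts in sorted order and never drop a host once it was seen.

    Hosts missing from a sample are reported as 0, so the schema is
    monotonic: it changes only when a previously unseen host appears.
    """
    seen: Set[str] = set()
    out: List[Sample] = []
    for sample in samples:
        seen.update(sample)
        out.append({host: sample.get(host, 0) for host in sorted(seen)})
    return out


# Made-up samples that reorder hosts and drop/re-add one of them.
raw: List[Sample] = [
    {"hostB:27017": 2, "hostA:27017": 1},
    {"hostA:27017": 1, "hostB:27017": 3},  # same hosts, different order
    {"hostB:27017": 1},                    # hostA dropped out
    {"hostA:27017": 4, "hostB:27017": 1},  # hostA came back
]

unstable_changes = count_schema_changes(raw)
stable_changes = count_schema_changes(stabilize(raw))
print(unstable_changes, stable_changes)
```

In this made-up data every adjacent pair of raw samples differs in schema, while the stabilized sequence keeps a single schema throughout.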
| Comments |
| Comment by Githook User [ 02/Mar/20 ] | ||
|
Author: {'name': 'Ben Caimano', 'email': 'ben.caimano@10gen.com'} Message: (cherry picked from commit d86e7c464d276fbd40570a4a2a7144fe133bd780) | ||
| Comment by Githook User [ 25/Jan/20 ] | ||
|
Author: {'email': 'ben.caimano@10gen.com', 'name': 'Ben Caimano'} Message: | ||
| Comment by Benjamin Caimano (Inactive) [ 14/Jan/20 ] | ||
|
I've filed a separate ticket ( | ||
| Comment by Bruce Lucas (Inactive) [ 13/Jan/20 ] | ||
|
Repro script attached. | ||
| Comment by Benjamin Caimano (Inactive) [ 13/Jan/20 ] | ||
|
Linking the last ticket that touched the RSM/ConnPool FTDC connection | ||
| Comment by Bruce Lucas (Inactive) [ 07/Jan/20 ] | ||
|
Git bisect identifies the following commit as when the problem started:
Since this commit isn't directly related to connection pool stats or FTDC data, I imagine the change in behavior of the connection pool stats is an unfortunate side effect, and some compensating change in how connection pool stats are collected for FTDC will be needed to make them stable and monotonic again. | ||
| Comment by Bruce Lucas (Inactive) [ 07/Jan/20 ] | ||
|
Testing confirms that this is a 4.2 regression, introduced between 4.2.0-rc4 and 4.2.0-rc5. |