[SERVER-22671] Implement serverStatus section with active migrations Created: 16/Feb/16 Updated: 05/Dec/16 Resolved: 24/Aug/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.3.12 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Sharding 18 (08/05/16), Sharding 2016-08-29 |
| Participants: |
| Description |
|
Implement an optional serverStatus section, available both on the config server and on the shards, which returns information about all the active chunk migrations. This section will be used by support and possibly by the balancer when it resumes from primary stepdown. On the shards, this serverStatus section should look like this (and should be placed under the optional sharding section):
On the config server, the section should be the union of all migrations across all shards. |
| Comments |
| Comment by Githook User [ 24/Aug/16 ] |
|
Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com> Message: |
| Comment by James Wahlin [ 20/Jul/16 ] |
|
An aggregation document source for index access stats ($indexStats) was added under this commit: https://github.com/mongodb/mongo/commit/ae9df7fb11cf359686699aeb9539cb6dc35de675 |
| Comment by Andy Schwerin [ 20/Jul/16 ] |
|
If you have a non-fixed size amount of data to return, a cursor is a good idea. I have been encouraging people to write new operations that return cursors as aggregation stages. I think james.wahlin might have already done that once in the last few months. |
| Comment by Kaloian Manassiev [ 20/Jul/16 ] |
|
Since the contents of this section may theoretically be large for a large cluster with many ongoing migrations, there is a risk that we might exceed the 16MB command BSON response limit, given the already large serverStatus return value. Because of this, I think it might be more appropriate to implement this as a command which returns a cursor. schwerin/david.storch, do you know when it is appropriate to use a cursor-based command instead of a serverStatus section, and whether there are any plans to make serverStatus return a cursor? |
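To illustrate the size concern raised here, the following is a minimal sketch of how a cursor-style response could split a list of migration documents into batches that each stay under a byte limit. It is not the server's implementation: the function name `batch_documents` is hypothetical, and the JSON-encoded length is used only as a rough stand-in for the real BSON size.

```python
import json

# 16MB limit mentioned in the comment; used here as the default batch cap.
MAX_RESPONSE_BYTES = 16 * 1024 * 1024


def batch_documents(docs, max_bytes=MAX_RESPONSE_BYTES):
    """Yield lists of documents whose estimated total size fits in max_bytes.

    Assumption: JSON length approximates BSON size. A real server would
    measure the BSON encoding instead.
    """
    batch, size = [], 0
    for doc in docs:
        doc_size = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch before it would overflow. A single
        # oversized document still gets a batch of its own.
        if batch and size + doc_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch
```

Each yielded list plays the role of one cursor batch, so no single reply has to carry every active migration at once, which a fixed serverStatus section cannot avoid.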
| Comment by Kaloian Manassiev [ 18/Jul/16 ] |
|
It really doesn't make sense to collect this data so frequently. Plus, the changelog collection already contains an entry for each migration as it occurs, so there is no need for customers to collect it manually, and we have tools like the shardalyzer to visualize this information. I think we are clear here that this information has no place in FTDC. |
| Comment by Bruce Lucas (Inactive) [ 18/Jul/16 ] |
|
kaloian.manassiev, you are right, I overlooked the fact that none of the data appears to be numeric (assuming ShardId is not numeric - please confirm). This means that it will be excluded from FTDC for the simple reason that FTDC can only handle numeric data. It also means that, to be used by support, it will need to be collected manually by the customer while investigating a problem. Does this match your expectation about how it would be used? If collecting and retaining high-frequency (once per second) information related to chunk migrations is important, maybe there's a useful way to summarize this information as metrics - for example, maybe the number of chunk migrations currently in each of the states (init, catchup, critical section, delete)? |
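The per-state summary suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the ticket does not show the actual document layout, so the `state` field, the state identifiers, and the sample documents are all hypothetical.

```python
from collections import Counter

# States listed in the comment; the exact identifiers are assumptions.
STATES = ("init", "catchup", "criticalSection", "delete")


def summarize_migrations(active_migrations):
    """Return a fixed-schema, numeric count of migrations per state."""
    counts = Counter(m["state"] for m in active_migrations)
    # Always emit every state, including zeros, so the schema stays stable.
    return {state: counts.get(state, 0) for state in STATES}


# Hypothetical sample data; field names are illustrative only.
sample = [
    {"ns": "test.users", "from": "shard0000", "to": "shard0001", "state": "catchup"},
    {"ns": "test.orders", "from": "shard0001", "to": "shard0002", "state": "catchup"},
    {"ns": "test.items", "from": "shard0002", "to": "shard0000", "state": "criticalSection"},
]

summary = summarize_migrations(sample)
```

Emitting a zero for every state keeps the output purely numeric with a fixed set of keys, which matches the two FTDC constraints discussed in this thread: numeric-only data and a schema that does not change between samples.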
| Comment by Kaloian Manassiev [ 18/Jul/16 ] |
|
bruce.lucas, none of the information exported by the active migrations section will contain metrics-related information. Does it make sense for it to go into FTDC and if yes, what usage scenarios do you envision? |
| Comment by Bruce Lucas (Inactive) [ 19/Apr/16 ] |
|
Since serverStatus is the basis of FTDC, we need to evaluate whether this should go into FTDC. That will depend on likely size, how dynamic the schema is, etc., which will impact FTDC data rate. |