[SERVER-22671] Implement serverStatus section with active migrations Created: 16/Feb/16  Updated: 05/Dec/16  Resolved: 24/Aug/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.3.12

Type: Task
Priority: Major - P3
Reporter: Kaloian Manassiev
Assignee: Dianna Hohensee (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-9534 Docs for SERVER-22671: Implement serv... Closed
Gantt Dependency
has to be done before SERVER-25334 Add SourceDestinationManager::getMigr... Closed
Related
related to SERVER-18940 Optimise sharded aggregations that ar... Closed
related to SERVER-26573 Poor compression of diagnostic data d... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 18 (08/05/16), Sharding 2016-08-29

 Description   

Implement an optional serverStatus section, available both on the config server and on the shards, which returns information about all the active chunk migrations. This section will be used by support and possibly by the balancer when it resumes from primary stepdown.

On the shards, this serverStatus section should look like this (and should be placed under the optional sharding section):

sharding: {
    migrations: [
        {
            source: ShardId,
            sourceHost: Host:Port,
            destination: ShardId,
            destinationHost: Host:Port,
            chunk: { min: <MinKey>, max: <MaxKey> },
            phase: String, one of <INIT, CATCHUP, CRITICAL SECTION>
        },
        ...
    ]
}

On the config server, the section should be the union of all migrations across all shards.
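A minimal sketch of how that union might be computed, written in Python for illustration rather than in the server's own language. It assumes each shard reports a list of documents in the shape shown above, and that a migration reported by both its source and its destination shard should appear only once. The names `union_migrations` and `_key`, and the choice of deduplication key, are hypothetical and do not come from this ticket.

```python
from typing import Any, Dict, List, Tuple

Migration = Dict[str, Any]


def _key(m: Migration) -> Tuple[str, str, str, str]:
    # Assumption: a migration is identified by its endpoints and chunk bounds.
    chunk = m["chunk"]
    return (m["source"], m["destination"], repr(chunk["min"]), repr(chunk["max"]))


def union_migrations(per_shard: Dict[str, List[Migration]]) -> List[Migration]:
    """Merge per-shard migration lists, keeping one entry per migration."""
    seen = set()
    merged: List[Migration] = []
    # Iterate shards in sorted order so the result is deterministic.
    for shard_id in sorted(per_shard):
        for m in per_shard[shard_id]:
            k = _key(m)
            if k not in seen:
                seen.add(k)
                merged.append(m)
    return merged
```

If the same migration is reported in different phases by its two participants, this sketch keeps whichever report it sees first; a real implementation would need a rule for reconciling them.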



 Comments   
Comment by Githook User [ 24/Aug/16 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: SERVER-22671 adding migration status to serverStatus' sharding section
Branch: master
https://github.com/mongodb/mongo/commit/3de973e1fd98473fbf1605e2d6039214aa15b2a4

Comment by James Wahlin [ 20/Jul/16 ]

An aggregation document source for index access stats ($indexStats) was added under this commit: https://github.com/mongodb/mongo/commit/ae9df7fb11cf359686699aeb9539cb6dc35de675

Comment by Andy Schwerin [ 20/Jul/16 ]

If you have a non-fixed size amount of data to return, a cursor is a good idea. I have been encouraging people to write new operations that return cursors as aggregation stages. I think james.wahlin might have already done that once in the last few months.

Comment by Kaloian Manassiev [ 20/Jul/16 ]

Since the contents of this section may theoretically be large for a large cluster with many ongoing migrations, there is a risk that we might exceed the 16MB command BSON response limit of the already large serverStatus return value. Because of this, I think it might be more appropriate to implement this as a command, which returns a cursor.

schwerin/david.storch, do you know when it is appropriate to use a cursor-based command instead of a serverStatus section, and whether there are any plans to make serverStatus return a cursor?
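To illustrate the size concern, here is a rough sketch of splitting a list of migration documents into size-bounded batches, which is the general idea behind returning results through a cursor. It is not how the server implements cursors. The function name `batches` is hypothetical, and the encoded JSON length is used only as a stand-in for the real BSON size.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def batches(docs: Iterable[Dict[str, Any]], max_bytes: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of documents whose combined encoded size stays within max_bytes.

    A single document larger than max_bytes is still yielded, alone in its batch.
    """
    batch: List[Dict[str, Any]] = []
    size = 0
    for doc in docs:
        # Assumption: JSON byte length approximates the document's wire size.
        doc_size = len(json.dumps(doc).encode("utf-8"))
        if batch and size + doc_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch
```

Each yielded batch corresponds to one reply, so no single response has to hold the full set of migrations.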

Comment by Kaloian Manassiev [ 18/Jul/16 ]

It really doesn't make sense to collect this data so frequently. Plus, the changelog collection already contains an entry for each migration as it occurs, so there is no need for customers to collect it manually, and we have tools like the shardalyzer to visualize this information.

I think we are clear here that this information has no place in FTDC.

Comment by Bruce Lucas (Inactive) [ 18/Jul/16 ]

kaloian.manassiev, you are right, I overlooked the fact that none of the data appears to be numeric (assuming ShardId is not numeric - pls confirm). This means that it will be excluded from FTDC for the simple reason that FTDC can only handle numeric data.

This means that to be used by support it will need to be collected manually by the customer while investigating a problem. Does this match your expectation about how it would be used? If collecting and retaining high-frequency (once per second) information related to chunk migrations is important, maybe there's a useful way to summarize this information as metrics - for example, the number of chunk migrations currently in each of the states (init, catchup, critical section, delete)?
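A small sketch of the per-state summary suggested above, in Python for illustration. It assumes migration documents carry a `phase` string as in the description; the `DELETE` label is taken from the states listed in this comment and does not appear in the description's phase list. The name `count_by_phase` is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Fixed set of phases, so the output schema does not change between samples.
PHASES = ("INIT", "CATCHUP", "CRITICAL SECTION", "DELETE")


def count_by_phase(migrations: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Return the number of active migrations in each known phase."""
    counts = Counter(m["phase"] for m in migrations)
    # Phases not listed in PHASES are ignored in this sketch.
    return {phase: counts.get(phase, 0) for phase in PHASES}
```

The result is purely numeric with a stable set of keys, which is the kind of data the comment says FTDC can handle.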

Comment by Kaloian Manassiev [ 18/Jul/16 ]

bruce.lucas, none of the information exported by the active migrations section will contain metrics-related information. Does it make sense for it to go into FTDC and if yes, what usage scenarios do you envision?

Comment by Bruce Lucas (Inactive) [ 19/Apr/16 ]

Since serverStatus is the basis of FTDC, we need to evaluate whether this should go into FTDC. That will depend on likely size, how dynamic the schema is, etc., which will impact FTDC data rate.

Generated at Thu Feb 08 04:01:06 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.