[SERVER-20156] Disk iops of balancer 10 times larger than simple insertions Created: 27/Aug/15  Updated: 28/Aug/15  Resolved: 27/Aug/15

Status: Closed
Project: Core Server
Component/s: Admin, Sharding
Affects Version/s: 3.0.5
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: patrick wong Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

We want to know whether the balancer is efficient enough to rebalance data when we add a new, empty shard to a cluster.

Test details:

mongodb version : 3.0.5
OS : centos 6

We used tag-aware sharding to pre-allocate the chunks on shard2 and then inserted around 10 million records into it.

– Disk IOPS was around 10 during the insertion.

Then we removed the shard tag from shard2, added it to shard3, and let the balancer move the 10 million records from shard2 to shard3.

– Disk IOPS was 100 or more, and the whole machine slowed down.

After the test, we are worried that the performance of the entire cluster may drop when we add an empty shard, because a significant portion of system resources is spent on balancing the data.

Is improving balancer performance part of the roadmap?



 Comments   
Comment by Ramon Fernandez Marina [ 28/Aug/15 ]

The performance of the balancer is determined by the performance of your storage layer and the load on your servers and network. If a third of your data needs to be moved out, the speed limitation will most likely come from storage IOPS and network bandwidth. Tag-aware sharding may help in some cases.

That being said, you may be interested in looking at SERVER-9120 and its related tickets.

Regards,
Ramón.

Comment by patrick wong [ 28/Aug/15 ]

Thanks for your reply.

However, I am not asking how to set a balancer window.

In distributed computing, if the system cannot distribute data well among the nodes, the cluster's load cannot be shared across machines.

If the distribution process is slow, adding a machine does not improve performance immediately.

For example, I have a cluster of 3 shards, and insertions run on a 24x7 basis.

After 1 year, I need to add an empty shard to improve performance.

A quarter of that year's data needs to move to the new shard.

If balancer migration is 3 or 4 times slower than normal insertion, I may need close to 1 year to balance the data.
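
The estimate above can be checked with a quick back-of-the-envelope calculation. This is only a rough sketch under simplifying assumptions: data is spread evenly, the new shard should end up with an equal share, and data inserted while the migration runs is ignored. The function name and parameters are hypothetical and chosen for illustration.

```python
def years_to_rebalance(years_of_data: float, old_shards: int,
                       new_shards: int, slowdown: float) -> float:
    """Rough time (in years) to move the new shards' fair share of data.

    years_of_data: how long the cluster has been inserting at a steady rate.
    slowdown: how many times slower migration is than normal insertion
              (an assumed figure, not a measured one).
    """
    # With an even spread, the new shards should hold this fraction of all data.
    fraction_to_move = new_shards / (old_shards + new_shards)
    # Moving X years' worth of inserts at 1/slowdown of insert speed takes
    # X * slowdown years.
    return years_of_data * fraction_to_move * slowdown


# 3 existing shards, 1 new shard, 1 year of data.
for slowdown in (3, 4):
    print(slowdown, years_to_rebalance(1.0, 3, 1, slowdown))
```

With these assumptions, a 3x slowdown gives 0.75 years and a 4x slowdown gives 1 year, which matches the figure quoted above.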

That may cause the following issues:

1. I cannot get an immediate performance gain even though I have paid for a new machine.
2. The high disk I/O may degrade the performance of the existing cluster. A balancer window cannot help much if it is a 24x7 system and the chunk migration lasts for 1 year.

As a result, I want to know whether there is a plan to improve balancer performance, and I really hope it will be included as an enhancement in a coming release.

Comment by Ramon Fernandez Marina [ 27/Aug/15 ]

Thanks for your report, patrickwong@wisers.com. The balancer needs to read data from disk to move it off to another shard, so depending on your configuration the disk may become the bottleneck here. You may want to investigate setting a balancer window so that migrations occur only at times when I/O load will not impact production.
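
As an illustration of the balancer window mentioned above, here is a minimal sketch in Python. It builds the update document that is applied to the `balancer` document in `config.settings` (the `activeWindow` field with `start` and `stop` in HH:MM form). The helper names are hypothetical, the `in_window` check only illustrates how a window that wraps past midnight behaves and is not MongoDB's own boundary logic, and the commented-out pymongo call assumes a client connected to a mongos.

```python
from datetime import time


def parse_hhmm(label: str) -> time:
    """Parse an HH:MM string; raises ValueError if it is out of range."""
    hours, minutes = label.split(":")
    return time(int(hours), int(minutes))


def balancer_window_update(start: str, stop: str) -> dict:
    """Build the $set document for the balancer's activeWindow."""
    parse_hhmm(start)  # validate only
    parse_hhmm(stop)
    return {"$set": {"activeWindow": {"start": start, "stop": stop}}}


def in_window(now: time, start: time, stop: time) -> bool:
    """Illustrative check: is `now` inside a window that may wrap midnight?"""
    if start <= stop:
        return start <= now < stop
    return now >= start or now < stop


update = balancer_window_update("23:00", "06:00")
print(update)

# Assumed usage against a mongos (not executed here):
# from pymongo import MongoClient
# client = MongoClient("mongodb://mongos-host:27017")  # hypothetical host
# client.config.settings.update_one({"_id": "balancer"}, update, upsert=True)
```

A window such as 23:00 to 06:00 only helps when the cluster has a quiet period; it limits when migrations run and does not make them faster.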

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources.

Regards,
Ramón.

Generated at Thu Feb 08 03:53:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.