[SERVER-39683] Shards request to split the same chunk at the same time. Created: 20/Feb/19 Updated: 27/Feb/19 Resolved: 25/Feb/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.4.3 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | JackWang [X] | Assignee: | Eric Sedor |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: | |
| Issue Links: | |
| Participants: | JackWang [X], Eric Sedor |
| Description |
|
Hello. In my production environment the machine load occasionally spikes: it is normally 0-1 but rises to 5-10 during the abnormal periods. When this happens, CPU on the shard nodes is very high and the logs continuously print messages such as "Finding the split vector for ctu.mobileToken". Once the log messages stop, the machine load drops back to normal. During these periods the request volume has not increased significantly and I/O utilization is about 40%. The split-vector operation keeps running; how can I avoid it? I am looking forward to your reply, thank you!
|
| Comments |
| Comment by JackWang [X] [ 27/Feb/19 ] |
|
Hello, I am asking for help here; this relates to my work. My production cluster has accumulated 500 million documents, and machine resources are now triggering alerts. When I perform a remove operation, even on a very small amount of data (around 10,000 documents), the machine load triggers alerts and produces a lot of slow queries. I want to know how to quickly delete part of the data in a sharded collection. Looking forward to your reply. |
| Comment by JackWang [X] [ 27/Feb/19 ] |
|
Thank you for your patient explanation, but my production environment still occasionally experiences high machine load, and at those times the shard nodes keep printing "find the split vector". I do not have a concrete way to resolve this. My environment holds 500 million documents and I would like to quickly delete part of them, but I have no good approach. This is troubling me a lot and I hope you can help. |
| Comment by Eric Sedor [ 26/Feb/19 ] |
|
To clarify, we are suggesting adding 2 mongos to bring the total to 5, and this is considered a mitigation but not a solution. You would need to direct write traffic at all 5 via a connection string that included all of them. The goal of this change is to ensure that the majority of mongos-driven split attempts coincide with a chunk reaching its maximum size. Currently this is most accurately attained with 5 mongos receiving evenly distributed writes. For further discussion about how the system works, please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-user group.
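For illustration only (not part of the original reply): a minimal sketch of what directing writes at all 5 mongos could look like from an application, using PyMongo and hypothetical hostnames mongos1 through mongos5; the actual driver, hostnames, and options in your deployment will differ.

```python
from pymongo import MongoClient

# List every mongos in the seed list so the driver can spread operations across
# all of them, rather than pinning all writes to a single router.
# Hostnames below are placeholders.
client = MongoClient(
    "mongodb://mongos1:27017,mongos2:27017,mongos3:27017,"
    "mongos4:27017,mongos5:27017"
)

db = client["ctu"]
# Each write is routed via whichever mongos the driver selects, so write traffic
# is distributed across the five routers.
db["mobileToken"].insert_one({"token": "example"})
```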
|
| Comment by JackWang [X] [ 26/Feb/19 ] |
|
Thank you for your reply, but I still can't understand why adding mongos to reach 5 would solve this problem. If I increase from 3 mongos to 5 but the new mongos receive no application connections, is that enough? Or must all 5 mongos receive connections? |
| Comment by Eric Sedor [ 25/Feb/19 ] |
|
Hi JackWang@180721, thank you for your patience. We believe this is expected behavior resulting from how mongos nodes estimate when a chunk split needs to occur. Our current efforts around In the meantime, you may be able to work around the impact of this issue by increasing the number of your mongos routers to 5 to match assumptions made by the mongos autoSplit algorithm (which is influenced by a splitTestFactor of 5). |
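To illustrate the heuristic described in the comment above, here is a rough back-of-the-envelope sketch (not the actual server code) of the mongos auto-split behavior, assuming the default 64 MB chunk size and the splitTestFactor of 5 mentioned above.

```python
# Rough illustration only -- not the actual mongos implementation.
# Each mongos tracks the bytes it has routed into a chunk and asks the shard to
# compute split points ("Finding the split vector ...") once its own estimate
# exceeds maxChunkSize / splitTestFactor.
MAX_CHUNK_SIZE_BYTES = 64 * 1024 * 1024   # default 64 MB chunk size (assumption)
SPLIT_TEST_FACTOR = 5                     # per the comment above

PER_MONGOS_THRESHOLD = MAX_CHUNK_SIZE_BYTES // SPLIT_TEST_FACTOR  # ~12.8 MB

def should_request_split_check(bytes_routed_by_this_mongos: int) -> bool:
    """With 5 mongos sharing writes evenly, each router crosses this threshold at
    roughly the moment the chunk as a whole approaches 64 MB, so most split checks
    find a usable split point. With only 3 mongos, each router crosses it while the
    chunk is still well under the maximum, so the shard repeatedly computes split
    vectors that produce no split (the repeated log lines seen here)."""
    return bytes_routed_by_this_mongos >= PER_MONGOS_THRESHOLD
```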
| Comment by JackWang [X] [ 21/Feb/19 ] |
|
3 mongos, 3 config servers, and 3 shards. |
| Comment by Eric Sedor [ 21/Feb/19 ] |
|
Thanks for writing in. We are investigating and will get back to you with any questions we have. For now, we do have one question: can you let us know how many mongos routers are in this deployment? |