[SERVER-31688] W SHARDING [conn161595] can't accept new chunks because there are still 1 deletes from previous migration Created: 24/Oct/17  Updated: 27/Oct/23  Resolved: 25/Oct/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: vegaoqiang Assignee: Kaloian Manassiev
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS 7.2


Issue Links:
Related
is related to SERVER-27009 Replication initial sync creates curs... Closed
Participants:

 Description   

Our MongoDB cluster has 8 shards. Because the chunk distribution is uneven, I checked the mongos and mongod logs and found the following messages:

2017-10-24T14:36:54.135+0800 W SHARDING [conn161595] can't accept new chunks because there are still 1 deletes from previous migration
2017-10-24T14:36:54.318+0800 I SHARDING [conn161595] Refreshing chunks for collection appdata.App based on version 2133|4||5731b54b1f4029ea4c489504
2017-10-24T14:36:54.321+0800 I SHARDING [CatalogCacheLoader-16123] Refresh for collection appdata.App took 2 ms and found version 2133|4||5731b54b1f4029ea4c489504
2017-10-24T14:36:54.321+0800 W SHARDING [conn161595] can't accept new chunks because there are still 1 deletes from previous migration
2017-10-24T14:36:54.328+0800 I SHARDING [conn161599] Refreshing chunks for collection appdata.AppTrend based on version 3933|22||5731b7bc1f4029ea4c48954e
2017-10-24T14:36:54.330+0800 I SHARDING [CatalogCacheLoader-16125] Refresh for collection appdata.AppTrend took 2 ms and found version 3933|22||5731b7bc1f4029ea4c48954e
2017-10-24T14:36:54.332+0800 W SHARDING [conn161599] can't accept new chunks because there are still 1 deletes from previous migration
2017-10-24T14:36:54.386+0800 I SHARDING [conn161597] Refreshing chunks for collection appdata.SearchWordHistoryUS based on version 1216|256||59360a83e1421c1ec352de6f
2017-10-24T14:36:54.391+0800 I SHARDING [CatalogCacheLoader-16123] Refresh for collection appdata.SearchWordHistoryUS took 4 ms and found version 1216|256||59360a83e1421c1ec352de6f
2017-10-24T14:36:54.393+0800 W SHARDING [conn161597] can't accept new chunks because there are still 1 deletes from previous migration
2017-10-24T14:36:54.530+0800 I SHARDING [conn161598] Refreshing chunks for collection appdata.ASORank based on version 233434|5215||56f8eab11f4029ea4c3d4ccc
2017-10-24T14:36:54.547+0800 I SHARDING [conn161599] Refreshing chunks for collection appdata.ASORankUS based on version 49871|5445||594901feaa03025ae98e1d0c
2017-10-24T14:36:54.588+0800 I SHARDING [CatalogCacheLoader-16123] Refresh for collection appdata.ASORankUS took 40 ms and found version 49871|5445||594901feaa03025ae98e1d0c

So how can I find the deletes left over from the previous migration and remove them manually? And is there any way to resolve this problem without restarting the shard?



 Comments   
Comment by Kaloian Manassiev [ 25/Oct/17 ]

vegaoqiang, no problem. I presume each of your shards is a replica set - is that correct? If so, this may potentially be related to SERVER-27009.

The message in question should look something like this:

2016-11-09T16:09:06.572+0000 I SHARDING [RangeDeleter] waiting for open cursors before removing range [{ build_id: "337bc5b6432ea606a010e4c95a5e5f9a", test_id: ObjectId('57f3eb919041302d8b03ffdf'), seq: 1 }, { build_id: "337c88bdf0f88e7c95d9ba482d042e71", test_id: ObjectId('57d1b969be07c42b9805e57f'), seq: 2 }) in buildlogs.logs, elapsed secs: 499819, cursor ids: [45904553724]
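
Given a message like the one above, the blocking cursor can be killed by id from the mongo shell. Below is a minimal sketch, assuming the namespace (buildlogs.logs) and cursor id (45904553724) from the example message; substitute the values reported in your own shard's log:

// Run against the primary of the shard that logs the [RangeDeleter] warning.
// Namespace and cursor id come from the "cursor ids: [...]" part of the message.
db.getSiblingDB("buildlogs").runCommand({
    killCursors: "logs",
    cursors: [ NumberLong("45904553724") ]
})
// The reply lists cursorsKilled / cursorsNotFound / cursorsAlive, which confirms
// whether the blocking cursor was actually removed.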

Comment by vegaoqiang [ 25/Oct/17 ]

Hi Mr. Kal,
Thank you for your reply. Sorry, I can't attach the complete log because it is several GB in size. Last night I had to restart the problem shard, but next time I will use your approach. Thank you!

Comment by Kaloian Manassiev [ 24/Oct/17 ]

Hi vegaoqiang,

The most likely reason for the outstanding deletes from a previous migration is that the orphan cleanup process is blocked behind open cursors. Finding and killing this cursor should unblock orphan cleanup.

Would it be possible to attach the complete log from the same shard node? There should be a message from the [RangeDeleter] thread, which indicates the cursors on which it is blocked.

Best regards,
-Kal.
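
As a quick cross-check before hunting through the log, the cursor counters in serverStatus show whether the shard is holding cursors open at all. A minimal mongo shell sketch (this only reports counts; the [RangeDeleter] message is still needed to identify the exact cursor id):

// On the primary of the affected shard: open cursor counts. A persistently
// non-zero pinned or noTimeout count is a hint that something is holding
// cursors open and may be blocking orphan cleanup.
var c = db.serverStatus().metrics.cursor;
printjson({ open: c.open, timedOut: c.timedOut });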
