Type: Improvement
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Sharding
Labels:
- sharding-wfbf-day

Assigned Teams: Sharding EMEA
Sprint: Sharding EMEA 2023-04-17
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Consider the following case:

- A large volume of insertions.
- Migration is slow because of slow hardware and many indices (e.g. 20).
- Consequently, a moveChunk operation takes a long time (e.g. 1 min).
- Consequently, every split attempted during that time fails because the namespace (ns) is locked, and chunks grow larger.
- Consequently, chunks take even longer to move. This downward spiral keeps getting worse.
- Eventually, chunks cannot be moved at all: each migration is aborted after a few minutes and no progress is made, yet the system stays busy the whole time trying to migrate those documents.
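
To make the feedback loop concrete, here is a toy model of the spiral. It is only a sketch under stated assumptions: the function name, the rates, and the timeout are all hypothetical, and the model simply assumes that the next candidate chunk absorbs every insert for as long as the namespace stays locked. It does not reflect the server's actual balancer logic.

```python
def simulate_spiral(start_mb, insert_mb_per_s, migrate_mb_per_s, timeout_s, rounds):
    """Return one (chunk_mb, seconds_needed, aborted) tuple per migration attempt.

    Hypothetical toy model, not server behaviour: splits fail while a
    migration holds the namespace lock, so the candidate chunk keeps growing.
    """
    size_mb = start_mb
    history = []
    for _ in range(rounds):
        needed_s = size_mb / migrate_mb_per_s      # time a full move would take
        aborted = needed_s > timeout_s             # move gives up at the timeout
        locked_s = min(needed_s, timeout_s)        # how long splits were blocked
        history.append((size_mb, needed_s, aborted))
        # No split could run during the locked window, so all inserts
        # from that window end up in the next candidate chunk.
        size_mb = size_mb + insert_mb_per_s * locked_s
    return history


if __name__ == "__main__":
    # Assumed numbers: 64 MB chunk, inserts outpace migration 2:1, 5 min timeout.
    for chunk_mb, needed_s, aborted in simulate_spiral(64, 2, 1, 300, 4):
        print(chunk_mb, needed_s, "aborted" if aborted else "moved")
```

In this model the chunk size only converges when inserts are slower than migration; once the time needed exceeds the timeout, every later attempt aborts as well.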

I think we need several server improvements:

A. Any chunk migration aborted due to a timeout should result in a split. If anything, the split won't hurt. Right now the split seems to happen only in one specific case.
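
A minimal sketch of proposal A, assuming integer shard-key ranges and a caller-supplied `migrate` callable. `Chunk`, `MigrationTimeout`, `split_in_half`, and `migrate_or_split` are hypothetical names used for illustration, not server APIs, and splitting at the key midpoint is just a stand-in for whatever split point the server would choose.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    min_key: int  # inclusive lower bound (hypothetical integer shard key)
    max_key: int  # exclusive upper bound


class MigrationTimeout(Exception):
    """Raised by the (hypothetical) migrate callable when a move times out."""


def split_in_half(chunk):
    """Split at the key midpoint; return the chunk unchanged if it cannot split."""
    mid = (chunk.min_key + chunk.max_key) // 2
    if mid == chunk.min_key:
        return [chunk]
    return [Chunk(chunk.min_key, mid), Chunk(mid, chunk.max_key)]


def migrate_or_split(chunk, migrate):
    """Try to move the chunk; on a timeout abort, always fall back to a split.

    Returns (moved, chunks): the pieces left behind are smaller, so the
    next migration attempt has less data to move.
    """
    try:
        migrate(chunk)
        return True, [chunk]
    except MigrationTimeout:
        return False, split_in_half(chunk)
```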

B. Ideally, the migration process would avoid retrying the same chunk over and over. It may need some randomization when picking candidate chunks.
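
One way proposal B could look, as a sketch only: pick a random candidate, preferring chunks that have not failed recently. `pick_candidate` and the `recently_failed` set are hypothetical; how the balancer would actually track failures is not specified in this ticket.

```python
import random


def pick_candidate(chunk_ids, recently_failed, rng=random):
    """Pick a random chunk to migrate, avoiding recently failed ones if possible.

    Falls back to the full candidate list when every chunk has failed
    recently, and returns None when there are no candidates at all.
    """
    candidates = list(chunk_ids)
    if not candidates:
        return None
    fresh = [c for c in candidates if c not in recently_failed]
    return rng.choice(fresh or candidates)
```

Passing a seeded `random.Random` instance as `rng` keeps the choice reproducible in tests.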

C. When mongos fails to split because the ns is locked, it should mark the metadata as "needs split" for later. Ideally, all "needs split" marks should be cleared before the next migration is attempted.
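
Proposal C could be sketched as a small tracker that remembers failed splits and drains them before the next migration. This is an illustration under assumptions: `SplitTracker`, `NamespaceLocked`, and the `split` callable are hypothetical names, and the real "needs split" mark would live in the sharding metadata rather than in memory.

```python
class NamespaceLocked(Exception):
    """Raised by the (hypothetical) split callable while the ns is locked."""


class SplitTracker:
    def __init__(self, split):
        self._split = split        # callable(chunk_id); may raise NamespaceLocked
        self.needs_split = set()   # chunks whose split was deferred

    def try_split(self, chunk_id):
        """Attempt a split; on a locked namespace, mark the chunk for later."""
        try:
            self._split(chunk_id)
        except NamespaceLocked:
            self.needs_split.add(chunk_id)
            return False
        self.needs_split.discard(chunk_id)
        return True

    def drain(self):
        """Retry every deferred split; return True once none are pending."""
        for chunk_id in sorted(self.needs_split):
            self.try_split(chunk_id)
        return not self.needs_split
```

A balancer loop built on this would call `drain()` first and start the next migration only when it returns `True`.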

This is all to avoid the catch-22 where large chunks end up clogging the whole system.

Related to: SERVER-44088 Autosplitter seems to ignore some fat chunks (Closed)

Assignee: Pierlauro Sciarelli
Reporter: Antoine Girbal (Inactive)
Participants: Antoine Girbal, Kevin J. Rice, Nic Cottrell, Pierlauro Sciarelli
Votes: 1
Watchers: 6

Created: Jun 25 2013 06:14:59 PM UTC
Updated: Apr 14 2023 10:21:05 AM UTC
Resolved: Apr 14 2023 10:21:05 AM UTC
