Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 3.4.14, 3.6.4, 3.7.3
Affects Version/s: 3.2.18, 3.4.10, 3.6.2
Component/s: Sharding
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v3.6, v3.4
Sprint: Sharding 2018-02-12, Sharding 2018-02-26
Case:
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

From visual inspection of the chunk migration code, there are at least 3 places where nodes can unnecessarily sleep for up to a second:

On the donor, during awaitUntilCriticalSectionIsAppropriate: if the clone phase takes more than 500 ms, we waste half a second on average in idle time before the donor can enter the critical section.
On the recipient, during the catch-up phase.
On the recipient, during the final majority commit.
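
The "half a second on average" figure follows from polling at a fixed interval: a waiter that only re-checks once per interval notices a completed phase at the next check, not when it actually finished. A minimal sketch of that arithmetic follows. It is not the server's code; `poll_wake_time`, `wasted_idle`, `average_waste`, and the 1-second interval are illustrative assumptions taken from the "up to a second" figure above.

```python
import math


def poll_wake_time(ready_at: float, interval: float = 1.0) -> float:
    """Time at which a fixed-interval poller notices a condition.

    Assumes (hypothetically) that the poller checks at t = 0, interval,
    2 * interval, ... and that the condition becomes true at `ready_at`.
    """
    if ready_at <= 0:
        return 0.0
    # The first check at or after `ready_at` is the next multiple of `interval`.
    return math.ceil(ready_at / interval) * interval


def wasted_idle(ready_at: float, interval: float = 1.0) -> float:
    """Idle time between the condition becoming true and the poller noticing."""
    return poll_wake_time(ready_at, interval) - ready_at


def average_waste(interval: float = 1.0, samples: int = 1000) -> float:
    """Mean idle time when the ready time is spread evenly over one interval."""
    total = sum(
        wasted_idle((i + 0.5) * interval / samples, interval)
        for i in range(samples)
    )
    return total / samples
```

Under these assumptions, a phase that finishes 0.7 s after the last check is noticed about 0.3 s late, and the mean delay over evenly spread finish times comes out to half the polling interval.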

These sleeps unnecessarily lengthen the balancer round and pollute the logs. They can also cause more modifications to accumulate on the donor, potentially increasing the duration of the critical-section catch-up phase.

Most likely they are artifacts of MMAPv1, where chunk migration was intentionally slowed down so as not to interfere with the live workload, and they no longer make sense for WiredTiger. If we remove them, we should still preserve a comparable throttle for MMAPv1.
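
One possible shape for such a change is to block on a notification instead of sleeping for a fixed interval, and to keep a separate, explicit throttle for engines that need it. The sketch below only illustrates that idea with Python's standard `threading.Event`; it does not reflect the actual fix, and `wait_for_phase`, `run_demo`, and the `throttle_seconds` parameter are hypothetical names.

```python
import threading
import time


def wait_for_phase(done: threading.Event,
                   timeout: float = 1.0,
                   throttle_seconds: float = 0.0) -> bool:
    """Block until `done` is set or `timeout` elapses.

    Returns True if the event was set. `throttle_seconds` is a hypothetical
    knob standing in for an engine-specific slowdown (for example MMAPv1);
    it defaults to no extra delay.
    """
    signalled = done.wait(timeout)  # wakes as soon as the event is set
    if throttle_seconds > 0:
        time.sleep(throttle_seconds)  # explicit, opt-in throttle
    return signalled


def run_demo(work_seconds: float = 0.05) -> float:
    """Measure how long the waiter blocks when the work takes `work_seconds`."""
    done = threading.Event()
    worker = threading.Timer(work_seconds, done.set)  # simulated phase completion
    start = time.monotonic()
    worker.start()
    wait_for_phase(done, timeout=1.0)
    elapsed = time.monotonic() - start
    worker.join()
    return elapsed
```

With the default arguments the waiter returns shortly after the simulated phase completes rather than at the end of the 1-second timeout, while a non-zero `throttle_seconds` reintroduces a deliberate delay.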

Assignee: Janna Golden
Reporter: Kaloian Manassiev
Participants: Githook User, Janna Golden, Kaloian Manassiev
Votes: 1
Watchers: 16

Created: Jan 25 2018 12:57:15 AM UTC
Updated: Oct 30 2023 11:09:01 PM UTC
Resolved: Feb 14 2018 02:47:38 AM UTC
Confidence Status Last Update: 01/Feb/18 8:34 PM
