WiredTiger / WT-1966

Shared cache distribution with a single participant

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.7.0
    • Labels:
      None
    • # Replies:
      8
    • Last comment by Customer:
      true

      Description

      There are cases where a shared cache is configured with a maximum of 20GB and has a single participant, yet that participant ends up being allocated only 20MB of the available space.

      Review the shared cache algorithm to ensure that a reasonable amount of space is allocated.

      The conflict here is that reclaiming space from a shared cache participant is (relatively) slow, so allowing a single member to use the entire shared cache will make it slow for new members to get ramped up.

      The particular workload does single threaded inserts across several tables, whilst constantly running checkpoints.

          Activity

          alexander.gorrod Alexander Gorrod added a comment -

          While thinking about this: it would be nice if the manager could detect that a participant is in the WT_CACHE_STUCK state and, if so, bump its allocated space.

          alexander.gorrod Alexander Gorrod added a comment -

          The cache allocation for the single member is never getting above the initial 20MB allocation.

          alexander.gorrod Alexander Gorrod added a comment -

          The change in https://github.com/wiredtiger/wiredtiger/pull/2023 isn't sufficient to fully fix the issue.

          If an application has a single insert thread, then the shared cache balancer isn't adequate. It relies on read pressure, which remains low in that case.

          I've seen cases where threads are waiting in cache_full_check, while there is lots of space in the shared cache. Waiting application threads should feed into the shared cache allocation algorithm.

          I've seen cases where the internal page count is taking up over 90% of the cache size - the proportion of internal pages should be taken into account in the shared cache allocation algorithm.

          alexander.gorrod Alexander Gorrod added a comment -

          The shared cache is currently too naive to capture state adequately and allocate resources. It currently looks only at cache read pressure, which isn't always adequate. Other things I think the shared cache balancing code needs to look at are:

          • Cache write pressure
          • Whether application threads are waiting for space
          • Whether application threads are contributing to eviction
          • Whether the eviction server is stuck

          The shared cache server should also consider how much of the allocated pool is currently distributed.

          Adding these checks will require making the shared cache balance algorithm more sophisticated. Hopefully each participant can be assigned a pressure score, and the space gets assigned based on the pressure score.
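
A minimal sketch of what a per-participant pressure score could look like, combining the signals listed above. Everything here is hypothetical and for illustration only: the `ParticipantStats` fields, the weights, and the `split_pool` helper are assumptions, not WiredTiger's actual data structures or algorithm.

```python
from dataclasses import dataclass


@dataclass
class ParticipantStats:
    """Hypothetical counters sampled since the last balance pass."""
    name: str
    pages_read: int       # cache read pressure (the existing signal)
    pages_written: int    # cache write pressure
    app_waits: int        # application threads that waited for cache space
    app_evictions: int    # pages evicted by application threads
    eviction_stuck: bool  # eviction server cannot make progress


# Illustrative weights only; none of these values come from WiredTiger.
WEIGHTS = {"read": 1, "write": 1, "wait": 50, "app_evict": 10}
STUCK_BONUS = 1000


def pressure_score(s):
    """Fold all signals into one integer score for a participant."""
    score = (WEIGHTS["read"] * s.pages_read
             + WEIGHTS["write"] * s.pages_written
             + WEIGHTS["wait"] * s.app_waits
             + WEIGHTS["app_evict"] * s.app_evictions)
    if s.eviction_stuck:
        score += STUCK_BONUS
    return score


def split_pool(pool_bytes, stats):
    """Compute target shares of pool_bytes in proportion to the scores."""
    if not stats:
        return {}
    scores = {s.name: pressure_score(s) for s in stats}
    total = sum(scores.values())
    if total == 0:
        # No pressure anywhere: fall back to an even split.
        share = pool_bytes // len(stats)
        return {s.name: share for s in stats}
    return {name: pool_bytes * sc // total for name, sc in scores.items()}
```

With this scoring, an insert-only participant that reads nothing still registers pressure through its writes and waiting threads. The sketch only computes target shares; it ignores the cost of reclaiming space from a participant, so a real balancer would presumably move toward these targets gradually.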

          xgen-internal-githook Githook User added a comment -

          Author: Alex Gorrod (agorrod) <alexg@wiredtiger.com>

          Message: Change how the shared cache assigns priority to participants.

          The old code implicitly assumed at least 10 members. The new
          code uses a pressure based on a percentage of the most active
          member.

          refs WT-1966
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/c49e4cab5ef048776cd5d7a1c4413c9f88fea716
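
The idea in the commit message, expressing each participant's pressure as a percentage of the most active member, can be sketched as follows. This is an illustration under assumptions, not the code from the commit: the function name `relative_pressure` and the `activity` mapping are hypothetical.

```python
def relative_pressure(activity):
    """Map each participant's raw activity to 0-100, relative to the busiest.

    `activity` is a hypothetical dict of participant name -> raw activity
    count (for example, pages read since the last balance pass).
    """
    busiest = max(activity.values(), default=0)
    if busiest == 0:
        # Nobody is active, so nobody is under pressure.
        return {name: 0 for name in activity}
    return {name: 100 * value // busiest for name, value in activity.items()}
```

Because the measure is relative to the busiest member rather than to a fixed group size, it behaves the same for one participant as for many: a lone active participant always scores 100.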

          xgen-internal-githook Githook User added a comment -

          Author: Alex Gorrod (agorrod) <alexg@wiredtiger.com>

          Message: WT-1966 Allocate aggressively from shared cache if the pool is under utilized.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/1c195414ceddd0a7031754c8e7324f44731e0953
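
One way to read "allocate aggressively if the pool is under utilized" is to hand out larger grants while most of the pool is still unassigned. The sketch below assumes that interpretation; `next_grant`, `low_water_pct`, and `boost` are hypothetical names and values, not taken from the commit.

```python
def next_grant(pool_size, pool_in_use, chunk, low_water_pct=50, boost=4):
    """Return how many bytes to grant a participant asking for more cache.

    While less than low_water_pct percent of the pool has been handed out,
    grant `boost` chunks at a time instead of one. Never exceed what is free.
    """
    free = pool_size - pool_in_use
    if free <= 0:
        return 0
    grant = chunk
    if pool_in_use * 100 < pool_size * low_water_pct:
        grant = chunk * boost  # pool is under-utilized: be aggressive
    return min(grant, free)
```

For the case in this ticket (a 20GB pool with only 20MB handed out), a policy like this would let the single participant grow several chunks per pass until the pool is about half distributed, then fall back to single-chunk growth.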

          xgen-internal-githook Githook User added a comment -

          Author: Alex Gorrod (agorrod) <alexg@wiredtiger.com>

          Message: WT-1966 Shared cache use weighted pressure.

          I calculated it before, but wasn't using the value properly.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/fc7138a63993c076b17c8ff947b09c7cc8aded2c

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

          Message: Merge pull request #2023 from wiredtiger/shared-cache-proportion

          WT-1966 Change how the shared cache assigns priority to participants.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/60e2150920694e79e2e7b6f1b215de7d6415c286

