  1. Core Server
  2. SERVER-56185

Investigate possible improvements with session migration and a chunk migration's critical section

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.2.0, 4.4.17, 5.0.11
    • Affects Version/s: 5.0.0, 4.2.13, 4.4.5, 4.0.24
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Backport Requested: v5.3, v5.0, v4.4, v4.2
    • Sprint: Sharding EMEA 2021-09-20, Sharding EMEA 2021-10-04, Sharding EMEA 2021-10-18, Sharding EMEA 2021-11-01

      In the current MongoDB code, a chunk migration enters the critical section regardless of whether the session migration has completed. This makes sense from a correctness standpoint: writes must be blocked before we can ensure that all sessions associated with those writes have been migrated over. The open question is whether the decision to enter the critical section can be coordinated with specific checkpoints in the session migration. Without such checkpointing, a migration can still spend a long time completing the session migration while writes are blocked.

      For example, the migration could wait until every entry that existed in the on-disk session catalog when the migration started has been copied over, and only then enter the critical section.
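
The checkpoint in this example can be pictured roughly as follows. This is a minimal sketch and not the server's implementation: the class `SessionMigrationCheckpoint` and its methods `mark_copied` and `ready_for_critical_section` are hypothetical names, and session ids are plain strings for illustration.

```python
class SessionMigrationCheckpoint:
    """Tracks whether the sessions present at migration start have been copied."""

    def __init__(self, initial_session_ids):
        # Snapshot of the session catalog taken when the migration starts.
        self._pending = set(initial_session_ids)

    def mark_copied(self, session_id):
        # Sessions created after the snapshot are not in the set;
        # discard() silently ignores them.
        self._pending.discard(session_id)

    def ready_for_critical_section(self):
        # True only once every snapshotted session has been copied.
        return not self._pending


checkpoint = SessionMigrationCheckpoint(["s1", "s2", "s3"])
checkpoint.mark_copied("s1")
checkpoint.mark_copied("s4")  # created after the snapshot, does not affect the gate
ready_before = checkpoint.ready_for_critical_section()
checkpoint.mark_copied("s2")
checkpoint.mark_copied("s3")
ready_after = checkpoint.ready_for_critical_section()
```

Sessions that appear after the snapshot are deliberately left out of the gate, so the wait is bounded by the size of the catalog at migration start; they would still have to be migrated inside the critical section.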

      The cost of waiting on session migration checkpointing before entering the critical section is that it would allow more writes to come in, causing transferMods to take longer as well.
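
To make the trade-off concrete, here is a toy model under strong simplifying assumptions: constant copy and write rates, and no new sessions created during the wait. The function `checkpoint_tradeoff` and all numbers are hypothetical and are not measurements from any real cluster.

```python
def checkpoint_tradeoff(wait_s, session_backlog, session_copy_rate, write_rate):
    """Return (seconds writes stay blocked on session copying, extra writes accepted)."""
    # Sessions still uncopied once the wait ends; the backlog cannot go negative.
    remaining = max(0, session_backlog - session_copy_rate * wait_s)
    # Remaining sessions are copied inside the critical section, with writes blocked.
    blocked_s = remaining / session_copy_rate
    # Writes accepted during the wait are extra work for transferMods.
    extra_writes = write_rate * wait_s
    return blocked_s, extra_writes


# Hypothetical workload: 1000 backlog sessions, 100 sessions/s copied, 50 writes/s.
no_wait = checkpoint_tradeoff(0, 1000, 100, 50)
partial_wait = checkpoint_tradeoff(4, 1000, 100, 50)
full_wait = checkpoint_tradeoff(10, 1000, 100, 50)
```

In this model, a longer wait shortens the time writes are blocked but increases the number of writes that transferMods must carry over, which is the tension described above.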

      This ticket is to investigate a possible middle ground: checkpointing the session migration without unnecessarily prolonging the migration as a whole. Other solutions to this problem are also welcome.

            Assignee: Allison Easton (allison.easton@mongodb.com)
            Reporter: Blake Oler (blake.oler@mongodb.com)
            Votes: 0
            Watchers: 20
