  1. Core Server
  2. SERVER-38284

Remove donor collection X-lock acquisition for starting the clone phase

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.1.9
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Sharding 2018-12-17, Sharding 2018-12-31, Sharding 2019-01-14, Sharding 2019-01-28, Sharding 2019-02-11, Sharding 2019-02-25

      The collection X-lock acquisition when entering the migration clone phase is a necessary synchronization which serves two purposes:

      1. Removes the need for a mutex to protect reads and writes of the MSM* (MigrationSourceManager*) decoration of the CollectionShardingRuntime, where a nullptr value means that writes are not tracked and a non-nullptr value means that the current migration is tracking writes.
      2. Ensures that the chunk migration starts tracking writes to the chunk only after all documents that the clone phase will see have been journaled.

      Synchronization (1) should be implemented by introducing a lock manager ResourceMutex object on the MigrationSourceManager decoration and adding a MigrationSourceManager::getCloner method. This method returns a scoped object which holds the mutex in MODE_IX and has bool and MigrationChunkClonerSource* conversion operators, which yield false/nullptr if there is no active migration, or the active cloner otherwise. That way, all write code paths will acquire this mutex in mode IX, whereas migration start will acquire it in mode X when it installs the clone driver.
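
The scoped accessor described above could look roughly like the following. This is a minimal sketch, not the actual server code: MsmSketch, ClonerSketch, ScopedCloner and installCloner are hypothetical names, and std::shared_mutex (C++17) stands in for the lock manager's ResourceMutex, with a shared lock approximating MODE_IX and an exclusive lock approximating MODE_X.

```cpp
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for MigrationChunkClonerSource.
struct ClonerSketch {
    int trackedWrites = 0;
    void onInsert() { ++trackedWrites; }
};

// Hypothetical stand-in for the MigrationSourceManager decoration.
class MsmSketch {
public:
    // Scoped accessor: holds the mutex in shared mode (approximating MODE_IX)
    // for as long as the object is alive, so the cloner cannot be swapped
    // out from under a writer that is using it.
    class ScopedCloner {
    public:
        explicit ScopedCloner(MsmSketch& msm)
            : _lock(msm._mutex), _cloner(msm._cloner) {}

        // False when no migration is tracking writes.
        explicit operator bool() const { return _cloner != nullptr; }
        // nullptr when no migration is tracking writes.
        operator ClonerSketch*() const { return _cloner; }
        ClonerSketch* operator->() const { return _cloner; }

    private:
        std::shared_lock<std::shared_mutex> _lock;
        ClonerSketch* _cloner;
    };

    // Write code paths call this; concurrent callers do not block each other.
    ScopedCloner getCloner() { return ScopedCloner(*this); }

    // Migration start: exclusive mode (approximating MODE_X) waits for all
    // in-flight ScopedCloner holders to finish before installing the cloner.
    void installCloner(ClonerSketch* cloner) {
        std::unique_lock<std::shared_mutex> lk(_mutex);
        _cloner = cloner;
    }

private:
    std::shared_mutex _mutex;
    ClonerSketch* _cloner = nullptr;
};
```

A write path would then do something like `if (auto cloner = msm.getCloner()) cloner->onInsert();`, holding the shared lock only for the duration of the `if` statement.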

      Synchronization (2) can be implemented by waiting for the last written timestamp to become journaled (or even majority committed) before starting to clone the chunk. Because of this, the collection X-lock acquisition can be replaced with a call to the replication coordinator’s waitUntilOpTimeForRead after write tracking for the chunk has been activated. That way it is guaranteed that every change to the chunk will be captured either in the cloned snapshot or in xferMods.
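
The ordering argument can be illustrated with a small single-threaded model. This is only a sketch under simplifying assumptions: DonorSketch and startClone are hypothetical names, writes are modelled as document ids in optime order, and the "wait" step forces a journal flush instead of blocking the way a real call to waitUntilOpTimeForRead would.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of the donor shard's state for one chunk.
struct DonorSketch {
    std::vector<int> applied;       // document ids, in optime order
    std::size_t journaledUpTo = 0;  // how many applied writes are journaled
    bool tracking = false;          // is the migration tracking writes?
    std::vector<int> xferMods;      // writes captured by the tracker

    void write(int id) {
        applied.push_back(id);
        if (tracking) xferMods.push_back(id);
    }
    void journalFlush() { journaledUpTo = applied.size(); }
    std::vector<int> journaledSnapshot() const {
        return std::vector<int>(applied.begin(), applied.begin() + journaledUpTo);
    }
};

// Returns the snapshot that the clone phase would read.
std::vector<int> startClone(DonorSketch& donor) {
    // 1. Activate write tracking first, so later writes land in xferMods.
    donor.tracking = true;
    // 2. Note the last written optime at the moment tracking became active.
    const std::size_t lastWritten = donor.applied.size();
    // 3. Stand-in for waiting until that optime is journaled.
    while (donor.journaledUpTo < lastWritten) donor.journalFlush();
    // 4. Only now read the snapshot: it contains every pre-tracking write.
    return donor.journaledSnapshot();
}
```

In this model, any write issued before step 1 ends up in the snapshot and any write issued after it ends up in xferMods, so none is missed.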

      xferMods for committed changes only
      Since we are removing a collection X-lock acquisition, which creates a barrier after which all active transactions on the collection have committed, we need to ensure that the migration chunk cloner source doesn't miss writes that started before the migration (and would never have called the LogOpForShardingHandler of the migration manager). This will be achieved by ensuring that shardObserveInsertOp is only called for committed writes and that, on transaction commit, we call it for each document written to the migrated collection.
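
One way to picture the "committed writes only" rule is a transaction that buffers its writes and notifies the observer at commit time. This is a hedged sketch rather than the server's implementation: TxnSketch and WriteSketch are hypothetical names, and the callback stands in for the call to shardObserveInsertOp.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one insert made inside a transaction.
struct WriteSketch {
    std::string ns;  // namespace of the collection written to
    int docId;
};

// Hypothetical transaction that defers observer notification until commit.
class TxnSketch {
public:
    explicit TxnSketch(std::function<void(const WriteSketch&)> onCommitted)
        : _onCommitted(std::move(onCommitted)) {}

    // Uncommitted writes are only buffered; the observer is not called.
    void insert(std::string ns, int docId) {
        _pending.push_back({std::move(ns), docId});
    }

    // On commit, notify the observer once per document written to the
    // migrated collection, even if the transaction began before the migration.
    void commit(const std::string& migratedNs) {
        for (const auto& write : _pending) {
            if (write.ns == migratedNs) _onCommitted(write);
        }
        _pending.clear();
    }

    // Aborted writes never reach the observer.
    void abort() { _pending.clear(); }

private:
    std::function<void(const WriteSketch&)> _onCommitted;
    std::vector<WriteSketch> _pending;
};
```

Because notification happens at commit, a transaction that was already open when write tracking was activated still reports its documents to the cloner, which is the property the removed X-lock barrier used to provide.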

            Assignee: Blake Oler (blake.oler@mongodb.com)
            Reporter: Kaloian Manassiev (kaloian.manassiev@mongodb.com)
