  1. Core Server
  2. SERVER-53653

[Resharding] Take the critical section when renaming on recipient shards


    Details

    • Backwards Compatibility:
      Fully Compatible
    • Sprint:
      Sharding 2021-03-22, Sharding 2021-04-05, Sharding EMEA 2021-05-03, Sharding EMEA 2021-05-17
    • Linked BF Score:
      141
    • Story Points:
      2

      Description

      Background

      After a coordinator transitions to kRenaming, there is a potential gap between the following two events:

      1. The recipient shard updates its routing info, and
      2. The recipient shard renames the collection locally.

      During this gap, a router can attempt to read from the collection before it actually exists at the storage level. The read can then fail with a NamespaceNotFound error, which isn't considered retryable.

      Solution

      To prevent this, each recipient shard needs to hold the CSR's critical section from before the point at which the refresh completes until the rename itself has completed.

      To do this, create a resharding-specific RAII type that can be given a new opCtx for entering and exiting the critical section. The destructor of this RAII type must leave the critical section, so that if the resharding operation errors out, the shard isn't permanently stuck in the critical section.
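
      The RAII idea described above could be sketched roughly as follows. This is a standalone illustration, not the server's actual code: FakeCollectionShardingRuntime, FakeOpCtx, OpCtxFactory, ScopedRecipientCriticalSection, and refreshThenRename are all hypothetical names invented for this example, and the real CSR and opCtx APIs will differ.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the collection sharding runtime (CSR) state.
struct FakeCollectionShardingRuntime {
    bool inCriticalSection = false;
    void enterCriticalSection() { inCriticalSection = true; }
    void exitCriticalSection() { inCriticalSection = false; }
};

// Hypothetical stand-in for an operation context.
struct FakeOpCtx {
    std::string description;
};

// Assumption: the guard is handed a factory so that it can obtain a fresh
// opCtx both when entering and when exiting the critical section.
using OpCtxFactory = std::function<FakeOpCtx()>;

class ScopedRecipientCriticalSection {
public:
    ScopedRecipientCriticalSection(FakeCollectionShardingRuntime& csr, OpCtxFactory makeOpCtx)
        : _csr(csr), _makeOpCtx(std::move(makeOpCtx)) {
        FakeOpCtx opCtx = _makeOpCtx();  // fresh context for entering
        (void)opCtx;
        _csr.enterCriticalSection();
    }

    ~ScopedRecipientCriticalSection() {
        // Destructors must not throw, so any failure here is swallowed.
        try {
            FakeOpCtx opCtx = _makeOpCtx();  // fresh context for exiting
            (void)opCtx;
            _csr.exitCriticalSection();
        } catch (...) {
        }
    }

    ScopedRecipientCriticalSection(const ScopedRecipientCriticalSection&) = delete;
    ScopedRecipientCriticalSection& operator=(const ScopedRecipientCriticalSection&) = delete;

private:
    FakeCollectionShardingRuntime& _csr;
    OpCtxFactory _makeOpCtx;
};

// Holds the critical section across the (simulated) refresh and rename.
// Returns true if the rename succeeded, false if it failed.
bool refreshThenRename(FakeCollectionShardingRuntime& csr, bool renameFails) {
    try {
        ScopedRecipientCriticalSection guard(csr, [] { return FakeOpCtx{"resharding"}; });
        assert(csr.inCriticalSection);
        // 1. The routing info refresh would complete here.
        // 2. The local rename would run here.
        if (renameFails) {
            throw std::runtime_error("simulated rename failure");
        }
        return true;
    } catch (const std::runtime_error&) {
        // The guard's destructor has already left the critical section.
        return false;
    }
}
```

      In this sketch, calling refreshThenRename(csr, true) simulates a failed rename; because the guard is destroyed during stack unwinding, csr.inCriticalSection is false afterwards, just as it is on the success path.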

              People

              Assignee:
              sergi.mateo-bellido Sergi Mateo Bellido
              Reporter:
              blake.oler Blake Oler
