  1. Core Server
  2. SERVER-66031

Command(s) hang when specifying collectionUUID for unsharded collection on sharded cluster

Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • None
    • 6.0.0-rc6, 6.1.0-rc0
    • None
    • None
    • Fully Compatible
    • ALL
    • v6.0
    • Steps to Reproduce:
      1. Create a sharded cluster with >= 2 shards.
      2. Create an unsharded collection on the primary shard.
      3. Get the unsharded collection's UUID.
      4. Issue a Rename command to rename the collection to something else while specifying the collectionUUID parameter with the UUID from step 3.
      5. The command will hang. A server log line like this can be observed about once every second on a participant shard:

      {"t":{"$date":"2022-04-27T16:15:22.026-04:00"},"s":"E", "c":"SHARDING", "id":6372200, "ctx":"RenameCollectionParticipantService-5","msg":"Error executing rename collection participant. Going to be retried.","attr":{"fromNs":"myDB.B","toNs":"myDB.A","error":"CollectionUUIDMismatch{ db: \"myDB\", collectionUUID: UUID(\"08d3be0f-f7e4-402e-b6ac-c6998299df57\"), expectedCollection: \"B\", actualCollection: null }: Collection UUID does not match that specified"}}
      

      The `actualCollection: null` in the error above confirms that the collection doesn't exist on the participant shard.

    • Sharding EMEA 2022-05-16, Sharding EMEA 2022-05-30
    • 197
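
The check described in the reproduction steps (spotting `actualCollection: null` in the participant's log) can be sketched in Python. This is only an illustration: the helper name `is_missing_collection_retry` is hypothetical, and the log line is the one quoted in the steps, with the stray backslashes before `{` removed so that it parses as JSON.

```python
import json

# Log line copied from the reproduction steps. A raw string keeps the \" escapes
# intact so that json.loads can decode them.
LOG_LINE = r'''{"t":{"$date":"2022-04-27T16:15:22.026-04:00"},"s":"E", "c":"SHARDING", "id":6372200, "ctx":"RenameCollectionParticipantService-5","msg":"Error executing rename collection participant. Going to be retried.","attr":{"fromNs":"myDB.B","toNs":"myDB.A","error":"CollectionUUIDMismatch{ db: \"myDB\", collectionUUID: UUID(\"08d3be0f-f7e4-402e-b6ac-c6998299df57\"), expectedCollection: \"B\", actualCollection: null }: Collection UUID does not match that specified"}}'''


def is_missing_collection_retry(line: str) -> bool:
    """Hypothetical helper: True if the line is the retry message (id 6372200)
    and its error reports that no collection with the given UUID exists."""
    entry = json.loads(line)
    error = entry.get("attr", {}).get("error", "")
    return entry.get("id") == 6372200 and "actualCollection: null" in error


print(is_missing_collection_retry(LOG_LINE))
```

Matching on the log `id` rather than the message text is a design choice for this sketch; the id value is taken from the log line above.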

    Description

      A Rename hangs when the collectionUUID parameter is specified for an unsharded collection on a sharded cluster with participant shard(s). Other commands with logic similar to Rename may also be affected (e.g. possibly Drop and ShardCollection), but the hang was only attempted and observed with Rename.

      From a conversation with max.hirschhorn@mongodb.com, the problem appears to lie in Rename's policy of broadcasting the _shardsvrRenameCollectionParticipant command, with the expectedSourceUUID and expectedTargetUUID, to all shards, even those that don't own any data for the collection. When a Rename is issued, the unsharded collection's UUID is found on the coordinator shard in the kCheckPreconditions phase, which allows the coordinator to continue to the kFreezeMigrations and kBlockCrudAndRename phases. The coordinator shard then sends the Rename to the participant(s) as-is, including the collectionUUID parameter. This always results in a CollectionUUIDMismatch error, since the collection is unsharded and therefore doesn't exist on the participant. The coordinator shard must retry _shardsvrRenameCollectionParticipant until it succeeds on all participants, to avoid the Rename succeeding on only some shards. But the Rename can never succeed when the collection doesn't exist on the participant and the collectionUUID parameter has been specified.
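
The retry behaviour described above can be illustrated with a small toy model in Python. This is a sketch under stated assumptions, not the server's implementation: `Shard`, `rename_participant`, `coordinate_rename` and `make_cluster` are hypothetical names, the precondition and freeze phases are omitted, and retries are capped at `max_attempts` so the example terminates, whereas the coordinator described in this ticket retries until every participant succeeds.

```python
import uuid
from dataclasses import dataclass, field


class CollectionUUIDMismatch(Exception):
    """Stand-in for the server error of the same name."""


@dataclass
class Shard:
    name: str
    collections: dict = field(default_factory=dict)  # collection name -> UUID

    def rename_participant(self, from_coll, to_coll, expected_uuid):
        # Look up which local collection (if any) carries the expected UUID.
        actual = next(
            (n for n, u in self.collections.items() if u == expected_uuid), None
        )
        if expected_uuid is not None and actual != from_coll:
            raise CollectionUUIDMismatch(
                f"expectedCollection: {from_coll}, actualCollection: {actual}"
            )
        # A shard that owns no data for the collection has nothing to rename.
        if from_coll in self.collections:
            self.collections[to_coll] = self.collections.pop(from_coll)


def coordinate_rename(shards, from_coll, to_coll, expected_uuid, max_attempts=5):
    """Broadcast the rename to every shard, retrying failed shards.

    Returns the number of attempts used, or None if some shard was still
    failing after max_attempts (the toy equivalent of the hang).
    """
    pending = list(shards)
    for attempt in range(1, max_attempts + 1):
        still_failing = []
        for shard in pending:
            try:
                shard.rename_participant(from_coll, to_coll, expected_uuid)
            except CollectionUUIDMismatch:
                still_failing.append(shard)  # "Going to be retried."
        pending = still_failing
        if not pending:
            return attempt
    return None


COLL_UUID = uuid.uuid4()


def make_cluster():
    # shard0 is the primary shard owning unsharded collection "B"; shard1 owns nothing.
    return [Shard("shard0", {"B": COLL_UUID}), Shard("shard1")]


print(coordinate_rename(make_cluster(), "B", "A", expected_uuid=COLL_UUID))
print(coordinate_rename(make_cluster(), "B", "A", expected_uuid=None))
```

In this model the first call never completes within the attempt budget, because `shard1` raises the mismatch on every retry, while the second call (no UUID forwarded) completes on the first attempt.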

            People

              pierlauro.sciarelli@mongodb.com Pierlauro Sciarelli
              evgeni.dobranov@mongodb.com Evgeni Dobranov