  1. Core Server
  2. SERVER-67457

Resharding operation aborted in the midst of contacting participants may stall on config server primary indefinitely

    • Fully Compatible
    • ALL
    • v6.0, v5.0
    • Sharding 2022-06-27, Sharding 2022-07-11
    • 5
    • 3

      After the resharding coordinator has transitioned into the "preparing-to-donate" state, it is required to establish the DonorStateMachines and RecipientStateMachines on the participant shards before proceeding with the remainder of the resharding operation. This synchronization ensures every participant shard is aware of the resharding operation and will react accordingly to a subsequent _flushReshardingStateChange, _shardsvrCommitReshardCollection, or _shardsvrAbortReshardCollection command. The logic prior to _tellAllParticipantsReshardingStarted() is flawed because it is possible for the resharding coordinator to:

      1. Atomically write the config.collections and config.chunks entries for the temporary resharding collection, advance the state in the config.collections entry to be "preparing-to-donate" for the collection being resharded, and advance the state in config.reshardingOperations to be "preparing-to-donate".
      2. Receive an abortReshardCollection command either explicitly from the user or implicitly via a setFeatureCompatibilityVersion command.
      3. Skip waiting for the replica set transaction in step (1) to become majority-committed.
      4. Broadcast the _flushRoutingTableCacheUpdatesWithWriteConcern command to all participant shards using a $configTime prior to the replica set transaction from step (1) because the Timestamp from step (1) hasn't become majority-committed yet.
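
The four steps above can be sketched as a toy model of timestamped config writes. This is an illustration only, not the server's implementation: ToyConfigServer, write_txn, read_at, and the TEMP_NS value are hypothetical names, and the integer timestamps stand in for real cluster Timestamps.

```python
from dataclasses import dataclass, field

TEMP_NS = "reshardingDb.system.resharding.example"  # hypothetical temp namespace
SOURCE_NS = "reshardingDb.testColl"


@dataclass
class ToyConfigServer:
    """Toy model of timestamped config writes; not the real server."""
    history: list = field(default_factory=list)  # [(timestamp, {namespace: state})]
    clock: int = 0
    majority_committed_ts: int = 0

    def write_txn(self, entries):
        """Apply all entries atomically at one new timestamp."""
        self.clock += 1
        self.history.append((self.clock, dict(entries)))
        return self.clock

    def read_at(self, namespace, ts):
        """Return the newest state of namespace visible at ts, else None."""
        state = None
        for write_ts, entries in self.history:
            if write_ts <= ts and namespace in entries:
                state = entries[namespace]
        return state


config = ToyConfigServer()
config.majority_committed_ts = config.write_txn({SOURCE_NS: "sharded"})

# Step 1: one transaction creates the temporary collection entry and
# advances the source collection to "preparing-to-donate".
txn_ts = config.write_txn({
    SOURCE_NS: "preparing-to-donate",
    TEMP_NS: "sharded",
})

# Steps 2-3: an abort arrives, so the coordinator does not wait for txn_ts
# to become majority-committed (majority_committed_ts is left unchanged).

# Step 4: the broadcast carries the stale majority-committed timestamp.
config_time = config.majority_committed_ts

seen_by_shard = {ns: config.read_at(ns, config_time) for ns in (SOURCE_NS, TEMP_NS)}
print(seen_by_shard)
```

In this model a shard reading at config_time sees no entry for the temporary namespace, while a read at txn_ts would see it.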

      Shards receive the _flushRoutingTableCacheUpdatesWithWriteConcern command and observe the config.collections entries for the source collection and temporary resharding collection as they were prior to the replica set transaction from step (1). In particular, the recipient shards would not observe the config.collections entry for the temporary resharding collection at all and would treat the namespace as unsharded. The shards therefore skip constructing the DonorStateMachine and RecipientStateMachine objects but respond ok:1 to the resharding coordinator as if they had.
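
A minimal sketch of why the ok:1 reply is uninformative here. handle_refresh is a hypothetical stand-in for the shard-side handling of the refresh command, and it simplifies by treating one shard as both donor and recipient; the real code paths differ.

```python
def handle_refresh(observed_source_state, temp_entry_visible):
    """Hypothetical shard-side handler for the broadcast refresh.

    Returns the reply sent to the coordinator and the state machines built.
    """
    machines = []
    if temp_entry_visible and observed_source_state == "preparing-to-donate":
        machines = ["DonorStateMachine", "RecipientStateMachine"]
    # The reply does not depend on whether any state machine was built.
    return {"ok": 1}, machines


# A shard that refreshed at or after the transaction's timestamp.
fresh_reply, fresh_machines = handle_refresh("preparing-to-donate", True)

# A shard that refreshed at a $configTime before the transaction.
stale_reply, stale_machines = handle_refresh("sharded", False)
```

Both calls return the same reply, so under these assumptions the coordinator cannot tell the two outcomes apart from the response alone.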

      The resharding coordinator continues to wait for the participant shards to update their state within the config.reshardingOperations document to "done" and signal they've finished their cleanup for the resharding operation. However, because the participant shards never constructed the DonorStateMachine and RecipientStateMachine objects, they'll also never perform that update on the config.reshardingOperations document. This leads the resharding coordinator to wait indefinitely on this future.
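
The stall can be pictured with a future that nothing ever completes. This sketch uses Python's asyncio purely as an analogy for the coordinator's future; coordinator_wait and the 0.05 second timeout are hypothetical. The timeout exists only so the demonstration terminates; the description above implies the real wait has no such bound.

```python
import asyncio


async def coordinator_wait(participants_done, timeout):
    """Wait for participants to report "done"; report a stall on timeout."""
    try:
        await asyncio.wait_for(participants_done, timeout)
        return "cleaned-up"
    except asyncio.TimeoutError:
        return "stalled"


async def main():
    loop = asyncio.get_running_loop()

    # No state machine exists on any shard, so nothing resolves this future.
    never_done = loop.create_future()
    stalled = await coordinator_wait(never_done, 0.05)

    # Contrast: a participant that did run its state machine reports "done".
    done = loop.create_future()
    done.set_result(None)
    finished = await coordinator_wait(done, 0.05)

    return stalled, finished


result = asyncio.run(main())
print(result)
```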

      Manual intervention on the config.reshardingOperations document would be required to unblock the resharding coordinator. The source collection will be unable to perform other sharding DDL commands in the meantime.

      12118:[js_test:setfcv_reshard_collection] d20276| {"t":{"$date":"2022-06-14T01:50:03.173+00:00"},"s":"I",  "c":"SHARDING", "id":21985,   "ctx":"RecoverRefreshThread","msg":"Updating metadata for this namespace because the remote metadata has a newer collection version","attr":{"namespace":"reshardingDb.testColl","activeMetadata":"collection version: 1|5||62a7e944c481ca50994b1490||Timestamp(1655171396, 41), shard version: 1|0||62a7e944c481ca50994b1490||Timestamp(1655171396, 41)","remoteMetadata":"collection version: 1|7||62a7e944c481ca50994b1490||Timestamp(1655171396, 41), shard version: 1|0||62a7e944c481ca50994b1490||Timestamp(1655171396, 41)"}}
      12121:[js_test:setfcv_reshard_collection] d20278| {"t":{"$date":"2022-06-14T01:50:03.175+00:00"},"s":"I",  "c":"SHARDING", "id":21985,   "ctx":"RecoverRefreshThread","msg":"Updating metadata for this namespace because the remote metadata has a newer collection version","attr":{"namespace":"reshardingDb.testColl","activeMetadata":"collection version: 1|5||62a7e944c481ca50994b1490||Timestamp(1655171396, 41), shard version: 1|5||62a7e944c481ca50994b1490||Timestamp(1655171396, 41)","remoteMetadata":"collection version: 1|7||62a7e944c481ca50994b1490||Timestamp(1655171396, 41), shard version: 1|7||62a7e944c481ca50994b1490||Timestamp(1655171396, 41)"}}
      12138:[js_test:setfcv_reshard_collection] d20278| {"t":{"$date":"2022-06-14T01:50:03.188+00:00"},"s":"I",  "c":"SHARDING", "id":21917,   "ctx":"RecoverRefreshThread","msg":"Marking collection as unsharded","attr":{"namespace":"reshardingDb.system.resharding.face3185-7dbc-4460-adc9-c6c9a1603801"}}
      12139:[js_test:setfcv_reshard_collection] d20276| {"t":{"$date":"2022-06-14T01:50:03.189+00:00"},"s":"I",  "c":"SHARDING", "id":21917,   "ctx":"RecoverRefreshThread","msg":"Marking collection as unsharded","attr":{"namespace":"reshardingDb.system.resharding.face3185-7dbc-4460-adc9-c6c9a1603801"}}
      

            Assignee:
            abdul.qadeer@mongodb.com Abdul Qadeer
            Reporter:
            max.hirschhorn@mongodb.com Max Hirschhorn