Core Server / SERVER-65059

Election during split may result in inability to observe recipient split acceptance


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 6.1.0-rc0
    • None
    • None
    • None
    • Fully Compatible
    • ALL
    • Server Serverless 2022-04-18

    Description

      In the following scenario:

      1. ShardSplitDonorService commits the split, but the primary steps down before the split is marked for garbage collection.
      2. A node steps up with local state `kCommitted`, but there are no longer any recipient nodes in the set.

      The Replica Set Monitor fails to monitor the recipient nodes because they were removed from the config, so the recipient connection string can no longer be built. This results in connection failures such as:

      [js_test:shard_split_startup_recovery] d20021| {"t":{"$date":"2022-03-29T21:50:48.952+00:00"},"s":"I",  "c":"-",        "id":4333222, "ctx":"ShardSplitDonorService-3","msg":"RSM received error response","attr":{"host":"ip-10-122-14-181:20023","error":"HostUnreachable: Connection refused","replicaSet":"","response":{}}}

      Although this was observed in the `kCommitted` case, it can also happen while the donor is waiting for the recipient to accept the split. If an election occurs at that point and a donor secondary steps up to continue the split operation, it will not observe the split acceptance because the recipient connection string cannot be built.
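
The failure mode described above can be illustrated with a minimal sketch. This is not the server's implementation: the `Member` type, the `build_recipient_connection_string` helper, the `recipientTag` tag name, the `recipientSet` set name, and the host names are all hypothetical, and the sketch assumes the connection string is derived by filtering the current config for tagged recipient members.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical stand-in for one replica set config member."""
    host: str
    tags: dict = field(default_factory=dict)


def build_recipient_connection_string(set_name, members, recipient_tag):
    """Derive 'setName/host1,host2' from the members carrying recipient_tag."""
    hosts = [m.host for m in members if recipient_tag in m.tags]
    if not hosts:
        # Nothing to monitor: the recipients are no longer in the config.
        raise ValueError("no recipient members left in the config")
    return set_name + "/" + ",".join(hosts)


# Config while the split is in progress: the recipient is still a tagged member.
config_before = [
    Member("donor0:20020"),
    Member("donor1:20021"),
    Member("recipient0:20023", {"recipientTag": "1"}),
]

# Config seen by a node that steps up after the recipients were removed.
config_after = [m for m in config_before if "recipientTag" not in m.tags]

print(build_recipient_connection_string("recipientSet", config_before, "recipientTag"))

try:
    build_recipient_connection_string("recipientSet", config_after, "recipientTag")
except ValueError as err:
    print("after step-up:", err)
```

In this sketch the first call succeeds because the recipient is still present, while the second raises because the newly elected node has only the post-split config to work from.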

          People

            matt.broadstone@mongodb.com Matt Broadstone
            mathis.bessa@mongodb.com Mathis Bessa