Core Server / SERVER-23790

In ShardRegistry's runCommand, make commands that fail with a non-retriable error return immediately

Details

    • Type: Task
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • 3.3.4
    • Sharding
    • None
    • Sharding 13 (04/22/16), Sharding 14 (05/13/16)

    Description

      Currently, if a command fails with a non-retriable error, we check for write concern errors and potentially return a write concern error status that overrides the command error status:

      https://github.com/mongodb/mongo/blob/bdc06761206ac398af04f0a2eb482c4dca49bad8/src/mongo/s/client/shard_registry.cpp#L779-L786

      Untangling how we handle the different types of errors (request failure, command failure, or write concern failure) is non-trivial for a few reasons:

      ---- Issue 1 ----

      If a write concern error is present in the command response, it is converted to a WriteConcernFailed error, regardless of what kind of write concern error it was:

      https://github.com/mongodb/mongo/blob/bdc06761206ac398af04f0a2eb482c4dca49bad8/src/mongo/s/client/shard_registry.cpp#L135-L141

      WriteConcernFailed is part of kAllRetriableErrors, but other types of write concern errors are not:

      https://github.com/mongodb/mongo/blob/bdc06761206ac398af04f0a2eb482c4dca49bad8/src/mongo/s/client/shard_registry.cpp#L153-L170
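
To make Issue 1 concrete, here is a minimal sketch of how collapsing every write concern error into WriteConcernFailed changes the retry decision. It is not the shard_registry.cpp code: the `Code` enum, the `Response` struct, the function names, and the contents of `kRetriableSketch` are hypothetical stand-ins. The only facts taken from this ticket are that the conversion ignores the original code and that WriteConcernFailed is in kAllRetriableErrors. UnknownReplWriteConcern is used purely as an example of a non-WriteConcernFailed write concern error.

```cpp
#include <cassert>
#include <set>

// Illustrative stand-ins; these are not the real mongo::ErrorCodes definitions.
enum class Code { OK, WriteConcernFailed, UnknownReplWriteConcern, NamespaceNotFound };

// Hypothetical, simplified shape of a parsed command response.
struct Response {
    Code commandCode;            // status of the command itself
    bool hasWriteConcernError;   // whether a write concern error was reported
    Code writeConcernErrorCode;  // code carried by that write concern error
};

// Assumed subset of the retriable set. The ticket only states that
// WriteConcernFailed is a member and other write concern errors are not.
const std::set<Code> kRetriableSketch = {Code::WriteConcernFailed};

bool isRetriable(Code c) {
    return kRetriableSketch.count(c) > 0;
}

// Behavior described in Issue 1: any write concern error becomes
// WriteConcernFailed, whatever its original code was.
Code writeConcernStatusAsDescribed(const Response& r) {
    return r.hasWriteConcernError ? Code::WriteConcernFailed : Code::OK;
}

// Alternative that keeps the original code, so the retriable check can
// distinguish WriteConcernFailed from other write concern errors.
Code writeConcernStatusPreserving(const Response& r) {
    return r.hasWriteConcernError ? r.writeConcernErrorCode : Code::OK;
}
```

With the collapsing version, a response carrying UnknownReplWriteConcern is classified as retriable. With the preserving version it is not, which is the distinction the retriable set appears to intend.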

      ---- Issue 2 ----

      The jstests that check for write concern error behavior, such as

      commands_that_write_accept_wc_configRS.js

      expect the client to see a command error (ok: 0) even on non-WriteConcernFailed write concern errors, which currently works because:

      1) the non-WriteConcernFailed write concern error is returned and converted into WriteConcernFailed
      2) since WriteConcernFailed is retriable, the command is retried
      3) the retried command now fails with a non-retriable error because the first attempt already succeeded (for example, dropCollection fails with NamespaceNotFound)
      4) since the error is non-retriable, we check the write concern detail
      5) the non-WriteConcernFailed write concern error is returned as an error status
      6) the test succeeds since it sees a command error
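
The six steps above can be traced with a toy model. This is a sketch under stated assumptions, not the actual runCommand logic. `FakeShard`, `Outcome`, and `runCommandAsDescribed` are hypothetical names. The model assumes that every response from the shard carries the same non-WriteConcernFailed write concern error, and that NamespaceNotFound is the only command error that can occur.

```cpp
#include <cassert>

// Illustrative stand-ins; these are not the real mongo::ErrorCodes definitions.
enum class Code { OK, NamespaceNotFound, UnknownReplWriteConcern };

struct Response {
    Code commandCode;
    bool hasWriteConcernError;
    Code writeConcernErrorCode;
};

// Toy shard: the first dropCollection succeeds, later ones find nothing to drop.
// Assumption: when writeConcernUnsatisfiable is set, every response carries
// the same non-WriteConcernFailed write concern error.
struct FakeShard {
    bool collectionExists;
    bool writeConcernUnsatisfiable;

    Response dropCollection() {
        Code cmd = collectionExists ? Code::OK : Code::NamespaceNotFound;
        collectionExists = false;
        return Response{cmd, writeConcernUnsatisfiable, Code::UnknownReplWriteConcern};
    }
};

struct Outcome {
    bool ok;                // what the client sees (ok: 1 or ok: 0)
    bool fromWriteConcern;  // whether the failure came from the write concern detail
    int attempts;
};

Outcome runCommandAsDescribed(FakeShard& shard, int maxAttempts) {
    Outcome out{true, false, 0};
    while (out.attempts < maxAttempts) {
        ++out.attempts;
        Response r = shard.dropCollection();
        if (r.commandCode != Code::OK) {
            // Steps 3-5: the command error is non-retriable, so the write
            // concern detail is checked and, if present, reported instead.
            out.ok = false;
            out.fromWriteConcern = r.hasWriteConcernError;
            return out;
        }
        if (!r.hasWriteConcernError) {
            out.ok = true;
            out.fromWriteConcern = false;
            return out;  // clean success, nothing to retry
        }
        // Steps 1-2: the write concern error is treated as WriteConcernFailed,
        // which is retriable, so the loop runs the command again.
        out.ok = false;
        out.fromWriteConcern = true;
    }
    return out;  // out of retries while still seeing a write concern error
}
```

Starting from an existing collection with an unsatisfiable write concern, the model needs two attempts and reports a failure that originates from the write concern detail, even though the second attempt's own error was NamespaceNotFound. That is the accidental path the jstest relies on.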

      It would be better for non-WriteConcernFailed write concern errors to be returned as part of the write concern detail, and for the test to check for them there instead of checking if the command failed.

      ---- Issue 3 ----

      Higher-level error handling code expects non-retriable errors to be returned within the command response (it calls getStatusFromCommandResult() itself), but expects retriable errors to be returned as an error status.

      Either the higher-level error handling code should be unified to handle retriable and non-retriable errors the same way, or the code in ShardRegistry's runCommand should return the command response on non-retriable errors and an error status on retriable errors (once it is out of retries).
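
The second option in Issue 3 could look roughly like the sketch below. It is only an illustration of the return contract, not a proposed patch. `RunResult`, `runCommandProposed`, and the scripted list of responses are hypothetical, and HostUnreachable is assumed here to stand in for any retriable error. In the real code the caller would run getStatusFromCommandResult() on the returned response; the sketch just exposes the command's code directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative stand-ins; these are not the real mongo::ErrorCodes definitions.
enum class Code { OK, HostUnreachable, NamespaceNotFound };

struct Response {
    Code commandCode;
};

// Assumption for this sketch: HostUnreachable represents every retriable code.
bool isRetriable(Code c) {
    return c == Code::HostUnreachable;
}

struct RunResult {
    bool hasResponse;   // true: the caller inspects `response` itself
    Code status;        // meaningful only when hasResponse is false
    Response response;  // meaningful only when hasResponse is true
};

// `scripted` plays the role of the shard: attempt i receives scripted[i].
RunResult runCommandProposed(const std::vector<Response>& scripted, std::size_t maxAttempts) {
    Code last = Code::HostUnreachable;  // reported if no attempt could be made
    for (std::size_t i = 0; i < maxAttempts && i < scripted.size(); ++i) {
        const Response& r = scripted[i];
        if (!isRetriable(r.commandCode)) {
            // Success or non-retriable error: return immediately and hand the
            // full response to the caller.
            return RunResult{true, Code::OK, r};
        }
        last = r.commandCode;  // retriable: remember it and try again
    }
    // Out of retries: surface the last retriable error as an error status.
    return RunResult{false, last, Response{Code::OK}};
}
```

The design point is that a non-retriable failure never passes through the retry or write concern override path, so the caller always sees the command's own error.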

          People

            spencer@mongodb.com Spencer Brody (Inactive)
            esha.maharishi@mongodb.com Esha Maharishi (Inactive)