SERVER-102111 (Core Server)

Update write path in sharded cluster doesn't promote WCOS with WCE to top level

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL
    • 200

      There are two bugs in the batch write exec code path. In the first, a sub-batch's error is not promoted to a top-level error. In the second, the write concern error (WCE) is not extracted correctly from a shard's write error response for a retryable write without shard key.

      The following steps explain how this happens:

      1. mongos sends the update, which is a write without shard key, to both shards as part of the two-phase write protocol and receives a WouldChangeOwningShard (WCOS) error from the shard that has the document targeted by the query. This error is supposed to be propagated to ClusterWriteCmd::InvocationBase::runImpl, which handles the WouldChangeOwningShard error. However, this does not happen correctly, as the next steps show.

      2. mongos aborts the transaction by sending abortTransaction to both shards; however, one of them responds with a WCE.

      3. The WCE from the abortTransaction response is handled on the path where !responseStatus.isOK, and the response is processed in processErrorResponseFromLocal. This function passes the error as a WriteError to BatchWriteOp::noteBatchError, where an emulated response incorrectly sets the top-level status to OK:1 and the writeConcern error that was received is neither extracted nor parsed. noteBatchResponse has logic to extract the WCE after it is called with an emulated response, but that branch is not taken.

      4. The WCOS error is sent as a write error to the place where the router is expected to catch it and retry, but the router decides it does not need to be handled because the top-level status does not contain the WCOS error.
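
The outcome of steps 3 and 4 can be sketched with a small Python model. This is not the server's C++ code: the names `EmulatedResponse`, `emulate_note_batch_error`, `router_retries_on_wcos`, and `promote_for_illustration` are hypothetical, and the response shape is a simplified assumption. It only shows why a check on the top-level status misses a WCOS error that arrives as a write error, and why the WCE is lost.

```python
from dataclasses import dataclass, field, replace
from typing import Optional

WCOS = "WouldChangeOwningShard"


@dataclass
class EmulatedResponse:
    """Simplified, hypothetical stand-in for the emulated batch response."""
    ok: int = 1
    top_level_error: Optional[str] = None
    write_errors: list = field(default_factory=list)
    write_concern_error: Optional[dict] = None


def emulate_note_batch_error(write_error_code: str, abort_reply: dict) -> EmulatedResponse:
    """Model of step 3: the error is recorded only as a write error.

    The top-level status stays ok: 1, and the writeConcernError carried by
    abort_reply is deliberately ignored to mirror the reported behavior.
    """
    return EmulatedResponse(ok=1, write_errors=[{"index": 0, "code": write_error_code}])


def router_retries_on_wcos(response: EmulatedResponse) -> bool:
    """Model of step 4: the router inspects only the top-level status."""
    return response.top_level_error == WCOS


def promote_for_illustration(response: EmulatedResponse, abort_reply: dict) -> EmulatedResponse:
    """Hypothetical contrast, not a proposed fix: lift WCOS to the top level
    and keep the write concern error from the abortTransaction reply."""
    has_wcos = any(err["code"] == WCOS for err in response.write_errors)
    return replace(
        response,
        ok=0 if has_wcos else response.ok,
        top_level_error=WCOS if has_wcos else response.top_level_error,
        write_concern_error=abort_reply.get("writeConcernError"),
    )


if __name__ == "__main__":
    # Illustrative abortTransaction reply that carries a write concern error.
    abort_reply = {"ok": 1, "writeConcernError": {"errmsg": "illustrative WCE"}}
    buggy = emulate_note_batch_error(WCOS, abort_reply)
    print(router_retries_on_wcos(buggy), buggy.write_concern_error)
    fixed = promote_for_illustration(buggy, abort_reply)
    print(router_retries_on_wcos(fixed), fixed.write_concern_error)
```

In this model the first response never triggers the router's WCOS handling and carries no write concern error, while the promoted variant does both. The actual control flow in BatchWriteOp and ClusterWriteCmd is more involved than this sketch.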

            Assignee: Unassigned
            Reporter: Abdul Qadeer (abdul.qadeer@mongodb.com)
            Votes: 0
            Watchers: 6
