[SERVER-24732] add a runBatchWriteCommand() method to Shard that checks for errors by unwrapping the BatchedCommandResponse Created: 22/Jun/16  Updated: 19/Jul/16  Resolved: 12/Jul/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.3.8
Fix Version/s: 3.3.10

Type: Improvement Priority: Major - P3
Reporter: Esha Maharishi (Inactive) Assignee: Jess Fan
Resolution: Done Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-24828 ShardingCatalogClientImpl::_runBatchW... Closed
is related to SERVER-24379 Implement addShardToZone and _configs... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 17 (07/15/16)
Participants:

 Description   

The current Shard::runCommand() method only checks for the request status, command status, and write concern error.

Since Shard::runCommand simply takes a BSONObj cmdObj, it's dangerously easy to pass in a BatchedCommandRequest in BSONObj form and assume that retries and error checking will be correctly handled.

Status _getEffectiveCommandStatus(StatusWith<Shard::CommandResponse> cmdResponse) {
    // Make sure the command even received a valid response
    if (!cmdResponse.isOK()) {
        return cmdResponse.getStatus();
    }
 
    // If the request reached the shard, check if the command itself failed.
    if (!cmdResponse.getValue().commandStatus.isOK()) {
        return cmdResponse.getValue().commandStatus;
    }
 
    // Finally check if the write concern failed
    if (!cmdResponse.getValue().writeConcernStatus.isOK()) {
        return cmdResponse.getValue().writeConcernStatus;
    }
 
    return Status::OK();
}



 Comments   
Comment by Githook User [ 12/Jul/16 ]

Author:

{u'name': u'Jess Fan', u'email': u'jess.fan@10gen.com'}

Message: SERVER-24732 add Shard::runBatchWriteCommand that checks BatchedCommandResponse
Branch: master
https://github.com/mongodb/mongo/commit/907ed32a3a8bd19f883836013530f645522a75bc

Comment by Spencer Brody (Inactive) [ 06/Jul/16 ]

As part of this, the new runBatchWriteCommand method on Shard should return BatchedCommandResponse, rather then taking a BatchedCommandResponse* output parameter

Comment by Spencer Brody (Inactive) [ 30/Jun/16 ]

We should also make sure that as part of this we start informing the ReplicaSetMonitor of errors in the BatchedCommandReply (a la SERVER-24828)

Comment by Esha Maharishi (Inactive) [ 30/Jun/16 ]

There are a bunch of jstests blacklisted from the sharding_continuous_config_stepdown suite because "[write] commands against the config shard do not support retries yet."

https://github.com/mongodb/mongo/blob/9b00106b56966b334c878f36cca14deb71f6d8c7/buildscripts/resmokeconfig/suites/sharding_continuous_config_stepdown.yml#L12

ShardingCatalogClient::runBatchWriteCommand does not uses Shard's retry logic, but rather takes a retry policy and does its own retries.

https://github.com/mongodb/mongo/blob/a0351d0713a20f50d9084f98b4df0de11cfabe23/src/mongo/s/catalog/replset/sharding_catalog_client_impl.cpp#L1293

It should probably be investigated whether any of those are blacklisted due to assumptions about not retrying writes from mongos.

Comment by Esha Maharishi (Inactive) [ 24/Jun/16 ]

ShardingCatalogManager::_processBatchWriteResponse and ShardingCatalogClient::_processBatchWriteResponse (which duplicated each other) were replaced with Shard::CommandResponse::processBatchWriteResponse in SERVER-24379.

This ticket should add a Shard::runBatchWriteCommand that replaces ShardingCatalogClient::_runBatchWriteCommand and uses the new Shard::CommandResponse::processBatchWriteResponse for its retry logic.

Generated at Thu Feb 08 04:07:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.