Core Server / SERVER-64610

Stale shardVersion error in catalog shard POC

    • Fully Compatible
    • Sharding NYC 2022-04-04, Sharding NYC 2022-04-18, Sharding 2022-05-02, Sharding NYC 2022-05-16
    • 4

      Update: root cause:

      The StaleDbVersion error generated by DatabaseShardingState is supposed to be resolved by an internal retry inside ExecCommandDatabase::_commandExec(): the function handles the error by calling refreshDatabase() and then recursively calling _commandExec().

      This retry logic was gated on the node not being a config server, because a config server was never the primary for any database. In the catalog shard POC that assumption no longer holds, so the internal retry was skipped. The posted fix handles the catalog server differently from a standalone config server.
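
The gate described above can be modeled with a small sketch. This is not the server code (which is C++); it is a simplified Python model under stated assumptions. The names Node, can_retry_internally, the fixed flag, and the attempts_left cap are all hypothetical and only illustrate why a catalog shard missed the internal retry, and what a role-aware gate might look like.

```python
from dataclasses import dataclass


class StaleDbVersion(Exception):
    """Hypothetical stand-in for the server's StaleDbVersion error."""


@dataclass
class Node:
    # Hypothetical model of a node's role; not the server's real data structures.
    is_config_server: bool
    is_catalog_shard: bool = False
    db_version_known: bool = False
    refreshes: int = 0

    def refresh_database(self) -> None:
        # Stand-in for refreshDatabase(): recover the database version.
        self.db_version_known = True
        self.refreshes += 1

    def run_once(self) -> str:
        # The command fails until the database version has been recovered.
        if not self.db_version_known:
            raise StaleDbVersion("database version is not known")
        return "ok"


def can_retry_internally(node: Node, fixed: bool) -> bool:
    if not fixed:
        # Gate as described in the ticket: never retry on a config server.
        return not node.is_config_server
    # Assumed shape of the fix: a config server that also acts as a shard
    # retries like a regular shard; a standalone config server still does not.
    return (not node.is_config_server) or node.is_catalog_shard


def command_exec(node: Node, fixed: bool, attempts_left: int = 3) -> str:
    # Mirrors the described flow: on StaleDbVersion, refresh and recurse.
    # The attempts_left cap is an assumption added to keep the sketch bounded.
    try:
        return node.run_once()
    except StaleDbVersion:
        if attempts_left == 0 or not can_retry_internally(node, fixed):
            raise  # the error escapes to the caller instead of being retried
        node.refresh_database()
        return command_exec(node, fixed, attempts_left - 1)
```

In this model, a regular shard recovers after one refresh under either gate, a catalog shard only recovers when fixed=True, and a standalone config server propagates the error in both cases.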

      Repro:

      buildscripts/resmoke.py run --suite sharded_jscore_txns --numShards=1 --numReplSetNodes=3 --catalogShard=any jstests/core/rename_collection_long_name.js

      Error:

      [js_test:rename_collection_long_name] uncaught exception: Error: listIndexes failed: {
      [js_test:rename_collection_long_name] 	"ok" : 0,
      [js_test:rename_collection_long_name] 	"errmsg" : "got stale shardVersion response from shard shard-rs0 at host localhost:20000 :: caused by :: sharding status of collection test.renameSRC is not currently known and needs to be recovered",
      [js_test:rename_collection_long_name] 	"code" : 13388,
      [js_test:rename_collection_long_name] 	"codeName" : "StaleConfig",
      [js_test:rename_collection_long_name] 	"ns" : "test.renameSRC",
      [js_test:rename_collection_long_name] 	"vReceived" : Timestamp(0, 0),
      [js_test:rename_collection_long_name] 	"vReceivedEpoch" : ObjectId("000000000000000000000000"),
      [js_test:rename_collection_long_name] 	"vReceivedTimestamp" : Timestamp(0, 0),
      [js_test:rename_collection_long_name] 	"shardId" : "shard-rs0",
      [js_test:rename_collection_long_name] 	"$clusterTime" : {
      [js_test:rename_collection_long_name] 		"clusterTime" : Timestamp(1647532487, 45),
      [js_test:rename_collection_long_name] 		"signature" : {
      [js_test:rename_collection_long_name] 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      [js_test:rename_collection_long_name] 			"keyId" : NumberLong(0)
      [js_test:rename_collection_long_name] 		}
      [js_test:rename_collection_long_name] 	},
      [js_test:rename_collection_long_name] 	"operationTime" : Timestamp(1647532487, 45)
      [js_test:rename_collection_long_name] } :
      [js_test:rename_collection_long_name] _getErrorWithCode@src/mongo/shell/utils.js:24:13
      [js_test:rename_collection_long_name] DBCollection.prototype.getIndexes@src/mongo/shell/collection.js:753:15
      [js_test:rename_collection_long_name] @jstests/core/rename_collection_long_name.js:34:37
      [js_test:rename_collection_long_name] @jstests/core/rename_collection_long_name.js:43:3
      
      

            Assignee:
            andrew.shuvalov@mongodb.com Andrew Shuvalov (Inactive)
            Reporter:
            andrew.shuvalov@mongodb.com Andrew Shuvalov (Inactive)