  1. Core Server
  2. SERVER-12587

Better error responses for errors in the destination shard

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Minor - P4
    • None
    • 2.5.5
    • Sharding
    • None
    • Sharding
    • ALL
    • 0

    Description

      For example, the moveChunk response when _recvChunkCommit fails currently looks like this:

      {
      	"cause" : {
      		"cause" : {
      			"active" : true,
      			"ns" : "test.foo",
      			"from" : "localhost:30000",
      			"min" : {
      				"_id" : { "$minKey" : 1 }
      			},
      			"max" : {
      				"_id" : { "$maxKey" : 1 }
      			},
      			"shardKeyPattern" : {
      				"_id" : 1
      			},
      			"state" : "fail",
      			"errmsg" : "",
      			"counts" : {
      				"cloned" : NumberLong(41847),
      				"clonedBytes" : NumberLong(420520503),
      				"catchup" : NumberLong(0),
      				"steady" : NumberLong(0)
      			},
      			"ok" : 0
      		},
      		"ok" : 0,
      		"errmsg" : "_recvChunkCommit failed!"
      	},
      	"ok" : 0,
      	"errmsg" : "move failed"
      }
      

      Note that cause.cause.errmsg is empty. In this particular example, _recvChunkCommit timed out, and the cause.cause field was populated from the _recvChunkCommit response.
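
To make the nesting problem concrete, here is a minimal sketch that walks the "cause" chain of a response like the one shown above and reports the errmsg at each level. The helper names collectErrmsgs and deepestErrmsg are hypothetical and not part of MongoDB; the sample object is a trimmed copy of the response in this ticket.

```javascript
// Hypothetical helper (not a MongoDB API): walk the nested "cause" chain
// of a command response and record the errmsg found at each level.
function collectErrmsgs(response) {
  const levels = [];
  let node = response;
  let path = "";
  while (node !== null && typeof node === "object") {
    levels.push({ path: path === "" ? "(top)" : path, errmsg: node.errmsg });
    path = path === "" ? "cause" : path + ".cause";
    node = node.cause;
  }
  return levels;
}

// Hypothetical helper: return the deepest level whose errmsg is a
// non-empty string, or null if no level has one.
function deepestErrmsg(response) {
  const nonEmpty = collectErrmsgs(response).filter(
    (level) => typeof level.errmsg === "string" && level.errmsg.length > 0
  );
  return nonEmpty.length > 0 ? nonEmpty[nonEmpty.length - 1] : null;
}

// Trimmed copy of the moveChunk response from this ticket.
const response = {
  cause: {
    cause: { state: "fail", errmsg: "", ok: 0 },
    ok: 0,
    errmsg: "_recvChunkCommit failed!",
  },
  ok: 0,
  errmsg: "move failed",
};

const deepest = deepestErrmsg(response);
```

For this response the most specific message available is the one at "cause", because the innermost level, which comes from the destination shard, carries an empty errmsg.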

      The same applies to any error on the migrate thread that aborts the migration: the "to" shard would usually realize something went wrong through _recvChunkStatus, but that response does not contain information on why it went wrong. Currently, the only way to figure out what went wrong is to check config.changelog or the logs.
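
As a rough illustration of the config.changelog workaround, the sketch below builds a filter for migration-related changelog entries. The helper name moveChunkChangelogFilter is hypothetical. It assumes changelog documents carry "ns", "what", and "time" fields and that migration events have a "what" value starting with "moveChunk"; check these against the entries on your own cluster.

```javascript
// Hypothetical helper (not a MongoDB API): build a query filter that matches
// changelog entries for chunk migrations on one namespace.
// Assumption: migration events use a "what" value beginning with "moveChunk".
function moveChunkChangelogFilter(ns) {
  return { ns: ns, what: /^moveChunk/ };
}

// In a mongo shell connected to a mongos (not executed here), the filter
// could be used to list the most recent entries first:
//
//   db.getSiblingDB("config").changelog
//     .find(moveChunkChangelogFilter("test.foo"))
//     .sort({ time: -1 })
//     .limit(10)

const filter = moveChunkChangelogFilter("test.foo");
```

The "details" field of the returned entries, together with the shard logs, is where the failure reason would have to be looked up by hand.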

          People

            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            randolph@mongodb.com Randolph Tan