  1. Core Server
  2. SERVER-12900

moveChunk failed to engage TO-shard in the data transfer: still waiting for a previous migrates data to get cleaned, can't accept new chunks


Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • ALL

    Description

      I've been having a problem all day with my MongoDB cluster getting stuck while migrating chunks.

      This is an 8-shard cluster running MongoDB 2.4.6, with each shard a 5-member replica set. Two of the shards are both trying to migrate chunks:

      first primary:

      Feb 25 20:46:55 terranova mongod.10001[19778]: Tue Feb 25 20:46:55.986 [conn4194] command admin.$cmd command: { moveChunk: "current.reviews", from: "rs1/mongors1-2.redacted.com:10001,mongors1-3.redacted.com:10001,terranova.redacted.com:10001", to: "rs3/bastoni.redacted.com:10003,mongors3-2.redacted.com:10003,mongors3-3.redacted.com:10003", fromShard: "rs1", toShard: "rs3", min: { location_id: ObjectId('52a75b4738fc9d23e88ab516') }, max: { location_id: ObjectId('52a75b4814b0f61b64acffc8') }, maxChunkSizeBytes: 67108864, shardId: "current.reviews-location_id_ObjectId('52a75b4738fc9d23e88ab516')", configdb: "terranova.redacted.com:30000,giordano.redacted.com:30000,bastoni.redacted.com:30000", secondaryThrottle: true, waitForDelete: false } ntoreturn:1 keyUpdates:0 locks(micros) W:3854 r:85284 reslen:343 559ms

      8th primary:

      Feb 25 20:47:40 MongoRS8-1 mongod.10008[31876]: Tue Feb 25 20:47:40.430 [conn1993] received moveChunk request: { moveChunk: "current.citations", from: "rs8/mongors8-1.redacted.com:10008,mongors8-2.redacted.com:10008,mongors8-3.redacted.com:10008", to: "rs4/mongors4-1.redacted.com:10004,mongors4-2.redacted.com:10004,mongors4-3.redacted.com:10004", fromShard: "rs8", toShard: "rs4", min: { location_id: ObjectId('4f2703f0bc0f367032000000') }, max: { location_id: ObjectId('4f3533ffbc0f36d419000002') }, maxChunkSizeBytes: 67108864, shardId: "current.citations-location_id_ObjectId('4f2703f0bc0f367032000000')", configdb: "terranova.redacted.com:30000,giordano.redacted.com:30000,bastoni.redacted.com:30000", secondaryThrottle: true, waitForDelete: false }
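For anyone triaging logs like the two lines above, the namespace and the source and destination shards can be pulled out of each moveChunk entry with a small script. This is only a sketch: the function name `parse_move_chunk` is made up for illustration, and it assumes the `moveChunk`, `fromShard` and `toShard` fields are quoted exactly as they appear in the lines pasted here.

```python
import re

# Captures only the three quoted fields of interest. The bare "from:" and
# "to:" connection strings are deliberately not matched.
FIELD = re.compile(r'\b(moveChunk|fromShard|toShard): "([^"]*)"')


def parse_move_chunk(line):
    """Return namespace and shard names from a moveChunk log line, or None.

    Hypothetical helper, not part of MongoDB or any driver.
    """
    fields = dict(FIELD.findall(line))
    if "moveChunk" not in fields:
        return None
    return {
        "ns": fields["moveChunk"],
        "from": fields.get("fromShard"),
        "to": fields.get("toShard"),
    }
```

Running it over every line of each primary's log and grouping by the `to` value would show which destination shards keep refusing new chunks.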

      On both primaries I am seeing:

      warning: moveChunk failed to engage TO-shard in the data transfer: still waiting for a previous migrates data to get cleaned, can't accept new chunks, num threads: 39

      In the last 6 hours, 3 chunks have managed to migrate.

      I have bounced the mongos and mongod processes. I also stopped the balancer, removed the {_id: "balancer"} lock from the config db, bounced the mongos processes again, and re-enabled the balancer.

      Not seeing anything out of the ordinary on the receivers.

      The strange thing is that 3 chunks have been able to migrate amid this issue. The errors disappear, there are some chunk migration log messages, the migration finishes, and then the errors start up again.
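One way to confirm that on/off pattern is to bucket the warning by minute and look for gaps that line up with the successful migrations. The following is a rough sketch, not a definitive tool. It assumes every warning line in the real mongod log carries the same `Tue Feb 25 20:46:55.986`-style timestamp as the lines quoted in this report (the warning quoted here was pasted without one), and the name `warnings_per_minute` is hypothetical.

```python
import re
from collections import Counter

# mongod 2.4-style timestamp, e.g. "Tue Feb 25 20:46:55.986".
# A syslog prefix in front of it (as in the lines quoted above) is skipped.
TS = re.compile(
    r"\b[A-Z][a-z]{2} ([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}):(\d{2}):\d{2}\.\d{3}\b"
)
WARNING = "moveChunk failed to engage TO-shard in the data transfer"


def warnings_per_minute(lines):
    """Count TO-shard warnings per (month, day, hour, minute) bucket."""
    counts = Counter()
    for line in lines:
        if WARNING not in line:
            continue
        match = TS.search(line)
        if match:  # lines without a recognisable timestamp are ignored
            month, day, hour, minute = match.groups()
            counts[(month, int(day), int(hour), int(minute))] += 1
    return counts
```

Feeding it an open log file (for example `warnings_per_minute(open(path))`, where `path` is wherever the mongod log lives) gives a counter whose missing minutes should correspond to the windows in which a chunk actually moved.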


          People

            Assignee: Unassigned
            Reporter: Eric Coutu (quickdry21)