Details
Type: Bug
Resolution: Done
Priority: Major - P3
Operating System: ALL
Description
I've been having a problem all day with my MongoDB cluster getting stuck while migrating chunks.
This is an 8-shard cluster running MongoDB 2.4.6, with each shard a 5-member replica set. I have 2 shards both trying to migrate chunks:
first primary:
Feb 25 20:46:55 terranova mongod.10001[19778]: Tue Feb 25 20:46:55.986 [conn4194] command admin.$cmd command: { moveChunk: "current.reviews", from: "rs1/mongors1-2.redacted.com:10001,mongors1-3.redacted.com:10001,terranova.redacted.com:10001", to: "rs3/bastoni.redacted.com:10003,mongors3-2.redacted.com:10003,mongors3-3.redacted.com:10003", fromShard: "rs1", toShard: "rs3", min: { location_id: ObjectId('52a75b4738fc9d23e88ab516') }, max: { location_id: ObjectId('52a75b4814b0f61b64acffc8') }, maxChunkSizeBytes: 67108864, shardId: "current.reviews-location_id_ObjectId('52a75b4738fc9d23e88ab516')", configdb: "terranova.redacted.com:30000,giordano.redacted.com:30000,bastoni.redacted.com:30000", secondaryThrottle: true, waitForDelete: false } ntoreturn:1 keyUpdates:0 locks(micros) W:3854 r:85284 reslen:343 559ms
8th primary:
Feb 25 20:47:40 MongoRS8-1 mongod.10008[31876]: Tue Feb 25 20:47:40.430 [conn1993] received moveChunk request: { moveChunk: "current.citations", from: "rs8/mongors8-1.redacted.com:10008,mongors8-2.redacted.com:10008,mongors8-3.redacted.com:10008", to: "rs4/mongors4-1.redacted.com:10004,mongors4-2.redacted.com:10004,mongors4-3.redacted.com:10004", fromShard: "rs8", toShard: "rs4", min: { location_id: ObjectId('4f2703f0bc0f367032000000') }, max: { location_id: ObjectId('4f3533ffbc0f36d419000002') }, maxChunkSizeBytes: 67108864, shardId: "current.citations-location_id_ObjectId('4f2703f0bc0f367032000000')", configdb: "terranova.redacted.com:30000,giordano.redacted.com:30000,bastoni.redacted.com:30000", secondaryThrottle: true, waitForDelete: false }
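For reference, the same migration can be requested by hand through a mongos using the min key from the first log line; a sketch (this just re-issues the moveChunk shown above):

// Through a mongos: ask to move the chunk containing this key
// from rs1 to rs3 (same chunk as in the first log line above).
sh.moveChunk(
  "current.reviews",
  { location_id: ObjectId('52a75b4738fc9d23e88ab516') },
  "rs3"
);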
On both primaries I am seeing:
warning: moveChunk failed to engage TO-shard in the data transfer: still waiting for a previous migrates data to get cleaned, can't accept new chunks, num threads: 39
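Both requests run with waitForDelete: false, so post-migration range deletes happen asynchronously in the background; my read of the warning is that the TO-shard won't accept a new chunk while those 39 cleanup threads are still pending. One way to look for them on the receiving shard's primary (assuming the cleanup threads show up in currentOp under a name like cleanupOldData, which I haven't verified):

// Run against the receiving shard's PRIMARY, not through mongos.
// Lists in-progress ops whose description looks like the async
// post-migration range deleter ("cleanupOldData" is an assumption
// about the thread name in this version).
db.currentOp(true).inprog.filter(function (op) {
  return op.desc && /cleanupOldData/.test(op.desc);
}).forEach(function (op) {
  printjson({ opid: op.opid, desc: op.desc, secs_running: op.secs_running });
});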
In the last 6 hours, 3 chunks have managed to migrate.
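I'm counting those from the config changelog through a mongos; a sketch, assuming each successful migration writes a moveChunk.commit entry there:

// Through a mongos: count migrations that committed in the last 6 hours.
var cutoff = new Date(Date.now() - 6 * 3600 * 1000);
db.getSiblingDB("config").changelog
  .find({ what: "moveChunk.commit", time: { $gt: cutoff } })
  .count();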
I have bounced the mongos processes and bounced the mongod processes. I also stopped the balancer, removed the { _id: "balancer" } lock from the config database, bounced the mongos processes again, and re-enabled the balancer.
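Roughly the shell equivalents of those balancer steps (a sketch, run through a mongos; the mongos bounces happened outside the shell):

// Stop the balancer and wait for any in-flight balancing round to finish.
sh.stopBalancer();

// Remove the stale balancer lock from the config database.
db.getSiblingDB("config").locks.remove({ _id: "balancer" });

// (bounce the mongos processes here)

// Re-enable the balancer.
sh.startBalancer();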
Not seeing anything out of the ordinary on the receivers.
The strange thing is that 3 chunks have been able to migrate amidst this issue: the errors disappear, some chunk migration log messages appear, the migration finishes, and then the errors start up again.