Note: the commit message is wrong; I forgot to change it before pushing.
The fix for SERVER-11277 (99dff054c8b8) seems to cause moveChunk commands to fail with a transport error under certain circumstances. The attached JS file reproduces the problem.
m30999| 2013-12-20T14:12:56.315-0500 [WriteBackListener-localhost:30000] DBClientCursor::init call() failed
m30000| 2013-12-20T14:12:56.315-0500 [conn1] end connection 127.0.0.1:50886 (4 connections now open)
m30000| 2013-12-20T14:12:56.315-0500 [conn3] end connection 127.0.0.1:50899 (4 connections now open)
m30000| 2013-12-20T14:12:56.315-0500 [conn5] end connection 127.0.0.1:50910 (3 connections now open)
m30001| 2013-12-20T14:12:56.315-0500 [conn5] end connection 127.0.0.1:50908 (4 connections now open)
m30999| 2013-12-20T14:12:56.315-0500 [WriteBackListener-localhost:30000] Detected bad connection created at 1387566773596630 microSec, clearing pool for localhost:30000 of 0 connections
m30999| 2013-12-20T14:12:56.315-0500 [conn2] DBClientCursor::init call() failed
m30999| 2013-12-20T14:12:56.315-0500 [WriteBackListener-localhost:30000] WriteBackListener exception : DBClientBase::findN: transport error: localhost:30000 ns: admin.$cmd query: { writebacklisten: ObjectId('52b496b5ebc1242f136c7597') }
m30999| 2013-12-20T14:12:56.315-0500 [conn2] Detected bad connection created at 1387566773604578 microSec, clearing pool for localhost:30000 of 0 connections
sh81742| {
sh81742|     "code" : 10276,
sh81742|     "ok" : 0,
sh81742|     "errmsg" : "exception: DBClientBase::findN: transport error: localhost:30000 ns: admin.$cmd query: { moveChunk: \"foo.bar\", from: \"localhost:30000\", to: \"localhost:30001\", fromShard: \"shard0000\", toShard: \"shard0001\", min: { _id: 0.0 }, max: { _id: 20.0 }, maxChunkSizeBytes: 52428800, shardId: \"foo.bar-_id_0.0\", configdb: \"localhost:29000\", secondaryThrottle: false, waitForDelete: true, maxTimeMS: 0 }"
sh81742| }
sh81742| assert failed
m30001| 2013-12-20T14:12:56.317-0500 [migrateThread] DBClientCursor::init call() failed
Versions tested (chronological order):
6902c6b643f64 (not reproducible)
99dff054c8b8 (when the behavior change was introduced)
77384d0a36a2 (recent commit from 12-20-2013)
Is related to: SERVER-11277 cleanupOrphaned does nothing on empty shard (Closed)