[SERVER-8658] moveChunk error: "Invalid modifier specified: _id" (mongo 2.2.3) Created: 21/Feb/13  Updated: 07/Mar/14  Resolved: 20/Mar/13

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: 2.2.3
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Alex Piggott Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Centos 5.5/6.3, driver 2.7.3


Issue Links:
Duplicate
Operating System: ALL
Steps To Reproduce:

NA

Participants:

 Description   

From this thread: https://groups.google.com/forum/?hl=en&fromgroups=#!topic/mongodb-user/i8Te7ip2T-k

I have had a sharded DB with a single shard for a while. When I added a second shard on Monday, 3 of the chunks failed to migrate, and they are now in a continuous loop of starting to migrate and then failing ("Invalid modifier specified") every few seconds.

(In addition to the chunk mentioned in the above thread, there are 2 other chunks that behave the same way, except that they don't generate any errors; they just abort in step 2 without explanation.)

gist with some more relevant logging (starting from when I added the second shard): https://gist.github.com/Alex-Ikanow/5008868



 Comments   
Comment by Alex Piggott [ 13/Mar/13 ]

I was leaving my DB in this state in case anyone wanted to come and have a look. Unfortunately, over the weekend (it's always over the weekend!) things worsened and the DB stopped responding to find() commands, failing with errors like:

com.mongodb.CommandResult$CommandFailure: command failed [command failed [count] { "serverUsed" : "localhost:27017" , "errmsg" : "exception: DBClientBase::findN: transport error: 10.60.18.179:27019 ns: admin.$cmd query: { setShardVersion: \"doc_metadata.metadata\", configdb: \"demo-db-config-1.rr.ikanow.com:27016,demo-db-config-2.rr.ikanow.com:27016,demo-db-config-3.rr.ikanow.com:27016\", version: Timestamp 232000|0, versionEpoch: ObjectId('000000000000000000000000'), serverID: ObjectId('511bcacc92a322bca264c8db'), shard: \"replica_set2\", shardHost: \"replica_set2/10.226.114.59:27019,10.60.18.179:27019\", $auth: {} }" , "code" : 10276 , "ok" : 0.0}

It also deleted one of the collections that was returning the original moveChunk errors, and then complained:

Sat Mar 9 10:32:33 [Balancer] moveChunk result: { errmsg: "ns not found, should be impossible", ok: 0.0 }

Sat Mar 9 10:32:33 [Balancer] balancer move failed: { errmsg: "ns not found, should be impossible", ok: 0.0 } from: replica_set1 to: replica_set2 chunk: min: { _id: MinKey } max: { _id: MinKey }

(cheeky!)

Anyway, I fixed it by deleting all the collections that had ever returned those errors and then using mongoimport via mongos to bring them back in again.
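For anyone hitting the same thing, here is a rough sketch of that drop and re-import cycle, not the exact commands that were run. It only builds the command lines so they can be reviewed before anything is executed. The export step, the dump file path, and the helper name `reimport_commands` are assumptions for illustration; the namespace `doc_metadata.metadata` and the mongos address `localhost:27017` are taken from the error above. Whether the collection has to be sharded again before the import is left out here.

```python
import shlex


def reimport_commands(db, coll, dump_file, mongos_host="localhost", mongos_port=27017):
    """Build (but do not run) the commands for an export / drop / re-import cycle.

    Every command targets mongos rather than an individual shard, so the
    re-imported documents are routed through the sharding layer.
    """
    target = ["--host", mongos_host, "--port", str(mongos_port)]
    # Assumption: the data is exported first; the original comment does not
    # say where the re-imported data came from.
    export_cmd = ["mongoexport", *target, "--db", db, "--collection", coll, "--out", dump_file]
    # Drop the collection that kept failing to migrate.
    drop_cmd = ["mongo", *target, "--eval", f"db.getCollection('{coll}').drop()", db]
    # Bring the documents back in through mongos.
    import_cmd = ["mongoimport", *target, "--db", db, "--collection", coll, "--file", dump_file]
    return [export_cmd, drop_cmd, import_cmd]


if __name__ == "__main__":
    # /tmp/metadata.json is a hypothetical path.
    for cmd in reimport_commands("doc_metadata", "metadata", "/tmp/metadata.json"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first makes it easy to check the host, database, and file names before running them by hand against a live cluster.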

So I suppose you can close this issue. I'm a bit surprised nobody stopped by!

Comment by Alex Piggott [ 21/Feb/13 ]

(I should add that my applications were writing data to the database as I added the second shard. If this isn't supported, which I didn't see documented anywhere, then apologies, but how do I recover?!)
