- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 1.2.1
- Component/s: Replication
- None
- Environment: Linux ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
While trying to upgrade our replica pairs to 1.2.1, I started a fresh slave (no initial data) with the following command line:
/opt/mongodb-linux-x86_64-1.2.1/bin/mongod --dbpath=/mongo/data --nssize 160 --noauth --pairwith=production-shard1-001
After a while (about 30 GB cloned), the slave issued the following messages:
Tue Jan 26 09:38:12 Assertion: Invalid dbref/code/string/symbol size
skipping corrupt object from production.messages_dxxxxxxxxxxxxxxxxxx
Tue Jan 26 09:38:33 invalid object size: 11031214
Tue Jan 26 09:38:33 Assertion: Invalid BSONObj spec size
Tue Jan 26 09:38:33 repl: AssertionException Invalid BSONObj spec size
Tue Jan 26 09:38:33 repl: sleep 2sec before next pass
Given the delay between the first message (09:38:12) and the next one (09:38:33), I'm not even sure that the object causing the error is in this collection. I expected 1.2.1 to report this error, including the _id of the culprit, and then carry on. That is not the case: strace on the mongod process shows that it is just sitting in a wait4 call:
Process 8175 attached - interrupt to quit
wait4(-1,
...
As 1.1.3 doesn't have the newly introduced bsonsize() API call, how can I identify and get rid of those invalid objects in the master's database?
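
For illustration only, here is a rough sketch of the kind of check I have in mind, done outside the server. It relies on the BSON layout (a little-endian int32 total length up front, a NUL byte at the end) and walks a buffer of concatenated BSON documents, stopping at the first one whose declared size looks wrong. Several things here are assumptions rather than facts from this ticket: that the data can be obtained as such a stream at all (for example from a dump file, which may itself fail on corrupt objects), and that the per-document limit is 4 MB. The names `scan_bson_stream`, `MAX_BSON_SIZE` and `MIN_BSON_SIZE` are made up for this example.

```python
import struct

# Assumption: 4 MB per-document limit; adjust if your build uses another value.
MAX_BSON_SIZE = 4 * 1024 * 1024
# Smallest legal document: int32 length (4 bytes) + trailing NUL byte.
MIN_BSON_SIZE = 5


def scan_bson_stream(data, max_size=MAX_BSON_SIZE):
    """Walk concatenated BSON documents in `data` (bytes).

    Returns (good_count, problems), where problems is a list of
    (offset, declared_size, reason) tuples. Scanning stops at the first
    problem, because the next document boundary can no longer be trusted.
    """
    offset = 0
    good = 0
    problems = []
    while offset < len(data):
        if len(data) - offset < 4:
            problems.append((offset, None, "truncated length prefix"))
            break
        # The first four bytes of a BSON document are its total size.
        (size,) = struct.unpack_from("<i", data, offset)
        if size < MIN_BSON_SIZE or size > max_size:
            problems.append((offset, size, "implausible size"))
            break
        if offset + size > len(data):
            problems.append((offset, size, "document runs past end of data"))
            break
        # Every BSON document ends with a single NUL byte.
        if data[offset + size - 1] != 0:
            problems.append((offset, size, "missing trailing NUL"))
            break
        good += 1
        offset += size
    return good, problems


if __name__ == "__main__":
    empty = struct.pack("<i", 5) + b"\x00"          # a valid empty document
    bad = struct.pack("<i", 11031214) + b"\x00"     # size taken from the log above
    print(scan_bson_stream(empty + empty + bad))
    # prints (2, [(10, 11031214, 'implausible size')])
```

This only checks the outer length prefix and terminator, so it would not catch the "Invalid dbref/code/string/symbol size" case, where a length inside the document is wrong; that would need a walk over the individual elements.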