  1. Core Server
  2. SERVER-569

better invalid object debugging (WAS: 1.1.3 -> 1.2.1 replica pair (slave) initial cloning fails)

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 1.2.2, 1.3.2
    • Affects Version/s: 1.2.1
    • Component/s: Replication
    • Labels:
      None
    • Environment:
      Linux ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

      While trying to upgrade our replica pairs to 1.2.1, I started from a fresh slave (no initial data) with the following command line:

      /opt/mongodb-linux-x86_64-1.2.1/bin/mongod --dbpath=/mongo/data --nssize 160 --noauth --pairwith=production-shard1-001

      After a while (about 30GB cloned), the slave issues the following messages:

      Tue Jan 26 09:38:12 Assertion: Invalid dbref/code/string/symbol size
      skipping corrupt object from production.messages_dxxxxxxxxxxxxxxxxxx
      Tue Jan 26 09:38:33 invalid object size: 11031214
      Tue Jan 26 09:38:33 Assertion: Invalid BSONObj spec size
      Tue Jan 26 09:38:33 repl: AssertionException Invalid BSONObj spec size
      Tue Jan 26 09:38:33 repl: sleep 2sec before next pass
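
      For context on the "invalid object size" line: a BSON document begins with a little-endian int32 holding its total length and ends with a 0x00 byte, so a reader can reject a document whose declared length is implausible. The following is only a rough sketch of that kind of check, not mongod's actual code. The name check_bson_size is hypothetical, and the 4 MB cap is an assumption about the per-document limit in this server generation.

```python
import struct

# Assumed per-document limit for servers of this era (an assumption; adjust as needed).
MAX_BSON_SIZE = 4 * 1024 * 1024
# Smallest possible document: 4-byte length prefix + 1-byte terminator.
MIN_BSON_SIZE = 5


def check_bson_size(buf, max_size=MAX_BSON_SIZE):
    """Return (ok, declared_size, reason) for the document at the start of buf."""
    if len(buf) < 4:
        return False, None, "buffer too short for length prefix"
    # The first four bytes are the total document length, little-endian.
    (size,) = struct.unpack_from("<i", buf, 0)
    if size < MIN_BSON_SIZE or size > max_size:
        return False, size, "declared size out of range"
    if size > len(buf):
        return False, size, "declared size exceeds available bytes"
    # A well-formed document ends with a single NUL byte.
    if buf[size - 1] != 0:
        return False, size, "missing trailing NUL terminator"
    return True, size, "ok"
```

      Under the assumed 4 MB cap, a declared length of 11031214 bytes (the value in the log) would be rejected as out of range.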

      Given the delay between the first message (38:12) and the next one (38:33), I am not even sure that the object causing the error is in this collection. I expected 1.2.1 to report this error, including the _id of the culprit, and carry on. That is not the case: strace on the mongod process shows it just sitting in a wait4 call:

      Process 8175 attached - interrupt to quit
      wait4(-1,
      ...

      Since 1.1.3 does not have the newly introduced bsonsize() API call, how can I identify and get rid of those invalid objects in the master's database?
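
      One possible offline approach, offered only as a sketch: if the collection's raw documents can be obtained as a stream of concatenated BSON documents (for example a dump file, which is an assumption here since a dump may itself fail on corrupt data), the stream can be walked by length prefix to find the byte offset of the first badly framed document. The name scan_bson_stream and the 4 MB limit are hypothetical choices for illustration.

```python
import struct

MAX_BSON_SIZE = 4 * 1024 * 1024  # assumed per-document limit


def scan_bson_stream(data, max_size=MAX_BSON_SIZE):
    """Walk concatenated BSON documents; return (good_count, bad_offset).

    bad_offset is None when every document's framing looks valid.
    Scanning stops at the first bad document, because its length
    prefix can no longer be trusted to find the next one.
    """
    offset = 0
    good = 0
    while offset < len(data):
        if len(data) - offset < 4:
            return good, offset
        (size,) = struct.unpack_from("<i", data, offset)
        end = offset + size
        # Reject impossible sizes, truncated documents, and missing terminators.
        if size < 5 or size > max_size or end > len(data) or data[end - 1] != 0:
            return good, offset
        good += 1
        offset = end
    return good, None
```

      This only checks document framing. An error such as "Invalid dbref/code/string/symbol size" concerns a field inside a document and would need a full element-by-element walk to detect.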

            Assignee: Eliot Horowitz (Inactive)
            Reporter: Erwan Arzur