Core Server / SERVER-4055

Assertion error when compacting a non-existent collection

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.0.0
    • Component/s: Index Maintenance, Storage
    • Labels:
      None
    • Environment:
      MongoDB 2.0.0, 64-bit, on CentOS 5.5. 2 shards, 3 nodes per shard, 1 delayed node per shard, 1 additional arbiter per shard.
    • Linux

      While upgrading to v2.0.0, I was compacting collections on one of the secondaries of one shard.
      I accidentally issued 'db.calendar.runCommand("compact")' on the default test database, where the calendar collection does not exist.
      The shell prompt changed from SECONDARY to RECOVERING and stayed there.

      Looking at the logs, I saw this:

      Tue Oct 11 17:35:19 [conn44] replSet going into maintenance mode (0 other tasks)
      Tue Oct 11 17:35:19 [conn44] replSet RECOVERING
      Tue Oct 11 17:35:19 [conn44] Assertion: 13660:namespace test.calendar does not exist
      0x587512 0xa9be43 0xa9c647 0x973b49 0x97512f 0x95d725 0x9607b4 0x87e037 0x88485c 0xa96a46 0x635dd7 0x30e920673d 0x30e8ad3f6d
      /opt/mongo/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x112) [0x587512]
      /opt/mongo/bin/mongod(_ZN5mongo7compactERKSsRSsbRNS_14BSONObjBuilderE+0x603) [0xa9be43]
      /opt/mongo/bin/mongod(_ZN5mongo10CompactCmd3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x267) [0xa9c647]
      /opt/mongo/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x6a9) [0x973b49]
      /opt/mongo/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6ff) [0x97512f]
      /opt/mongo/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x35) [0x95d725]
      /opt/mongo/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xee4) [0x9607b4]
      /opt/mongo/bin/mongod [0x87e037]
      /opt/mongo/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x55c) [0x88485c]
      /opt/mongo/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xa96a46]
      /opt/mongo/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x635dd7]
      /lib64/libpthread.so.0 [0x30e920673d]
      /lib64/libc.so.6(clone+0x6d) [0x30e8ad3f6d]
      Tue Oct 11 17:35:19 [conn35] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SITE_ID: 403064, URL_UNIQUE: "11565" }
      Tue Oct 11 17:35:19 [conn35] ntoskip:0 ntoreturn:-1
      Tue Oct 11 17:35:19 [conn35] end connection 83.149.71.144:46588
      Tue Oct 11 17:35:19 [initandlisten] connection accepted from 83.149.71.144:57778 #45
      Tue Oct 11 17:35:19 [conn45] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SITE_ID: 403064, URL_UNIQUE: "11821" }
      Tue Oct 11 17:35:19 [conn45] ntoskip:0 ntoreturn:-1
      Tue Oct 11 17:35:19 [conn45] end connection 83.149.71.144:57778
      Tue Oct 11 17:35:19 [conn28] end connection 83.149.71.144:46581
      Tue Oct 11 17:35:19 [conn30] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SITE_ID: 403064, URL_UNIQUE: "11566" }
      Tue Oct 11 17:35:19 [conn30] ntoskip:0 ntoreturn:-1
      Tue Oct 11 17:35:19 [conn30] end connection 83.149.71.144:46583
      Tue Oct 11 17:35:19 [initandlisten] connection accepted from 83.149.71.144:57783 #46
      Tue Oct 11 17:35:19 [conn29] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SITE_ID: 553, URL_UNIQUE: "ref/2463-052A/pr/15" }
      Tue Oct 11 17:35:19 [conn29] ntoskip:0 ntoreturn:-1
      Tue Oct 11 17:35:19 [conn34] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SITE_ID: 595, URL_UNIQUE: "VP0000003247136" }
      Tue Oct 11 17:35:19 [conn34] ntoskip:0 ntoreturn:-1
      Tue Oct 11 17:35:19 [conn31] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SITE_ID: 1861, URL_UNIQUE: "10435611" }
      Tue Oct 11 17:35:19 [conn31] ntoskip:0 ntoreturn:-1
      Tue Oct 11 17:35:19 [conn46] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SITE_ID: 588, URL_UNIQUE: "2369189" }
      Tue Oct 11 17:35:19 [conn46] ntoskip:0 ntoreturn:-1
      Tue Oct 11 17:35:19 [conn46] end connection 83.149.71.144:57783
      Tue Oct 11 17:35:19 [conn31] end connection 83.149.71.144:46584
      Tue Oct 11 17:35:19 [conn29] end connection 83.149.71.144:46582
      Tue Oct 11 17:35:19 [conn34] end connection 83.149.71.144:46587
      Tue Oct 11 17:35:19 [conn36] end connection 83.149.71.144:46589
      Tue Oct 11 17:35:19 [conn27] end connection 83.149.71.144:46580
      Tue Oct 11 17:35:19 [conn33] end connection 83.149.71.144:46586
      Tue Oct 11 17:35:19 [conn32] assertion 13436 not master or secondary, can't read ns:crawler4.ad query:{ SIT...

      This rendered the node unusable: it stayed in RECOVERING and rejected reads.
      I re-issued the command on the correct database and everything appears to be going fine (the collection is being compacted).
      I hope the node will go back to SECONDARY afterwards; I'll keep you posted.

      It doesn't look like a severe bug, but since it leaves the node in the RECOVERING state, I think it is bad enough.
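
      Until this is fixed, a guard on the client side can avoid the situation described above. The following is only a sketch of a possible workaround, not the server fix: `safeCompact` is a hypothetical helper name, and it assumes the standard mongo shell helpers `db.getName()`, `db.getCollectionNames()` and `db.runCommand()`. It checks that the collection exists in the current database before sending the compact command.

```javascript
// Hypothetical helper (not part of the mongo shell): only send "compact"
// when the collection actually exists in the given database.
function safeCompact(db, collectionName) {
  // List the collections of the database the shell is currently using.
  var names = db.getCollectionNames();
  if (names.indexOf(collectionName) === -1) {
    // Nothing is sent to the server, so the node never enters maintenance mode.
    return {
      ok: 0,
      errmsg: "namespace " + db.getName() + "." + collectionName +
              " does not exist; compact not attempted"
    };
  }
  // Same command as db.<collection>.runCommand("compact").
  return db.runCommand({ compact: collectionName });
}

// Usage in the mongo shell, assuming "db" is the current database:
//   safeCompact(db, "calendar")
```

      The check and the command are two separate round trips, so this only protects against typos and a wrong `use <database>`, not against a collection being dropped in between.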

            Assignee:
            brandon Brandon Diamond
            Reporter:
            mgbuddy@gmail.com Marc Gràcia
            Votes:
            0
            Watchers:
            3
