Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-1719

mongos process died with splitIfShould failed: medianKey command failed

    XMLWordPrintableJSON

Details

    • Icon: Bug Bug
    • Resolution: Cannot Reproduce
    • Icon: Minor - P4 Minor - P4
    • None
    • 1.6.1
    • None
    • None
    • ec2 / ubuntu
    • Linux

    Description

      Problem:
      Running a multi-shard system (2 shards, 2 members). After loading data, I wanted to drop the daya in the collection

      > use scaleout
      switched to db scaleout
      > db.blogs.drop()
      true

      When I tries to re-connect to the mongos I got the following

      vero:scripts$ ../../software/mongodb-osx-x86_64-1.6.0/bin/mongo --port 27500 -host ec2-184-73-98-241.compute-1.amazonaws.com
      MongoDB shell version: 1.6.0
      connecting to: ec2-184-73-98-241.compute-1.amazonaws.com:27500/test
      Wed Sep 1 14:10:32 Error: couldn't connect to server ec2-184-73-98-241.compute-1.amazonaws.com:27500} (anon):1139
      exception: connect failed

      Looking at the node running mongos, the process was no longer running. I saw the following in the logs

      ed Sep 1 21:01:10 [WriteBackListener] config change: { _id: "ip-10-196-174-191-2010-09-01T21:01:10-25", server: "ip-10-196-174-191", time: new Date(1283374870915), what: "split", ns: "scaleout.blogs", details: { before: { min:

      { ts: 26140129 }

      , max:

      { ts: 30278795 }

      }, left: { min:

      { ts: 26140129 }

      , max:

      { ts: 27936982 }

      }, right: { min:

      { ts: 27936982 }

      , max:

      { ts: 30278795 }

      } } }
      Wed Sep 1 21:04:23 [conn2] DROP: scaleout.blogs
      Wed Sep 1 21:04:23 [conn2] config change: { _id: "ip-10-196-174-191-2010-09-01T21:04:23-26", server: "ip-10-196-174-191", time: new Date(1283375063464), what: "dropCollection.start", ns: "scaleout.blogs", details: {} }
      Wed Sep 1 21:06:01 [conn2] config change: { _id: "ip-10-196-174-191-2010-09-01T21:06:01-27", server: "ip-10-196-174-191", time: new Date(1283375161378), what: "dropCollection", ns: "scaleout.blogs", details: {} }
      Wed Sep 1 21:06:01 [conn8] ChunkManager: couldn't find chunk for:

      { ts: 26631861 }

      going to retry
      Wed Sep 1 21:06:01 [conn8] DBException in process: couldn't find a chunk aftry retry which should be impossible extracted:

      { ts: 26631861 }

      Wed Sep 1 21:06:01 [WriteBackListener] ~ScopedDBConnection: _conn != null
      Wed Sep 1 21:06:01 [WriteBackListener] ERROR: splitIfShould failed: medianKey command failed: { errmsg: "no index found for specified keyPattern:

      { ts: 1.0 }

      ", ok: 0.0 }

      Log from shard000
      Wed Sep 1 20:57:31 [conn3] query admin.$cmd ntoreturn:1 command:

      { serverStatus: 1 }

      reslen:856 109ms
      Wed Sep 1 21:07:11 [rs_sync] CMD: drop scaleout.blogs
      Wed Sep 1 21:07:11 [rs_sync] building new index on

      { _id: 1 }

      for scaleout.blogs
      Wed Sep 1 21:07:11 [rs_sync] Buildindex scaleout.blogs idxNo:0 { name: "id", ns: "scaleout.blogs", key:

      { _id: 1 }

      }
      Wed Sep 1 21:07:11 [rs_sync] done for 0 records 0secs
      Wed Sep 1 21:10:41 [conn3] end connection 127.0.0.1:42059

      Log from shard001
      Wed Sep 1 20:50:17 [conn5] query admin.$cmd ntoreturn:1 command:

      { serverStatus: 1 }

      reslen:856 443ms
      Wed Sep 1 20:52:17 [conn5] query admin.$cmd ntoreturn:1 command:

      { serverStatus: 1 }

      reslen:856 121ms
      Wed Sep 1 20:58:35 allocating new datafile /var/lib/mongodb/replset1/scaleout.7, filling with zeroes...
      Wed Sep 1 20:58:35 done allocating datafile /var/lib/mongodb/replset1/scaleout.7, size: 2047MB, took 0.01 secs
      Wed Sep 1 20:59:17 [conn5] query admin.$cmd ntoreturn:1 command:

      { serverStatus: 1 }

      reslen:856 126ms
      Wed Sep 1 21:05:59 [rs_sync] CMD: drop scaleout.blogs
      Wed Sep 1 21:11:14 [conn5] end connection 127.0.0.1:43086

      Reproduce:
      This is what I did
      – loaded both shards with data
      – performed the db.blogs.drop()

      Does not look like an out of disk space problem.

      Workaround:
      Restart mongos

      Business Case:
      Reliability

      Attachments

        Activity

          People

            eliot Eliot Horowitz (Inactive)
            alvin Alvin Richards (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: