Core Server / SERVER-19367

Segfault establishing shard connection in Chunk::multiSplit

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.1.7
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Sharding 6 07/17/15, Sharding 7 08/10/15
    • 0

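      Observed incidentally in an Evergreen patch run of geo_shardedgeonear.js. The failure appears unrelated to the patch, since the test passed when resubmitted. The crash occurs on mongos (m30999 in the log below) during the test's setup phase, before any geo-related operations run: the segfault is raised in DBClientConnection::connectSocketOnly while Chunk::multiSplit opens a connection to the shard, immediately after a moveChunk to shard0002 completes.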

       m30002| 2015-07-10T08:17:57.981+0000 I COMMAND  [conn4] command admin.$cmd command: _recvChunkCommit { _recvChunkCommit: 1 } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:258 locks:{} protocol:op_command 347ms
       m30001| 2015-07-10T08:17:57.981+0000 I SHARDING [conn3] moveChunk migrate commit accepted by TO-shard: { active: false, ns: "test.points", from: "ip-10-187-48-99:30001", min: { rand: 0.2 }, max: { rand: MaxKey }, shardKeyPattern: { rand: 1.0 }, state: "done", counts: { cloned: 0, clonedBytes: 0, catchup: 0, steady: 0 }, ok: 1.0 }
       m30001| 2015-07-10T08:17:57.981+0000 I SHARDING [conn3] moveChunk updating self version to: 2|1||559f7fb1e2417ab851d9c307 through { rand: MinKey } -> { rand: 0.1 } for collection 'test.points'
       m30001| 2015-07-10T08:17:58.316+0000 I SHARDING [conn3] about to log metadata event: { _id: "ip-10-187-48-99-2015-07-10T08:17:58.316+0000-559f7fb64b31329a4e39d3ab", server: "ip-10-187-48-99", clientAddr: "10.187.48.99:37799", time: new Date(1436516278316), what: "moveChunk.commit", ns: "test.points", details: { min: { rand: 0.2 }, max: { rand: MaxKey }, from: "shard0001", to: "shard0002", cloned: 0, clonedBytes: 0, catchup: 0, steady: 0 } }
       m30001| 2015-07-10T08:17:58.372+0000 I SHARDING [conn3] MigrateFromStatus::done About to acquire global lock to exit critical section
       m30001| 2015-07-10T08:17:58.372+0000 I SHARDING [conn3] forking for cleanup of chunk data
       m30001| 2015-07-10T08:17:58.372+0000 I SHARDING [conn3] MigrateFromStatus::done About to acquire global lock to exit critical section
       m30001| 2015-07-10T08:17:58.373+0000 I SHARDING [RangeDeleter] Deleter starting delete for: test.points from { rand: 0.2 } -> { rand: MaxKey }, with opId: 40
       m30001| 2015-07-10T08:17:58.373+0000 I SHARDING [RangeDeleter] rangeDeleter deleted 0 documents for test.points from { rand: 0.2 } -> { rand: MaxKey }
       m30001| 2015-07-10T08:17:58.708+0000 I SHARDING [conn3] distributed lock 'test.points/ip-10-187-48-99:30001:1436516273:1905792156' unlocked.
       m30001| 2015-07-10T08:17:58.708+0000 I SHARDING [conn3] about to log metadata event: { _id: "ip-10-187-48-99-2015-07-10T08:17:58.708+0000-559f7fb64b31329a4e39d3ac", server: "ip-10-187-48-99", clientAddr: "10.187.48.99:37799", time: new Date(1436516278708), what: "moveChunk.from", ns: "test.points", details: { min: { rand: 0.2 }, max: { rand: MaxKey }, step 1 of 6: 0, step 2 of 6: 731, step 3 of 6: 5, step 4 of 6: 13, step 5 of 6: 740, step 6 of 6: 0, to: "shard0002", from: "shard0001", note: "success" } }
       m30001| 2015-07-10T08:17:58.765+0000 I COMMAND  [conn3] command test.points command: moveChunk { moveChunk: "test.points", from: "ip-10-187-48-99:30001", to: "ip-10-187-48-99:30002", fromShard: "shard0001", toShard: "shard0002", min: { rand: 0.2 }, max: { rand: MaxKey }, maxChunkSizeBytes: 52428800, configdb: "ip-10-187-48-99:29000,ip-10-187-48-99:29001,ip-10-187-48-99:29002", secondaryThrottle: true, waitForDelete: false, maxTimeMS: 0, epoch: ObjectId('559f7fb1e2417ab851d9c307') } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:22 locks:{ Global: { acquireCount: { r: 9, w: 2, R: 3 } }, Database: { acquireCount: { r: 2, w: 2 } }, Collection: { acquireCount: { r: 2, W: 2 } } } protocol:op_command 1883ms
       m30999| 2015-07-10T08:17:58.767+0000 I SHARDING [conn1] ChunkManager: time to load chunks for test.points: 0ms sequenceNumber: 6 version: 2|1||559f7fb1e2417ab851d9c307 based on: 1|4||559f7fb1e2417ab851d9c307
       m30999| 2015-07-10T08:17:58.767+0000 I COMMAND  [conn1] splitting chunk [{ rand: 0.2 },{ rand: MaxKey }) in collection test.points on shard shard0002
       m30999| 2015-07-10T08:17:58.768+0000 F -        [conn1] Invalid access at address: 0xffffffffffffffe8
       m30999| 2015-07-10T08:17:58.773+0000 F -        [conn1] Got signal: 11 (Segmentation fault).
       m30999|
       m30999| ----- BEGIN BACKTRACE -----
       m30999|  mongos(mongo::printStackTrace(std::ostream&)+0x32) [0xb149b2]
       m30999|  mongos(+0x713999) [0xb13999]
       m30999|  mongos(+0x713EC8) [0xb13ec8]
       m30999|  libpthread.so.0(+0xECA0) [0x2af3b53e0ca0]
       m30999|  mongos(mongo::DBClientConnection::connectSocketOnly(mongo::HostAndPort const&)+0x1F9) [0x641b99]
       m30999|  mongos(mongo::DBClientConnection::connect(mongo::HostAndPort const&)+0x26) [0x642186]
       m30999|  mongos(mongo::DBClientConnection::connect(mongo::HostAndPort const&, std::string&)+0x20) [0x642710]
       m30999|  mongos(mongo::ConnectionString::connect(std::string&, double) const+0x3B9) [0x6322e9]
       m30999|  mongos(mongo::DBConnectionPool::get(mongo::ConnectionString const&, double)+0x76) [0x634106]
       m30999|  mongos(mongo::ScopedDbConnection::ScopedDbConnection(mongo::ConnectionString const&, double)+0x65) [0x634345]
       m30999|  mongos(mongo::Chunk::multiSplit(std::vector<mongo::BSONObj, std::allocator<mongo::BSONObj> > const&, mongo::BSONObj*) const+0xC8) [0x9bf3d8]
       m30999|  mongos(+0x62E5F3) [0xa2e5f3]
       m30999|  mongos(mongo::Command::execCommandClientBasic(mongo::OperationContext*, mongo::Command*, mongo::ClientBasic&, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&)+0x701) [0xa5bc91]
       m30999|  mongos(mongo::Command::runAgainstRegistered(char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, int)+0x2E0) [0xa5c960]
       m30999|  mongos(mongo::Strategy::clientCommandOp(mongo::Request&)+0x1C9) [0xa65859]
       m30999|  mongos(mongo::Request::process(int)+0x615) [0xa5b045]
       m30999|  mongos(mongo::ShardedMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*)+0x40) [0x5df6c0]
       m30999|  mongos(mongo::PortMessageServer::handleIncomingMsg(void*)+0x265) [0xacf185]
       m30999|  libpthread.so.0(+0x683D) [0x2af3b53d883d]
       m30999|  libc.so.6(clone+0x6D) [0x2af3b56c3fcd]
       m30999| -----  END BACKTRACE  -----
      ...
      2015-07-10T08:17:58.786+0000 E QUERY    [main] Error: error doing query: failed
          at DB.runCommand (src/mongo/shell/db.js:124:20)
          at DB.adminCommand (src/mongo/shell/db.js:138:41)
          at test (jstests/sharding/geo_shardedgeonear.js:17:23)
          at jstests/sharding/geo_shardedgeonear.js:47:1 at src/mongo/shell/db.js:124
      failed to load: jstests/sharding/geo_shardedgeonear.js
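
      The log above interleaves output from several processes, each line tagged with an `mNNNNN|` prefix. When triaging a crash like this one, it can help to pull out only the backtrace frames of the crashing process. The following is a minimal sketch, not part of the original report: the function name `extract_backtrace` and both regular expressions are hypothetical, and they assume the prefix and frame layout seen in this log.

```python
import re

# Assumed line shape: optional indentation, "m<port>|", then the message body.
LINE_RE = re.compile(r"^\s*m(?P<port>\d+)\|\s?(?P<body>.*)$")
# Assumed frame shape: "<module>(<symbol and offset>) [<address>]".
FRAME_RE = re.compile(
    r"^\s*(?P<module>[\w.\-]+)\((?P<symbol>.*)\)\s+\[(?P<addr>0x[0-9a-fA-F]+)\]\s*$"
)


def extract_backtrace(log_text, port):
    """Return (module, symbol, address) tuples for the backtrace logged by `port`."""
    frames = []
    in_trace = False
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw)
        # Skip lines without a prefix and lines from other processes.
        if not line or line.group("port") != str(port):
            continue
        body = line.group("body")
        if "BEGIN BACKTRACE" in body:
            in_trace = True
            continue
        if "END BACKTRACE" in body:
            in_trace = False
            continue
        if in_trace:
            frame = FRAME_RE.match(body)
            if frame:
                frames.append(
                    (frame.group("module"), frame.group("symbol"), frame.group("addr"))
                )
    return frames
```

      Calling `extract_backtrace(log_text, 30999)` on the full test output would return only the mongos frames, which makes it easier to see that the fault sits under `DBClientConnection::connectSocketOnly`.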
      

            Assignee: Kaloian Manassiev (kaloian.manassiev@mongodb.com)
            Reporter: Kevin Pulo (kevin.pulo@mongodb.com)
            Votes: 0
            Watchers: 9
