  1. Core Server
  2. SERVER-32677

Segmentation fault converting ReplicaSet to Replicated Shard Cluster


    Details

    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:

      Start the 2 data nodes with (config attached):

      $ mongod --config /data/mongod.conf --replSet rs
      Start the arbiter with:
      $ mongod --config /data/arbitrer.conf --replSet rs
      

      Connect to one data node and initiate the replica set:

      > rs.initiate({ _id: "rs", members: [{ _id: 1, host: "mongo_replica1:27017" }, { _id: 2, host: "mongo_replica2:27017" }], settings: { getLastErrorDefaults: { w: "majority", wtimeout: 30000 }}})
      

      Connect to the replica set "rs/mongo_replica1:27017,mongo_replica2:27017" and add the arbiter:

      > rs.addArb("mongo_arbitrer:27017")
      

      Now follow https://docs.mongodb.com/manual/tutorial/convert-replica-set-to-replicated-shard-cluster/#restart-the-replica-set-as-a-shard

      Stop the secondary and restart it with:

      $ mongod --config /data/mongod.conf --shardsvr --replSet rs
      Stop the arbiter and restart it with:
      $ mongod --config /data/arbitrer.conf --shardsvr --replSet rs
      

      Connect to the primary and step it down:

      > rs.stepDown()
      

      Restart the old primary with:

      $ mongod --config /data/mongod.conf --shardsvr --replSet rs
      

      All members reconnect. After a few minutes of idle time (around 5; the crash recurs cyclically), one data node receives SIGSEGV, and the other data node then receives the same SIGSEGV in cascade. The arbiter does not crash.
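When reproducing the steps above, it may help to confirm that all three members are healthy after each restart, so that the later SIGSEGV is not confused with a member that never rejoined. The following is a minimal sketch, not part of the original report. The helper names summarizeMembers and allHealthy are hypothetical, and the sketch assumes the document returned by rs.status() has a members array whose entries carry name, stateStr and health fields.

```javascript
// Sketch only: summarizeMembers and allHealthy are hypothetical helper names.
// They operate on a document shaped like the result of rs.status().

// Return one human-readable line per replica set member.
function summarizeMembers(status) {
  return status.members.map(function (m) {
    return m.name + " " + m.stateStr + " health=" + m.health;
  });
}

// True only if every member reports health 1 and is in one of the three
// states expected for this topology (2 data nodes plus 1 arbiter).
function allHealthy(status) {
  var expected = ["PRIMARY", "SECONDARY", "ARBITER"];
  return status.members.every(function (m) {
    return m.health === 1 && expected.indexOf(m.stateStr) !== -1;
  });
}
```

In the mongo shell, connected to any member, a possible use is:

> summarizeMembers(rs.status()).forEach(function (line) { print(line); })
> allHealthy(rs.status())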

    • Sprint:
      Sharding 2018-02-12, Sharding 2018-02-26, Sharding 2018-03-12, Sharding 2018-03-26, Sharding 2018-04-23

      Description

      The starting point is 3 nodes: 2 data-bearing nodes and 1 arbiter. All nodes are started without the --shardsvr flag. Once the replica set is initialized (rs.initiate plus rs.addArb), it cannot be converted to a replicated shard.

      Once the nodes are restarted with the --shardsvr flag, after a while they access a bad memory address (Invalid access at address: 0x18) and receive SIGSEGV.

      If restarted again, they receive the same signal again after some idle time (roughly 5 minutes).

        Attachments

        1. arbitrer.conf
          0.3 kB
        2. mongo_primary.log
          11 kB
        3. mongo_secondary.log
          7 kB
        4. mongod.conf
          0.3 kB
