Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 3.6.4, 3.7.6
Affects Version/s: 3.6.1
Component/s: Sharding
Labels:
- neweng
Environment:

docker container mongo:3.6.1 (Debian jessie)

https://github.com/docker-library/mongo/blob/657b1a53a9680b972a6344f3d958a17775dd8719/3.6/Dockerfile


Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Steps To Reproduce:
Start 2 data nodes with (config attached):

$ mongod --config /data/mongod.conf --replSet rs

Start the arbiter with:

$ mongod --config /data/arbitrer.conf --replSet rs

Connect to one data node and initiate the replica set:

> rs.initiate({ _id: "rs", members: [{ _id: 1, host: "mongo_replica1:27017" }, { _id: 2, host: "mongo_replica2:27017" }], settings: { getLastErrorDefaults: { w: "majority", wtimeout: 30000 }}})

Connect to the replica set "rs/mongo_replica1:27017,mongo_replica2:27017" and add the arbiter:

> rs.addArb("mongo_arbitrer:27017")

Now follow https://docs.mongodb.com/manual/tutorial/convert-replica-set-to-replicated-shard-cluster/#restart-the-replica-set-as-a-shard

Stop the secondary and run:

$ mongod --config /data/mongod.conf --shardsvr --replSet rs

Stop the arbiter and run:

$ mongod --config /data/arbitrer.conf --shardsvr --replSet rs

Connect to the primary and step it down:

> rs.stepDown()

Restart the old primary with:

$ mongod --config /data/mongod.conf --shardsvr --replSet rs

Everything reconnects. After some minutes of idling (around 5; it is cyclic), one data node receives SIGSEGV and, in cascade, the other data node (but not the arbiter) receives the same SIGSEGV.
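
The two phases of the reproduction differ only in the --shardsvr flag. As a minimal sketch (mongod_cmd is a hypothetical helper, not part of the report; it only prints the command lines and does not start any process), the commands for both phases can be listed side by side:

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the mongod command for a node.
# $1 = config file path
# $2 = "shard" to include --shardsvr, anything else for a plain replica set member
mongod_cmd() {
    conf="$1"
    mode="$2"
    if [ "$mode" = "shard" ]; then
        echo "mongod --config $conf --shardsvr --replSet rs"
    else
        echo "mongod --config $conf --replSet rs"
    fi
}

# Phase 1: plain replica set members (both data nodes use mongod.conf).
mongod_cmd /data/mongod.conf plain
mongod_cmd /data/arbitrer.conf plain

# Phase 2: the same nodes restarted as shard members.
mongod_cmd /data/mongod.conf shard
mongod_cmd /data/arbitrer.conf shard
```

The config paths and the replica set name rs are taken from the steps above; everything else is illustrative.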
Sprint:
Sharding 2018-02-12, Sharding 2018-02-26, Sharding 2018-03-12, Sharding 2018-03-26, Sharding 2018-04-23
Case:
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

The starting point is 3 nodes: 2 data-bearing nodes and 1 arbiter. All nodes are started without the --shardsvr flag. Once the replica set is initialized (initialization + addArb), it cannot be converted to a replicated shard.

Once the nodes are restarted with the --shardsvr flag, after a while they access a bad memory segment (Invalid access at address: 0x18) and receive SIGSEGV.

If restarted again, they keep receiving the same signal after roughly 5 minutes of idling.
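
To check whether a node hit the same crash, its log can be searched for the message quoted above. The following is a minimal sketch, assuming the log contains the text "Invalid access at address" as reported; crashed_with_invalid_access is a hypothetical helper name, and the log file paths are passed as arguments:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given log file contains the
# crash message quoted in this report.
crashed_with_invalid_access() {
    grep -q "Invalid access at address" "$1"
}

# Check every log file given on the command line.
for log in "$@"; do
    if crashed_with_invalid_access "$log"; then
        echo "$log: crash signature found"
    else
        echo "$log: no crash signature"
    fi
done
```

Saved under any name, the script could be run against the attached mongo_primary.log and mongo_secondary.log by passing both paths as arguments.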

arbitrer.conf, 0.3 kB, Jan 12 2018 11:24:32 AM UTC, Gianluca De Cicco
mongo_primary.log, 11 kB, Jan 12 2018 11:24:32 AM UTC, Gianluca De Cicco
mongo_secondary.log, 7 kB, Jan 12 2018 11:24:32 AM UTC, Gianluca De Cicco
mongod.conf, 0.3 kB, Jan 12 2018 11:24:32 AM UTC, Gianluca De Cicco

causes

SERVER-33376 MongoDB 3.6.3-rc0 config server segfaults on startup

Closed

SERVER-34746 Segmentation fault when shard is started with --shardsvr before being added to a shard

Closed

depends on

SERVER-29908 Libraries db/s/sharding and db/query/query are directly cyclic

Closed

is depended on by

TOOLS-2011 Restore sharded cluster testing after SERVER-32677

Closed

is duplicated by

SERVER-32921 Invalid access at address: 0x18

Closed

SERVER-33385 MongoDB 3.6 crashes on Ubuntu 16.04 AWS when using Cold SC1 EBS Volume

Closed

SERVER-34206 All replica nodes crash

Closed

SERVER-34530 Shard server crashes after access violation on Windows v3.7.4-6-g228106a741

Closed

related to

SERVER-71106 Access to Grid members should be protected with isShardingInitialized

Closed

Assignee: Blake Oler
Reporter: Gianluca De Cicco
Participants: Blake Oler, Gianluca De Cicco, Githook User, Gregory McKeon, Kaloian Manassiev, Mark Agarunov
Votes: 0
Watchers: 20

Created: Jan 12 2018 11:27:17 AM UTC
Updated: Oct 30 2023 11:09:21 PM UTC
Resolved: Apr 20 2018 05:36:44 PM UTC
Confidence Status Last Update: 18/Apr/18 7:13 PM