Type: Bug
Resolution: Incomplete
Priority: Major - P3
Affects Version/s: 4.0.10
Component/s: Replication
Labels: None
ALL
Hello!
We're testing MongoDB 4.0.10 failover capabilities, and one of our tests simulates a block-storage failure, as described below.
Replica set nodes:
mongo-test-arb1 MONGOS
mongo-test-arb2 MONGOS
mongo-test-db1 ARBITER
mongo-test-db2 PRIMARY
mongo-test-db3 SECONDARY
mongo-test-db4 SECONDARY
mongo-test-db5 SECONDARY
All nodes have 1 vote and priority 1 except PRIMARY which has priority 10.
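For reference, this priority/vote layout can be checked programmatically. A minimal pymongo sketch (the host/port and the pure helper `summarize_members` are illustrative, not part of our actual tooling):

```python
# Sketch: print each member's priority and votes via replSetGetConfig.
def summarize_members(config_doc):
    """Return (host, priority, votes) tuples from a replSetGetConfig reply."""
    return [(m["host"], m.get("priority", 1), m.get("votes", 1))
            for m in config_doc["config"]["members"]]

def main():
    from pymongo import MongoClient  # driver import kept local to the sketch
    client = MongoClient("mongo-test-db2", 27018)  # host/port from our setup
    for host, priority, votes in summarize_members(
            client.admin.command("replSetGetConfig")):
        print(host, "priority:", priority, "votes:", votes)
```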
Test process:
1. Run a script that writes 10000 documents to the replica set using 1 thread with w:majority. Writes go through one of the mongos instances.
2. While the script is running, run this command on mongo-test-db2 (PRIMARY):
echo 1 > /sys/block/sda/device/delete ; sleep 300 ; echo b > /proc/sysrq-trigger
This causes the primary to fail, and the client (pymongo in our case) receives an exception:
Write results unavailable from mongo-test-db2:27018 :: caused by :: Connection closed by peer
3. Wait for the script to finish. Since it received one exception and is not designed to retry failed writes, 9999 documents should have been written to the replica set at this point; check db.collection.count() to verify.
4. Wait for the primary to restart (300 seconds) and recheck db.collection.count(). With this configuration I usually get counts around 9200-9300, meaning the server rolled back about 700 documents; I can see them in the rollback directory under /var/lib.
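The writer from step 1 looks roughly like this (a sketch; the database/collection names and the mongos address are illustrative assumptions, not our exact script):

```python
# Sketch of the step-1 writer: 10000 single-threaded inserts with
# w:majority through a mongos, stopping at the first write error.
def make_docs(n):
    """Documents to insert; _id doubles as a sequence number."""
    return [{"_id": i, "payload": "x" * 64} for i in range(n)]

def run_writes(collection, docs):
    """Insert docs one by one; return (written, errors). No retries."""
    from pymongo.errors import PyMongoError  # driver import local to the sketch
    written = errors = 0
    for doc in docs:
        try:
            collection.insert_one(doc)
            written += 1
        except PyMongoError as exc:
            print("write failed:", exc)
            errors += 1
            break  # the script is not designed to retry failed writes
    return written, errors

def main():
    from pymongo import MongoClient
    from pymongo.write_concern import WriteConcern
    client = MongoClient("mongo-test-arb1", 27017)  # one of the mongos nodes
    coll = client["test"].get_collection(
        "failover", write_concern=WriteConcern(w="majority"))
    written, errors = run_writes(coll, make_docs(10000))
    print("written:", written, "errors:", errors)
```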
I repeated the test several times and noticed that if I set equal priorities on all nodes, the problem does not occur: when the former primary has priority 1, it rejoins as a secondary and all 9999 documents replicate successfully.
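For the equal-priority variant, the reconfiguration can be done via replSetReconfig; a pymongo sketch (the host/port and the pure helper `equalize_priorities` are illustrative):

```python
# Sketch: set priority 1 on every data-bearing member and apply the new config.
def equalize_priorities(config):
    """Pure transform: priority 1 for non-arbiter members, bump the version."""
    for member in config["members"]:
        if member.get("arbiterOnly"):
            continue  # leave the arbiter's settings alone
        member["priority"] = 1
    config["version"] += 1
    return config

def main():
    from pymongo import MongoClient  # driver import kept local to the sketch
    client = MongoClient("mongo-test-db2", 27018)  # current primary
    cfg = client.admin.command("replSetGetConfig")["config"]
    client.admin.command("replSetReconfig", equalize_priorities(cfg))
```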
Can someone explain this replica-set behavior? Is it a bug?