Type: Bug
Resolution: Incomplete
Priority: Major - P3
Affects Version/s: 4.0.10
Component/s: Replication
Labels: None
ALL
Hello!
We're testing MongoDB 4.0.10 failover capabilities, and one of our tests simulates a block-storage failure, as described below.
Replica set nodes:
mongo-test-arb1 MONGOS
mongo-test-arb2 MONGOS
mongo-test-db1 ARBITER
mongo-test-db2 PRIMARY
mongo-test-db3 SECONDARY
mongo-test-db4 SECONDARY
mongo-test-db5 SECONDARY
All nodes have 1 vote and priority 1 except PRIMARY which has priority 10.
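For reference, this priority/vote layout can be checked programmatically. A minimal pymongo sketch (the host/port and the pure helper `summarize_members` are illustrative, not part of our actual tooling):

```python
# Sketch: print each member's priority and votes via replSetGetConfig.
def summarize_members(config_doc):
    """Return (host, priority, votes) tuples from a replSetGetConfig reply."""
    return [(m["host"], m.get("priority", 1), m.get("votes", 1))
            for m in config_doc["config"]["members"]]

def main():
    from pymongo import MongoClient  # driver import kept local to the sketch
    client = MongoClient("mongo-test-db2", 27018)  # host/port from our setup
    for host, priority, votes in summarize_members(
            client.admin.command("replSetGetConfig")):
        print(host, "priority:", priority, "votes:", votes)
```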
Test process:
1. Run a script that writes 10000 documents to the replica set using 1 thread with w:majority. Writes go through one of the mongos instances.
2. While the script is running, run this command on mongo-test-db2 (PRIMARY):
echo 1 > /sys/block/sda/device/delete ; sleep 300 ; echo b > /proc/sysrq-trigger
This causes the primary to fail, and the client (pymongo in our case) receives an exception:
Write results unavailable from mongo-test-db2:27018 :: caused by :: Connection closed by peer
3. Wait for the script to finish. Since it received one exception and is not designed to retry failed writes, 9999 documents should have been written to the replica set at this point; check db.collection.count() to verify.
4. Wait for the primary to restart (300 seconds) and recheck db.collection.count(). With this configuration I usually get counts around 9200-9300, meaning the server rolled back about 700 documents; I can see them in the rollback directory under /var/lib.
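The writer from step 1 looks roughly like this (a sketch; the database/collection names and the mongos address are illustrative assumptions, not our exact script):

```python
# Sketch of the step-1 writer: 10000 single-threaded inserts with
# w:majority through a mongos, stopping at the first write error.
def make_docs(n):
    """Documents to insert; _id doubles as a sequence number."""
    return [{"_id": i, "payload": "x" * 64} for i in range(n)]

def run_writes(collection, docs):
    """Insert docs one by one; return (written, errors). No retries."""
    from pymongo.errors import PyMongoError  # driver import local to the sketch
    written = errors = 0
    for doc in docs:
        try:
            collection.insert_one(doc)
            written += 1
        except PyMongoError as exc:
            print("write failed:", exc)
            errors += 1
            break  # the script is not designed to retry failed writes
    return written, errors

def main():
    from pymongo import MongoClient
    from pymongo.write_concern import WriteConcern
    client = MongoClient("mongo-test-arb1", 27017)  # one of the mongos nodes
    coll = client["test"].get_collection(
        "failover", write_concern=WriteConcern(w="majority"))
    written, errors = run_writes(coll, make_docs(10000))
    print("written:", written, "errors:", errors)
```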
I repeated the test several times and noticed that if I set equal priorities on all nodes, the problem does not occur: when the former primary has priority 1, it rejoins as a secondary and all 9999 documents replicate successfully.
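For the equal-priority variant, the reconfiguration can be done via replSetReconfig; a pymongo sketch (the host/port and the pure helper `equalize_priorities` are illustrative):

```python
# Sketch: set priority 1 on every data-bearing member and apply the new config.
def equalize_priorities(config):
    """Pure transform: priority 1 for non-arbiter members, bump the version."""
    for member in config["members"]:
        if member.get("arbiterOnly"):
            continue  # leave the arbiter's settings alone
        member["priority"] = 1
    config["version"] += 1
    return config

def main():
    from pymongo import MongoClient  # driver import kept local to the sketch
    client = MongoClient("mongo-test-db2", 27018)  # current primary
    cfg = client.admin.command("replSetGetConfig")["config"]
    client.admin.command("replSetReconfig", equalize_priorities(cfg))
```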
Can someone explain this replica-set behavior? Is it a bug?