Core Server / SERVER-41924

Rollback occurred when higher-priority PRIMARY rejoined replica set after storage failure

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 4.0.10
    • Component/s: Replication
    • Labels: None

      Hello!

      We're testing the failover capabilities of MongoDB 4.0.10, and one of our tests simulates a block storage failure, as described below.

      Replica set nodes:
      mongo-test-arb1 MONGOS
      mongo-test-arb2 MONGOS
      mongo-test-db1 ARBITER
      mongo-test-db2 PRIMARY
      mongo-test-db3 SECONDARY
      mongo-test-db4 SECONDARY
      mongo-test-db5 SECONDARY
      All nodes have 1 vote and priority 1, except the PRIMARY, which has priority 10.
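
      For readers who want to reproduce the topology, the member list above could be written as a replica set config document roughly like the sketch below. The set name "rs-test" and the use of port 27018 on every host are assumptions (the report only shows 27018 for mongo-test-db2), and the arbiter's priority is deliberately left to the server default.

```python
# Sketch of the replica set member list described in the report.
# Assumptions: the set name "rs-test" and port 27018 on every host are
# placeholders; the report only shows port 27018 for mongo-test-db2.
DATA_HOSTS = ["mongo-test-db2", "mongo-test-db3", "mongo-test-db4", "mongo-test-db5"]
ARBITER_HOST = "mongo-test-db1"
PREFERRED_PRIMARY = "mongo-test-db2"


def build_config(primary_priority=10, set_name="rs-test", port=27018):
    """Return a replica set config document as a plain dict."""
    members = [{
        "_id": 0,
        "host": f"{ARBITER_HOST}:{port}",
        "arbiterOnly": True,  # votes but holds no data; priority is not set here
        "votes": 1,
    }]
    for i, host in enumerate(DATA_HOSTS, start=1):
        members.append({
            "_id": i,
            "host": f"{host}:{port}",
            "votes": 1,
            # Only the preferred primary gets the elevated priority.
            "priority": primary_priority if host == PREFERRED_PRIMARY else 1,
        })
    return {"_id": set_name, "members": members}
```

      Calling build_config(primary_priority=1) gives the equal-priority variant of the same topology.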

      Test process:
      1. Run a script that writes 10000 documents to the replica set using 1 thread with w:majority. Writes go through one of the mongos instances.
      2. While script is running, run this command on mongo-test-db2 (PRIMARY):

      echo 1 > /sys/block/sda/device/delete ; sleep 300 ; echo b > /proc/sysrq-trigger

      This causes the primary to fail, and the client (pymongo in our case) receives an exception:

      Write results unavailable from mongo-test-db2:27018 :: caused by :: Connection closed by peer

      3. Wait for the script to finish. Since it received 1 exception and is not designed to retry failed writes, we should have 9999 documents written to the replica set at this point, so check db.collection.count() to confirm.
      4. Wait for the primary to restart (300 seconds) and recheck db.collection.count(). With this configuration I usually get a count of about 9200-9300. This means the server rolls back about 700 documents, and I can see them in the rollback directory under /var/lib.
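
      The test steps above can be sketched in Python as follows. This is not the reporter's script, only an illustration under assumptions: the mongos URI, database name "test" and collection name "failover" are hypothetical, and documents are given sequential _id values so that acknowledged writes can be compared with what is still present after the old primary rejoins.

```python
def run_writes(insert_one, total=10000):
    """Insert documents with _id 0..total-1 once each, never retrying.

    Returns (acked, failed): the _ids whose insert returned normally and
    the _ids whose insert raised.
    """
    acked, failed = [], []
    for i in range(total):
        try:
            insert_one({"_id": i})
            acked.append(i)
        except Exception:  # kept broad so the sketch also works with a stub
            failed.append(i)
    return acked, failed


def missing_acked(acked, present_ids):
    """Return acknowledged _ids that are no longer in the collection."""
    present = set(present_ids)
    return [i for i in acked if i not in present]


def open_collection(uri="mongodb://mongo-test-arb1:27017"):
    """Open the test collection through a mongos with w:majority.

    The URI, database and collection names are hypothetical placeholders.
    """
    from pymongo import MongoClient, WriteConcern  # imported only when used
    client = MongoClient(uri)
    return client["test"].get_collection(
        "failover", write_concern=WriteConcern(w="majority"))


def check_after_rejoin(coll, acked):
    """Run once the old primary has restarted and rejoined the set."""
    present = (doc["_id"] for doc in coll.find({}, {"_id": 1}))
    return missing_acked(acked, present)
```

      In a live run, acked and failed would come from run_writes(open_collection().insert_one), and check_after_rejoin would be called after the 300-second wait. A non-empty result lists exactly which acknowledged documents are gone, which is more precise than comparing two count() values.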

      I repeated the test several times and noticed that the problem does not occur if I set equal priorities on all nodes. When the primary has priority 1, it rejoins as a secondary and successfully replicates all 9999 documents.

      Can someone explain this replica set behavior? Is it a bug?

            Assignee: daniel.hatcher@mongodb.com Danny Hatcher (Inactive)
            Reporter: aanodin@gmail.com Alexander A
            Votes: 1
            Watchers: 6
