Details
-
Bug
-
Resolution: Done
-
Critical - P2
-
None
-
3.0.4
-
None
-
ALL
-
Description
We have a 4-member replica set that we were in the process of upgrading from 2.6.8 to 3.0.4, and we had to roll back due to reliability issues.
After the switch, the former primary would be unresponsive for 5-10 minutes and would not always recover. In one case, MongoDB had to be restarted.
Before event -> a01 (p), a02 (s), m01 (s), m02 (s)
After event -> a02 (s), m01 (s), m02 (p)
NOTE: one of the replica set members, a02, is still on 2.6.8; the rest are on 3.0.4.
This was quite reproducible and occurred several times on different primaries. We have been running 2.6.8 for over a year with no such issues.
We are using the MMAPv1 storage engine. Our goal was to complete the migration to WiredTiger, but this upgrade was the first step in that migration.
We are considering upgrading to 3.0.6, but we wanted to get some insight into the issue before going down that path.