- Type: Bug
- Resolution: Incomplete
- Priority: Major - P3
- Affects Version/s: 2.4.6
- Component/s: Replication
We have a setup with a single 3-member replica set. During a series of successive restarts (triggered by our deployment infrastructure), both secondaries at some point went down with:
ERROR: writer worker caught exception: E11000 duplicate key error index ...
I am attaching the logs of the 3 replicas from around the event. LOG2 and LOG3 are from the crashing secondaries; LOG1 is from the original primary, which kept running.
The data referenced there should already have been in the system. There could, however, have been an insert in progress at that time, which should simply have produced a duplicate key error on the client side.
After a while we restarted the 3 replicas manually and they came up properly. We checked whether actual duplicate data was present, but that wasn't the case.
As can be seen from the logs, this involves a multikey unique index.
This looks like it may be a still-present corner case of SERVER-6671.
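For anyone wanting to repeat the duplicate check described above, here is a minimal sketch of the idea, not the exact procedure we ran. It assumes the documents have already been fetched into a list of dicts, and it uses a hypothetical array field named `tags` as a stand-in for the real index key (which is elided in the error message). A multikey index holds one entry per array element, so a unique violation means the same element value appears in two different documents.

```python
from collections import defaultdict


def find_multikey_duplicates(docs, field):
    """Return {key: [doc ids]} for index keys claimed by more than one document.

    `field` is a hypothetical name for the uniquely indexed field.
    Simplifications: documents lacking the field are skipped, and the
    indexed values are assumed to be hashable scalars.
    """
    owners = defaultdict(set)
    for doc in docs:
        if field not in doc:
            continue
        value = doc[field]
        # A multikey index stores one entry per array element;
        # a scalar value contributes a single key.
        keys = value if isinstance(value, list) else [value]
        for key in keys:
            # A set of ids means a value repeated inside one document's
            # array is not counted as a conflict.
            owners[key].add(doc["_id"])
    return {key: sorted(ids) for key, ids in owners.items() if len(ids) > 1}


sample = [
    {"_id": 1, "tags": ["a", "b"]},
    {"_id": 2, "tags": ["b", "c"]},
    {"_id": 3, "tags": "d"},
]
duplicates = find_multikey_duplicates(sample, "tags")
```

An empty result corresponds to what we observed: no actual duplicate data. Against a live deployment the input list could be built from a driver query that projects only the indexed field, run on each member separately.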