transient but crashing ERROR: writer worker caught exception: E11000 duplicate key error index situation


    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 2.4.6
    • Component/s: Replication
    • ALL
    • 3

      We have a setup with one 3-member replica set. During a series of successive restarts (triggered by our deployment infrastructure), both secondaries went down at some point with:

      ERROR: writer worker caught exception: E11000 duplicate key error index ...

      Attaching the logs of the 3 replicas around the event. LOG2 and LOG3 are from the crashing secondaries; LOG1 is from the original primary, which kept running.

      The data referenced there should already have been in the system. There could, however, have been an insert in progress at that time, which should have just produced a duplicate key error on the client side.

      After a while we restarted the 3 replicas manually and they came up properly. We checked whether actual duplicate data was present, but that wasn't the case.

      As can be seen from the logs, this involves a multikey unique index.
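
      For illustration only, a check for genuinely duplicated keys under a multikey unique index could be sketched as below. This is a minimal in-memory sketch in plain Python, not the check that was actually run and not a query against the server; the field name `tags`, the sample documents, and the helper `find_duplicate_keys` are all hypothetical. It assumes each array element acts as a separate index key and that a key violates uniqueness only when it is shared by two different documents. Documents that lack the field are simply skipped here, which is a simplification.

```python
from collections import defaultdict


def find_duplicate_keys(docs, field):
    """Return {value: [ids]} for values of `field` shared by more than one document.

    Hypothetical helper: models a multikey unique index, where every element
    of an array field is its own key. A value repeated inside a single
    document's array is not counted as a duplicate.
    """
    owners = defaultdict(set)
    for doc in docs:
        values = doc.get(field, [])  # simplification: a missing field yields no keys
        if not isinstance(values, list):
            values = [values]  # a scalar field contributes a single key
        for value in values:
            owners[value].add(doc["_id"])
    return {value: sorted(ids) for value, ids in owners.items() if len(ids) > 1}


# Hypothetical sample data: documents 1 and 3 both contain the key "b".
docs = [
    {"_id": 1, "tags": ["a", "b"]},
    {"_id": 2, "tags": ["c"]},
    {"_id": 3, "tags": ["b", "d"]},
]
duplicates = find_duplicate_keys(docs, "tags")
print(duplicates)
```

      An empty result from a check of this kind would be consistent with the observation above that no actual duplicate data was present after the restart.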

      This may be a still-present corner case of SERVER-6671.

        1. LOG1
          1.61 MB
          Samuele Pedroni
        2. LOG2
          68 kB
          Samuele Pedroni
        3. LOG3
          20 kB
          Samuele Pedroni

            Assignee:
            Ramon Fernandez Marina
            Reporter:
            Samuele Pedroni
            Votes:
            0
            Watchers:
            4
