Core Server / SERVER-29152

Segfault in multiple shard primaries under regular load

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.2.13
    • Fix Version/s: 3.2.14, 3.4.5, 3.5.9
    • Component/s: Networking
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.4, v3.2

      Description

      Our database is divided into 4 shards, each with one primary, one secondary, and one arbiter. The primaries are r4.2xlarge instances on AWS EC2, and the secondaries are r4.xlarge.

      Our workload is intensive in both reads and writes, but these servers usually handle the load without a problem. However, during regular operation, the primaries of 3 of the 4 shards suddenly crashed within a very short time of each other. We don't know what could have caused this.

      Attached are the logs of the segfaults from the primary servers. The one from shard1 seems different from the other two.

        Attachments

        1. shard0-primary.txt (182 kB)
        2. shard1-primary.txt (4 kB)
        3. shard2-primary.txt (272 kB)

              People

              • Votes: 2
              • Watchers: 15
