Core Server / SERVER-22482

Cache growing to 100% followed by crash

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: WiredTiger
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      Start primary shard server during a busy day and wait a couple of hours

      Description

      The primary server on my primary shard has crashed repeatedly in production. Mongostat reports the cache used% growing to 100% (sometimes 101%) and the dirty% to over 90%. Once this happens, it is only a matter of time until the server crashes. Memory size and res do not grow to a level that would explain the crash as an out-of-memory condition.
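
For readers who want to track the same two figures without mongostat, here is a minimal sketch of the arithmetic. It assumes the used% and dirty% columns are ratios against the configured cache size, and that the statistic names match the `wiredTiger.cache` section of `serverStatus`; both are assumptions to verify against your own server's output. The function name and the sample numbers are hypothetical.

```python
def cache_percentages(cache):
    """Return (used_pct, dirty_pct) from a dict of WiredTiger cache statistics.

    The key names below are assumed to match the wiredTiger.cache section
    of serverStatus; adjust them if your server reports different names.
    """
    max_bytes = cache["maximum bytes configured"]
    used_bytes = cache["bytes currently in the cache"]
    dirty_bytes = cache["tracked dirty bytes in the cache"]
    used_pct = 100.0 * used_bytes / max_bytes
    dirty_pct = 100.0 * dirty_bytes / max_bytes
    return used_pct, dirty_pct


# Hypothetical sample values chosen to resemble the situation described above.
sample = {
    "maximum bytes configured": 1_000_000_000,
    "bytes currently in the cache": 1_010_000_000,
    "tracked dirty bytes in the cache": 910_000_000,
}
used, dirty = cache_percentages(sample)
print(f"used {used:.0f}%  dirty {dirty:.0f}%")
```

A used figure above 100% is possible under this definition because the cache can briefly hold more bytes than the configured maximum.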

      Opening the log file for either server (this happened to both the promoted secondary and the primary), I find thousands of lines with these error messages:

      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5ECA4877FFFFEB27DF76BB90446AE2462) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402C0) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F308EB0442B6EA52) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F3082E2A7DC42278800001464CDE2D0678800001464CDE2D060442B6EA52) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F308877FFFFEB9B321D2F90442B6EA52) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402E8) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F478EB04445670AA) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F4782D317A2E7880000149E9C7C8D27880000149E9C7C8D204445670AA) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F478877FFFFEB61638372D04445670AA) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 14040300) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEAD520BCC4D0445D35EFA) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2D27E306788000014BFD075515788000014BFD07551504453ACEF2) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB06628460F044880E17A) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 14040308) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEAD5271F85C04475D1322) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2E013EB10478800001497BF3CA7378800001497BF3CA73044431CE52) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.870+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FAEB0442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B41A), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB1AD52F4D804475D1322) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2EC2DA5340788000014E52AD0B27788000014E52AD0B2704475D1322) is less than the previous key (2E4B485BC82E01DE84C27880000152AA3D0BAB7880000152AA3D0BAB044B8272B2), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB9156C2AD90442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4EA), which is a bug.
      

      The end of the log has no crash information at all.
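
To get a sense of how many of these messages each connection produced, a rough sketch like the following could be run over the log. The regular expression is derived only from the lines quoted above. The byte-wise comparison of the two hex keys is an assumption about how the keys are ordered, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern built from the message format quoted in this report.
PATTERN = re.compile(
    r"\[(?P<conn>conn\d+)\] WTIndex::updatePosition -- the new key "
    r"\(\s*(?P<new>[0-9A-Fa-f]+)\) is less than the previous key "
    r"\((?P<prev>[0-9A-Fa-f]+)\)"
)


def summarize(lines):
    """Count updatePosition messages per connection.

    Also counts how many messages have a new key that really is smaller than
    the previous key when both are compared as raw bytes (an assumption about
    the key ordering, not something the log itself states).
    """
    per_conn = Counter()
    confirmed = 0
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # unrelated log line
        per_conn[match.group("conn")] += 1
        new_key = bytes.fromhex(match.group("new"))
        prev_key = bytes.fromhex(match.group("prev"))
        if new_key < prev_key:
            confirmed += 1
    return per_conn, confirmed
```

Passing an open file handle for the attached log (for example `shard2-crash.log`) as `lines` would return the per-connection counts together with the number of messages whose keys compare as the message claims.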

        Attachments

        1. diagnostics.zip
          87.55 MB
        2. ftdc.png
          194 kB
        3. shard2-crash.log
          17 kB

              People

              Assignee:
              kelsey.schubert Kelsey T Schubert
              Reporter:
              miketempleman Mike Templeman
              Participants:
              Votes:
              0
              Watchers:
              7

                Dates

                Created:
                Updated:
                Resolved: