Core Server / SERVER-22482

Cache growing to 100% followed by crash

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 3.2.1
    • Component/s: WiredTiger
    • Labels: None
    • Operating System: ALL
    • Steps to Reproduce: Start the primary shard server during a busy day and wait a couple of hours.

    Description

      The primary server on my primary shard has crashed repeatedly in production. mongostat reports the used% of the cache growing to 100% (and sometimes 101%) and the dirty% to over 90%. Once this happens, it is only a matter of time until the server crashes. Memory size and res do not grow enough to suggest that the server is crashing because it is out of memory.
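      The two percentages can also be tracked without mongostat. The sketch below assumes they are derived from the `wiredTiger.cache` section of `serverStatus`, with both values taken as a fraction of "maximum bytes configured". The sample numbers are hypothetical and are not taken from the attached diagnostics.

```python
def cache_percentages(cache_stats):
    """Return (used_pct, dirty_pct) for a serverStatus()['wiredTiger']['cache'] dict.

    Assumption: both percentages are relative to the configured cache size.
    """
    max_bytes = cache_stats["maximum bytes configured"]
    used = cache_stats["bytes currently in the cache"]
    dirty = cache_stats["tracked dirty bytes in the cache"]
    return 100.0 * used / max_bytes, 100.0 * dirty / max_bytes


# Hypothetical sample values chosen to mirror the symptoms described above.
sample = {
    "maximum bytes configured": 1000,
    "bytes currently in the cache": 1010,
    "tracked dirty bytes in the cache": 900,
}
used_pct, dirty_pct = cache_percentages(sample)
print(f"used {used_pct:.0f}%  dirty {dirty_pct:.0f}%")
```

      Polling these values at a fixed interval and logging them alongside the timestamps of the errors below would show whether the cache fills before or after the messages start.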

      In the log file for either server (this happened to both the primary and the promoted secondary), I find thousands of lines with this error message:

      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5ECA4877FFFFEB27DF76BB90446AE2462) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402C0) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F308EB0442B6EA52) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F3082E2A7DC42278800001464CDE2D0678800001464CDE2D060442B6EA52) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F308877FFFFEB9B321D2F90442B6EA52) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402E8) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F478EB04445670AA) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F4782D317A2E7880000149E9C7C8D27880000149E9C7C8D204445670AA) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F478877FFFFEB61638372D04445670AA) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 14040300) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEAD520BCC4D0445D35EFA) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2D27E306788000014BFD075515788000014BFD07551504453ACEF2) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB06628460F044880E17A) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 14040308) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEAD5271F85C04475D1322) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2E013EB10478800001497BF3CA7378800001497BF3CA73044431CE52) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.870+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FAEB0442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B41A), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB1AD52F4D804475D1322) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2EC2DA5340788000014E52AD0B27788000014E52AD0B2704475D1322) is less than the previous key (2E4B485BC82E01DE84C27880000152AA3D0BAB7880000152AA3D0BAB044B8272B2), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB9156C2AD90442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4EA), which is a bug.
      

      The end of the log has no crash information at all.
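      With thousands of these lines, a small script helps to summarize them. The sketch below counts the messages per connection and checks whether each reported new key really sorts before the previous key. It assumes the hex strings in the message are the raw key bytes and that keys are ordered bytewise. The two sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Matches the WTIndex::updatePosition message format shown in the log excerpt.
PATTERN = re.compile(
    r"\[(?P<conn>conn\d+)\] WTIndex::updatePosition -- the new key "
    r"\( ?(?P<new>[0-9A-F]+)\) is less than the previous key \((?P<prev>[0-9A-F]+)\)"
)


def summarize(lines):
    """Return (messages per connection, messages where new key < previous key)."""
    counts = Counter()
    confirmed = 0
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue
        counts[m["conn"]] += 1
        # Assumption: keys compare bytewise, so decode the hex and compare bytes.
        if bytes.fromhex(m["new"]) < bytes.fromhex(m["prev"]):
            confirmed += 1
    return counts, confirmed


sample_lines = [
    "2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402C0) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.",
    "2016-02-04T23:46:00.870+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FAEB0442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B41A), which is a bug.",
]
counts, confirmed = summarize(sample_lines)
print(dict(counts), confirmed)
```

      To run it against a full log, pass an open file object instead of `sample_lines`, since `summarize` only iterates over lines.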

      Attachments

        1. shard2-crash.log (17 kB)
        2. ftdc.png (194 kB)
        3. diagnostics.zip (87.55 MB)

          People

            Assignee: Kelsey Schubert (kelsey.schubert@mongodb.com)
            Reporter: Mike Templeman (miketempleman)
            Votes: 0
            Watchers: 7
