  1. Core Server
  2. SERVER-22482

Cache growing to 100% followed by crash

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: WiredTiger
    • Labels: None
    • Operating System: ALL
    • Steps To Reproduce: Start primary shard server during a busy day and wait a couple of hours

    Description

      The primary server on my primary shard has crashed repeatedly in production. mongostat reports the cache used % growing to 100% (sometimes 101%) and the dirty % rising above 90%. Once this happens, it is only a matter of time until the server crashes. Memory size and res do not grow enough to suggest that the server crashes because it is out of memory.
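
For anyone who wants to track the same two numbers outside of mongostat, here is a minimal sketch that derives them from the `wiredTiger.cache` section of `serverStatus` output. It assumes both percentages are taken relative to the configured cache maximum. The function names, the sample values, and the alert thresholds (taken from the figures in this report) are illustrative assumptions, not anything prescribed by MongoDB.

```python
# Sketch only: derive "used %" and "dirty %" from a dict shaped like the
# wiredTiger.cache section of db.serverStatus().
# Assumption: both percentages are relative to the configured cache maximum.

def cache_percentages(cache):
    """Return (used_pct, dirty_pct) for a wiredTiger.cache-style dict."""
    max_bytes = cache["maximum bytes configured"]
    used_bytes = cache["bytes currently in the cache"]
    dirty_bytes = cache["tracked dirty bytes in the cache"]
    return 100.0 * used_bytes / max_bytes, 100.0 * dirty_bytes / max_bytes


def is_under_cache_pressure(cache, used_limit=100.0, dirty_limit=90.0):
    """Hypothetical check: True when both values reach the levels seen
    in this report (used at or above 100%, dirty above 90%)."""
    used_pct, dirty_pct = cache_percentages(cache)
    return used_pct >= used_limit and dirty_pct > dirty_limit


# Made-up sample values, chosen to mirror the 101% / 91% situation.
sample = {
    "maximum bytes configured": 1000,
    "bytes currently in the cache": 1010,
    "tracked dirty bytes in the cache": 910,
}

if __name__ == "__main__":
    print(cache_percentages(sample))
    print(is_under_cache_pressure(sample))
```

In practice the dict would come from a live `serverStatus` call. It is hard-coded here so the sketch runs without a server.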

      Opening the log file for either server (this happened to both the promoted secondary and the primary), I find thousands of lines with these error messages:

      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5ECA4877FFFFEB27DF76BB90446AE2462) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402C0) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F308EB0442B6EA52) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F3082E2A7DC42278800001464CDE2D0678800001464CDE2D060442B6EA52) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F308877FFFFEB9B321D2F90442B6EA52) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402E8) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F478EB04445670AA) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F4782D317A2E7880000149E9C7C8D27880000149E9C7C8D204445670AA) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F478877FFFFEB61638372D04445670AA) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 14040300) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEAD520BCC4D0445D35EFA) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2D27E306788000014BFD075515788000014BFD07551504453ACEF2) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB06628460F044880E17A) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 14040308) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEAD5271F85C04475D1322) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B3EA), which is a bug.
      2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2E013EB10478800001497BF3CA7378800001497BF3CA73044431CE52) is less than the previous key (2E4B485BC82E01D860A07880000152AC3892147880000152AC389214044B83D492), which is a bug.
      2016-02-04T23:46:00.870+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FAEB0442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B41A), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB1AD52F4D804475D1322) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4DA), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FA2EC2DA5340788000014E52AD0B27788000014E52AD0B2704475D1322) is less than the previous key (2E4B485BC82E01DE84C27880000152AA3D0BAB7880000152AA3D0BAB044B8272B2), which is a bug.
      2016-02-04T23:46:00.872+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FA877FFFFEB9156C2AD90442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD53C76DEB044B83D4EA), which is a bug.
      

      The end of the log has no crash information at all.
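
With thousands of these messages, a per-connection summary may be easier to read than the raw log. The sketch below parses `WTIndex::updatePosition` lines like the ones above, decodes the two hex keys, and counts per connection how often the new key really is byte-wise smaller than the previous one. It assumes the hex strings are index keys that compare as raw bytes. The regular expression and helper names are my own, and the two sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Matches the message format seen in this report; the pattern is an assumption
# based on those lines, not an official log grammar.
PATTERN = re.compile(
    r"\[(?P<conn>\w+)\] WTIndex::updatePosition -- the new key "
    r"\(\s*(?P<new>[0-9A-Fa-f]+)\) is less than the previous key "
    r"\(\s*(?P<prev>[0-9A-Fa-f]+)\)"
)


def parse_line(line):
    """Return (connection, new_key_bytes, prev_key_bytes), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return (
        match.group("conn"),
        bytes.fromhex(match.group("new")),
        bytes.fromhex(match.group("prev")),
    )


def summarize(lines):
    """Count messages per (connection, new_key < prev_key) pair."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # unrelated log line
        conn, new_key, prev_key = parsed
        counts[(conn, new_key < prev_key)] += 1
    return counts


SAMPLE_LINES = [
    "2016-02-04T23:46:00.871+0000 I STORAGE  [conn40] WTIndex::updatePosition -- the new key ( 140402C0) is less than the previous key (3C0A3238324B204A4F45203A536F43616C4A4F4542000447951242), which is a bug.",
    "2016-02-04T23:46:00.870+0000 I STORAGE  [conn42] WTIndex::updatePosition -- the new key ( 2E15D5F5FAEB0442EEBC72) is less than the previous key (2E4B485BC8877FFFFEAD526FF19B044B83B41A), which is a bug.",
]

if __name__ == "__main__":
    for (conn, is_less), count in sorted(summarize(SAMPLE_LINES).items()):
        print(conn, is_less, count)
```

To run it over a full log, pass the open file object to `summarize` in place of `SAMPLE_LINES`.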

      Attachments

        1. ftdc.png
          194 kB
          Ramon Fernandez Marina
        2. shard2-crash.log
          17 kB
          Mike Templeman

            People

              Assignee: Kelsey T Schubert (kelsey.schubert@mongodb.com)
              Reporter: Mike Templeman (miketempleman)
              Votes: 0
              Watchers: 7
