  Core Server / SERVER-38358

MongoDB crashing with Fatal Assertion 28558

    • Type: Question
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 3.2.21
    • Component/s: None
    • Labels:
    • Environment: Test

      MongoDB crashed after reading a few thousand records; the stack trace is below. It doesn't seem to be an issue with space left on the device, as 964G of disk space is still available (see the df output after the stack trace). After restarting mongod it worked fine, so what caused this issue, and what measures can we take to prevent it from happening again? (See the free-space check sketched after the df output below.)

       

      2018-11-30T06:12:23.011+0000 W FTDC [ftdc] Uncaught exception in 'FileStreamFailed: Failed to write to interim file buffer for full-time diagnostic data capture: /var/lib/mongo/diagnostic.data/metrics.interim.temp' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.
      2018-11-30T06:12:41.314+0000 E STORAGE [thread1] WiredTiger (28) [1543558361:314723][4237:0x7ff2cbbe8700], file:index-429-232680150939889745.wt, WT_SESSION.checkpoint: /var/lib/mongo/index-429-232680150939889745.wt: handle-write: pwrite: failed to write 4096 bytes at offset 405504: No space left on device
      2018-11-30T06:12:41.314+0000 E STORAGE [thread1] WiredTiger (28) [1543558361:314771][4237:0x7ff2cbbe8700], file:index-429-232680150939889745.wt, WT_SESSION.checkpoint: index-429-232680150939889745.wt: fatal checkpoint failure: No space left on device
      2018-11-30T06:12:41.314+0000 E STORAGE [thread1] WiredTiger (-31804) [1543558361:314781][4237:0x7ff2cbbe8700], file:index-429-232680150939889745.wt, WT_SESSION.checkpoint: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2018-11-30T06:12:41.314+0000 I - [thread1] Fatal Assertion 28558
      2018-11-30T06:12:41.314+0000 I - [thread1]
      
      ***aborting after fassert() failure
      
      2018-11-30T06:12:41.349+0000 F - [thread1] Got signal: 6 (Aborted).
      
      0x133e5c2 0x133d4e9 0x133dcf2 0x7ff2d80b3100 0x7ff2d7d175f7 0x7ff2d7d18ce8 0x12ba9e2 0x109b613 0x1ac3438 0x1ac3635 0x1ac3803 0x19e7f71 0x19e394a 0x1a03521 0x1a9d0da 0x1aa45ff 0x1a1c108 0x1ad0280 0x1ad0538 0x1acef4a 0x1ad17da 0x1ad227b 0x1abe250 0x1a3ab2d 0x7ff2d80abdc5 0x7ff2d7dd8c9d
      ----- BEGIN BACKTRACE -----
      {"backtrace":[
      
      {"b":"400000","o":"F3E5C2","s":"_ZN5mongo15printStackTraceERSo"}
      
      ,
      
      {"b":"400000","o":"F3D4E9"}
      
      ,
      
      {"b":"400000","o":"F3DCF2"}
      
      ,
      
      {"b":"7FF2D80A4000","o":"F100"}
      
      ,
      
      {"b":"7FF2D7CE2000","o":"355F7","s":"gsignal"}
      
      ,
      
      {"b":"7FF2D7CE2000","o":"36CE8","s":"abort"}
      
      ,
      
      {"b":"400000","o":"EBA9E2","s":"_ZN5mongo13fassertFailedEi"}
      
      ,
      
      {"b":"400000","o":"C9B613"}
      
      ,
      
      {"b":"400000","o":"16C3438","s":"__wt_eventv"}
      
      ,
      
      {"b":"400000","o":"16C3635","s":"__wt_err"}
      
      ,
      
      {"b":"400000","o":"16C3803","s":"__wt_panic"}
      
      ,
      
      {"b":"400000","o":"15E7F71","s":"__wt_block_panic"}
      
      ,
      
      {"b":"400000","o":"15E394A","s":"__wt_block_checkpoint"}
      
      ,
      
      {"b":"400000","o":"1603521","s":"__wt_bt_write"}
      
      ,
      
      {"b":"400000","o":"169D0DA"}
      
      ,
      
      {"b":"400000","o":"16A45FF","s":"__wt_reconcile"}
      
      ,
      
      {"b":"400000","o":"161C108","s":"__wt_cache_op"}
      
      ,
      
      {"b":"400000","o":"16D0280"}
      
      ,
      
      {"b":"400000","o":"16D0538"}
      
      ,
      
      {"b":"400000","o":"16CEF4A"}
      
      ,
      
      {"b":"400000","o":"16D17DA"}
      
      ,
      
      {"b":"400000","o":"16D227B","s":"__wt_txn_checkpoint"}
      
      ,
      
      {"b":"400000","o":"16BE250"}
      
      ,
      
      {"b":"400000","o":"163AB2D"}
      
      ,
      
      {"b":"7FF2D80A4000","o":"7DC5"}
      
      ,
      
      {"b":"7FF2D7CE2000","o":"F6C9D","s":"clone"}
      
      ],"processInfo":{ "mongodbVersion" : "3.2.21", "gitVersion" : "1ab1010737145ba3761318508ff65ba74dfe8155", "compiledModules" : [], "uname" :
      
      { "sysname" : "Linux", "release" : "4.4.23-31.54.amzn1.x86_64", "version" : "#1 SMP Tue Oct 18 22:02:09 UTC 2016", "machine" : "x86_64" }
      
      , "somap" : [ { "elfType" : 2, "b" : "400000", 2018-11-30T22:49:47.063+0000 I CONTROL [main] ***** SERVER RESTARTED *****
      2018-11-30T22:49:47.069+0000 I CONTROL [initandlisten] MongoDB starting : pid=6278 port=27017 dbpath=/var/lib/mongo 64-bit host=ip-192-168-200-110
      

       

      Output of df -H:

      Filesystem   Size  Used  Avail  Use%  Mounted on
      devtmpfs      65G   66k    65G    1%  /dev
      tmpfs         65G     0    65G    0%  /dev/shm
      /dev/xvda1   317G  250G    68G   79%  /
      /dev/xvdb    4.3T  3.1T   964G   76%  /data

            Assignee: backlog-server-triage ([HELP ONLY] Backlog - Triage Team)
            Reporter: josephvarghesep (Joseph Varghese)
            Votes: 0
            Watchers: 6