[SERVER-23441] MongoDB crashing with Fatal Assertion 28558 Created: 31/Mar/16  Updated: 27/Mar/17  Resolved: 06/May/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Richard Burton Assignee: Kelsey Schubert
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Participants:

 Description   

mongod was started with just mongod --dbpath . The client paginates over a collection 200 records at a time and issues an update after each record is processed.

The drive is 2TB in size, so space shouldn't be an issue unless another path is being used.

Here's the full stack trace from the console:

2016-03-30T20:13:32.438-0400 E STORAGE  [thread1] WiredTiger (5) [1459383212:438585][92449:0x10c62b000], file:collection-2-2055287105431075722.wt, WT_SESSION.checkpoint: collection-2-2055287105431075722.wt write error: failed to write 4096 bytes at offset 1499770880: Input/output error
2016-03-30T20:13:32.438-0400 E STORAGE  [thread1] WiredTiger (5) [1459383212:438843][92449:0x10c62b000], checkpoint-server: checkpoint server error: Input/output error
2016-03-30T20:13:32.438-0400 E STORAGE  [thread1] WiredTiger (-31804) [1459383212:438914][92449:0x10c62b000], checkpoint-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2016-03-30T20:13:32.438-0400 I -        [thread1] Fatal Assertion 28558
2016-03-30T20:13:32.438-0400 I -        [thread1]
 
***aborting after fassert() failure
 
 
2016-03-30T20:13:32.473-0400 F -        [thread1] Got signal: 6 (Abort trap: 6).
 
 0x107097f5a 0x1070978df 0x7fff92b16f1a 0x7fff88764b1d 0x7fff91e7b9ab 0x107037d4a 0x106e6a576 0x10781e591 0x10781e6d9 0x10781ec84 0x1077a9add 0x7fff92ddf05a 0x7fff92ddefd7 0x7fff92ddc3ed
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"106799000","o":"8FEF5A","s":"_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE"},{"b":"106799000","o":"8FE8DF","s":"_ZN5mongo12_GLOBAL__N_110abruptQuitEi"},{"b":"7FFF92B12000","o":"4F1A","s":"_sigtramp"},{"b":"7FFF88762000","o":"2B1D","s":"szone_malloc_should_clear"},{"b":"7FFF91E1E000","o":"5D9AB","s":"abort"},{"b":"106799000","o":"89ED4A","s":"_ZN5mongo13fassertFailedEi"},{"b":"106799000","o":"6D1576","s":"_ZN5mongo12_GLOBAL__N_116mdb_handle_errorEP18__wt_event_handlerP12__wt_sessioniPKc"},{"b":"106799000","o":"1085591","s":"__wt_eventv"},{"b":"106799000","o":"10856D9","s":"__wt_err"},{"b":"106799000","o":"1085C84","s":"__wt_panic"},{"b":"106799000","o":"1010ADD","s":"__ckpt_server"},{"b":"7FFF92DDB000","o":"405A","s":"_pthread_body"},{"b":"7FFF92DDB000","o":"3FD7","s":"_pthread_body"},{"b":"7FFF92DDB000","o":"13ED","s":"thread_start"}],"processInfo":{ "mongodbVersion" : "3.2.4", "gitVersion" : "e2ee9ffcf9f5a94fad76802e28cc978718bb7a30", "compiledModules" : [], "uname" : { "sysname" : "Darwin", "release" : "14.5.0", "version" : "Darwin Kernel Version 14.5.0: Tue Sep  1 21:23:09 PDT 2015; root:xnu-2782.50.1~1/RELEASE_X86_64", "machine" : "x86_64" }, "somap" : [ { "path" : "/usr/local/bin/mongod", "machType" : 2, "b" : "106799000", "vmaddr" : "100000000", "buildId" : "91E664E914AF30CCBF99C991AE7C3828" }, { "path" : "/usr/lib/libSystem.B.dylib", "machType" : 6, "b" : "7FFF8833F000", "vmaddr" : "7FFF83562000", "buildId" : "1866C519C5F33D098C17A8F703664521" }, { "path" : "/usr/lib/libc++.1.dylib", "machType" : 6, "b" : "7FFF8A65A000", "vmaddr" : "7FFF8587D000", "buildId" : "1B9530FD989B3174BB1CBDC159501710" }, { "path" : "/usr/lib/system/libcache.dylib", "machType" : 6, "b" : "7FFF86AA7000", "vmaddr" : "7FFF81CCA000", "buildId" : "45E9A2E799C436B2BEE30C4E11614AD1" }, { "path" : "/usr/lib/system/libcommonCrypto.dylib", "machType" : 6, "b" : "7FFF8AFBF000", "vmaddr" : "7FFF861E2000", "buildId" : 
"E789748DF9A73CFFB31790DF348B1E95" }, { "path" : "/usr/lib/system/libcompiler_rt.dylib", "machType" : 6, "b" : "7FFF91093000", "vmaddr" : "7FFF8C2B6000", "buildId" : "BF8FC133EE103DA69B9092039E28678F" }, { "path" : "/usr/lib/system/libcopyfile.dylib", "machType" : 6, "b" : "7FFF912DA000", "vmaddr" : "7FFF8C4FD000", "buildId" : "0C68D3A6ACDD3EF3991ACC82C32AB836" }, { "path" : "/usr/lib/system/libcorecrypto.dylib", "machType" : 6, "b" : "7FFF8A6AF000", "vmaddr" : "7FFF858D2000", "buildId" : "5779FFA04D9A3AD4B7F2618227621DC8" }, { "path" : "/usr/lib/system/libdispatch.dylib", "machType" : 6, "b" : "7FFF869DE000", "vmaddr" : "7FFF81C01000", "buildId" : "A61E703C784A3698B51375DD12AAD6DC" }, { "path" : "/usr/lib/system/libdyld.dylib", "machType" : 6, "b" : "7FFF84E11000", "vmaddr" : "7FFF80034000", "buildId" : "CFBBE540D5033AFCB5D6644F1E69949B" }, { "path" : "/usr/lib/system/libkeymgr.dylib", "machType" : 6, "b" : "7FFF9125E000", "vmaddr" : "7FFF8C481000", "buildId" : "77845842DE703CC5BD01C3D14227CED5" }, { "path" : "/usr/lib/system/liblaunch.dylib", "machType" : 6, "b" : "7FFF93E45000", "vmaddr" : "7FFF8F068000", "buildId" : "4F81CA3AD2CE3030A89D42F3DAD7BA8F" }, { "path" : "/usr/lib/system/libmacho.dylib", "machType" : 6, "b" : "7FFF8DFD0000", "vmaddr" : "7FFF891F3000", "buildId" : "126CA2EDDE91308F8881B9DAEC3C63B6" }, { "path" : "/usr/lib/system/libquarantine.dylib", "machType" : 6, "b" : "7FFF90B2F000", "vmaddr" : "7FFF8BD52000", "buildId" : "7AF900412768378A925AD83161863642" }, { "path" : "/usr/lib/system/libremovefile.dylib", "machType" : 6, "b" : "7FFF8FC94000", "vmaddr" : "7FFF8AEB7000", "buildId" : "3485B5F46CE83C628DFD8736ED6E8531" }, { "path" : "/usr/lib/system/libsystem_asl.dylib", "machType" : 6, "b" : "7FFF916AA000", "vmaddr" : "7FFF8C8CD000", "buildId" : "F153AC5B0542356E88C820A62CA704E2" }, { "path" : "/usr/lib/system/libsystem_blocks.dylib", "machType" : 6, "b" : "7FFF8A073000", "vmaddr" : "7FFF85296000", "buildId" : "9615D10AFCA73BE4AA1A1B195DACE1A1" }, 
{ "path" : "/usr/lib/system/libsystem_c.dylib", "machType" : 6, "b" : "7FFF91E1E000", "vmaddr" : "7FFF8D041000", "buildId" : "69158EFA827030A1BA024F74A4498147" }, { "path" : "/usr/lib/system/libsystem_configuration.dylib", "machType" : 6, "b" : "7FFF90B32000", "vmaddr" : "7FFF8BD55000", "buildId" : "56F94DCEDBDE36158F07DE6270D9F8BE" }, { "path" : "/usr/lib/system/libsystem_coreservices.dylib", "machType" : 6, "b" : "7FFF8A5AC000", "vmaddr" : "7FFF857CF000", "buildId" : "41B7C5785A5331C8A96FC73E030B0938" }, { "path" : "/usr/lib/system/libsystem_coretls.dylib", "machType" : 6, "b" : "7FFF86791000", "vmaddr" : "7FFF819B4000", "buildId" : "155DA0A92046332EBFA3D7974A51F731" }, { "path" : "/usr/lib/system/libsystem_dnssd.dylib", "machType" : 6, "b" : "7FFF8A590000", "vmaddr" : "7FFF857B3000", "buildId" : "9EC5AF92D0D23BDE92B6D3730D3865C8" }, { "path" : "/usr/lib/system/libsystem_info.dylib", "machType" : 6, "b" : "7FFF948D9000", "vmaddr" : "7FFF8FAFC000", "buildId" : "2E16C4B3A32739579C41143911979A1E" }, { "path" : "/usr/lib/system/libsystem_kernel.dylib", "machType" : 6, "b" : "7FFF87CD1000", "vmaddr" : "7FFF82EF4000", "buildId" : "1EE815DAFF1B3A53AE9BC98BD8177A9D" }, { "path" : "/usr/lib/system/libsystem_m.dylib", "machType" : 6, "b" : "7FFF92DAA000", "vmaddr" : "7FFF8DFCD000", "buildId" : "1E12AB456D9636D0A226F24D9FB0D9D6" }, { "path" : "/usr/lib/system/libsystem_malloc.dylib", "machType" : 6, "b" : "7FFF88762000", "vmaddr" : "7FFF83985000", "buildId" : "DDA8928BCC0D3255BD8A3FEA0982B890" }, { "path" : "/usr/lib/system/libsystem_network.dylib", "machType" : 6, "b" : "7FFF94438000", "vmaddr" : "7FFF8F65B000", "buildId" : "6105C13467223C0AA4CE5E1261E2E1CC" }, { "path" : "/usr/lib/system/libsystem_networkextension.dylib", "machType" : 6, "b" : "7FFF92B1B000", "vmaddr" : "7FFF8DD3E000", "buildId" : "BA58B30B83773B0A8AE34F84021D9D4E" }, { "path" : "/usr/lib/system/libsystem_notify.dylib", "machType" : 6, "b" : "7FFF9109B000", "vmaddr" : "7FFF8C2BE000", "buildId" : 
"61147800F3203DAA850CBADF33855F29" }, { "path" : "/usr/lib/system/libsystem_platform.dylib", "machType" : 6, "b" : "7FFF92B12000", "vmaddr" : "7FFF8DD35000", "buildId" : "64E34079D7123D669CE2418624A5C040" }, { "path" : "/usr/lib/system/libsystem_pthread.dylib", "machType" : 6, "b" : "7FFF92DDB000", "vmaddr" : "7FFF8DFFE000", "buildId" : "ACE90967ECD03251AEEB461E3C6414F7" }, { "path" : "/usr/lib/system/libsystem_sandbox.dylib", "machType" : 6, "b" : "7FFF905F0000", "vmaddr" : "7FFF8B813000", "buildId" : "3F5E973FC70231AC97BC05F5C195683C" }, { "path" : "/usr/lib/system/libsystem_secinit.dylib", "machType" : 6, "b" : "7FFF86AA5000", "vmaddr" : "7FFF81CC8000", "buildId" : "581DAD0F6B633A48B63B917AF799ABAA" }, { "path" : "/usr/lib/system/libsystem_stats.dylib", "machType" : 6, "b" : "7FFF890DF000", "vmaddr" : "7FFF84302000", "buildId" : "D0E968373CF6323DB7116DF6F660E530" }, { "path" : "/usr/lib/system/libsystem_trace.dylib", "machType" : 6, "b" : "7FFF8A04C000", "vmaddr" : "7FFF8526F000", "buildId" : "840F5301B55A307890B9FEFFD6CD741A" }, { "path" : "/usr/lib/system/libunc.dylib", "machType" : 6, "b" : "7FFF8C6EB000", "vmaddr" : "7FFF8790E000", "buildId" : "5676F7EAC1DF329FB006D2C3022B7D70" }, { "path" : "/usr/lib/system/libunwind.dylib", "machType" : 6, "b" : "7FFF913ED000", "vmaddr" : "7FFF8C610000", "buildId" : "BE7E51A0B6EA3A549CCA9D88F683A6D6" }, { "path" : "/usr/lib/system/libxpc.dylib", "machType" : 6, "b" : "7FFF916C9000", "vmaddr" : "7FFF8C8EC000", "buildId" : "5C829202962E37448B5000D38CC88E84" }, { "path" : "/usr/lib/libobjc.A.dylib", "machType" : 6, "b" : "7FFF8AD98000", "vmaddr" : "7FFF85FBB000", "buildId" : "759E155DBC423D4E869B6F57D477177C" }, { "path" : "/usr/lib/libauto.dylib", "machType" : 6, "b" : "7FFF85956000", "vmaddr" : "7FFF80B79000", "buildId" : "A260789BD4D8316A9490254767B8A5F1" }, { "path" : "/usr/lib/libc++abi.dylib", "machType" : 6, "b" : "7FFF84DE5000", "vmaddr" : "7FFF80008000", "buildId" : "88A22A0F87C63002BFBAAC0F2808B8B9" }, { "path" : 
"/usr/lib/libDiagnosticMessagesClient.dylib", "machType" : 6, "b" : "7FFF8AFFE000", "vmaddr" : "7FFF86221000", "buildId" : "2EE8E4365CDC34C599595BA218D507FB" } ] }}
 mongod(_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE+0x3A) [0x107097f5a]
 mongod(_ZN5mongo12_GLOBAL__N_110abruptQuitEi+0xAF) [0x1070978df]
 libsystem_platform.dylib(_sigtramp+0x1A) [0x7fff92b16f1a]
 libsystem_malloc.dylib(szone_malloc_should_clear+0x476) [0x7fff88764b1d]
 libsystem_c.dylib(abort+0x81) [0x7fff91e7b9ab]
 mongod(_ZN5mongo13fassertFailedEi+0x20A) [0x107037d4a]
 mongod(_ZN5mongo12_GLOBAL__N_116mdb_handle_errorEP18__wt_event_handlerP12__wt_sessioniPKc+0x146) [0x106e6a576]
 mongod(__wt_eventv+0x4E1) [0x10781e591]
 mongod(__wt_err+0x99) [0x10781e6d9]
 mongod(__wt_panic+0x24) [0x10781ec84]
 mongod(__ckpt_server+0xDD) [0x1077a9add]
 libsystem_pthread.dylib(_pthread_body+0x83) [0x7fff92ddf05a]
 libsystem_pthread.dylib(_pthread_body+0x0) [0x7fff92ddefd7]
 libsystem_pthread.dylib(thread_start+0xD) [0x7fff92ddc3ed]
-----  END BACKTRACE  -----
2016-03-30T20:13:32.477-0400 I -        [conn1] Fatal Assertion 28559
2016-03-30T20:13:32.499-0400 I -        [conn1]
 
***aborting after fassert() failure
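
For triage, the useful fields in the first WiredTiger error line (file, write size, offset, OS error text) can be pulled out with a short script, and the raw frame addresses in the backtrace can be checked against the "b" (load base) and "o" (offset) values in the JSON. This is only a minimal sketch that assumes the log format shown above; the names parse_write_error and frame_address are hypothetical helpers, not part of MongoDB or WiredTiger.

```python
import re

# Matches the WiredTiger write-error message as it appears in the log above.
# The pattern is an assumption based on that single sample line.
WRITE_ERROR = re.compile(
    r"(?P<file>\S+\.wt) write error: failed to write (?P<size>\d+) bytes "
    r"at offset (?P<offset>\d+): (?P<reason>.+)$"
)


def parse_write_error(line):
    """Return the file, size, offset and reason from a write-error line, or None."""
    match = WRITE_ERROR.search(line)
    if match is None:
        return None
    size = int(match.group("size"))
    offset = int(match.group("offset"))
    return {
        "file": match.group("file"),
        "size": size,
        "offset": offset,
        "reason": match.group("reason").strip(),
        # True when the failed write starts on a multiple of its own size.
        "block_aligned": offset % size == 0,
        # Position of the failed write within the file, in GiB.
        "offset_gib": offset / 2**30,
    }


def frame_address(base_hex, offset_hex):
    """Absolute frame address = module load base ("b") + frame offset ("o")."""
    return int(base_hex, 16) + int(offset_hex, 16)


sample = (
    "2016-03-30T20:13:32.438-0400 E STORAGE  [thread1] WiredTiger (5) "
    "[1459383212:438585][92449:0x10c62b000], "
    "file:collection-2-2055287105431075722.wt, WT_SESSION.checkpoint: "
    "collection-2-2055287105431075722.wt write error: failed to write 4096 "
    "bytes at offset 1499770880: Input/output error"
)

info = parse_write_error(sample)
print(info)
# First backtrace frame: base 106799000, offset 8FEF5A.
print(hex(frame_address("106799000", "8FEF5A")))
```

For the line above, the failed write is a 4096-byte block that starts on a 4096-byte boundary roughly 1.4 GiB into the collection file, and the first frame resolves to 0x107097f5a, the first address listed before the backtrace. The error text "Input/output error" is reported by the operating system for the write itself, which is why the storage layer is the first thing to check.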



 Comments   
Comment by Kelsey Schubert [ 06/May/16 ]

Hi mrburton@gmail.com,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide the details about your storage layer in a comment and we will reopen the ticket.

Regards,
Thomas

Comment by Kelsey Schubert [ 25/Apr/16 ]

Hi mrburton@gmail.com,

We still need the answers to my questions above to diagnose the problem. If this is still an issue for you, can you please provide the requested details about your storage layer?

Thank you,
Thomas

Comment by Kelsey Schubert [ 31/Mar/16 ]

Hi mrburton@gmail.com,

Thank you for reporting this behavior. Please answer the following questions so we can get a better understanding of what is going on here.

  1. Have you checked dmesg for any storage related errors?
  2. Around this operation, were there any other server errors logged?
  3. Can you please post the output of db.collection.validate() on the affected collection?
  4. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?

Kind regards,
Thomas
