[SERVER-11862] Mongo crashes with "Deleted record list corrupted in bucket 11, link number 28, invalid link is 1220333744:3f2218b0, throwing Fatal Assertion" Created: 26/Nov/13  Updated: 11/Jul/16  Resolved: 16/Dec/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Blocker - P1
Reporter: igor lasic Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux mongo 2.4.8


Operating System: ALL
Participants:

 Description   

Mon Nov 25 22:33:06.384 [conn1705] ihm.encounter Deleted record list corrupted in bucket 11, link number 28, invalid link is 1220333744:3f2218b0, throwing Fatal Assertion
Mon Nov 25 22:33:06.384 [conn1705] ihm.encounter Fatal Assertion 16469
0xde05e1 0xda03d3 0xa5f998 0xa60212 0xac5793 0xac7f10 0xa8fbb3 0xa93847 0x9f6b78 0x9fc0f8 0x6e83a8 0xdccbae 0x3a00207851 0x39ffee894d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xde05e1]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xda03d3]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails10__stdAllocEib+0x488) [0xa5f998]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails13allocWillBeAtEPKci+0x32) [0xa60212]
/usr/bin/mongod(_ZN5mongo11DataFileMgr6insertEPKcPKvibbbPb+0x1153) [0xac5793]
/usr/bin/mongod(_ZN5mongo11DataFileMgr12updateRecordEPKcPNS_16NamespaceDetailsEPNS_25NamespaceDetailsTransientEPNS_6RecordERKNS_7DiskLocES2_iRNS_7OpDebugEb+0x460) [0xac7f10]
/usr/bin/mongod(_ZN5mongo14_updateObjectsEbPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEPNS_11RemoveSaverEbRKNS_24QueryPlanSelectionPolicyEb+0x1403) [0xa8fbb3]
/usr/bin/mongod(_ZN5mongo13updateObjectsEPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEbRKNS_24QueryPlanSelectionPolicyE+0xb7) [0xa93847]
/usr/bin/mongod(_ZN5mongo14receivedUpdateERNS_7MessageERNS_5CurOpE+0x4d8) [0x9f6b78]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xac8) [0x9fc0f8]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x98) [0x6e83a8]
/usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xdccbae]
/lib64/libpthread.so.0() [0x3a00207851]
/lib64/libc.so.6(clone+0x6d) [0x39ffee894d]
Mon Nov 25 22:33:06.611 [conn1705]

***aborting after fassert() failure

Mon Nov 25 22:33:06.611 Got signal: 6 (Aborted).

Mon Nov 25 22:33:06.620 Backtrace:
0xde05e1 0x6d0559 0x39ffe32960 0x39ffe328e5 0x39ffe340c5 0xda040e 0xa5f998 0xa60212 0xac5793 0xac7f10 0xa8fbb3 0xa93847 0x9f6b78 0x9fc0f8 0x6e83a8 0xdccbae 0x3a00207851 0x39ffee894d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xde05e1]
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6d0559]
/lib64/libc.so.6() [0x39ffe32960]
/lib64/libc.so.6(gsignal+0x35) [0x39ffe328e5]
/lib64/libc.so.6(abort+0x175) [0x39ffe340c5]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xde) [0xda040e]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails10__stdAllocEib+0x488) [0xa5f998]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails13allocWillBeAtEPKci+0x32) [0xa60212]
/usr/bin/mongod(_ZN5mongo11DataFileMgr6insertEPKcPKvibbbPb+0x1153) [0xac5793]
/usr/bin/mongod(_ZN5mongo11DataFileMgr12updateRecordEPKcPNS_16NamespaceDetailsEPNS_25NamespaceDetailsTransientEPNS_6RecordERKNS_7DiskLocES2_iRNS_7OpDebugEb+0x460) [0xac7f10]
/usr/bin/mongod(_ZN5mongo14_updateObjectsEbPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEPNS_11RemoveSaverEbRKNS_24QueryPlanSelectionPolicyEb+0x1403) [0xa8fbb3]
/usr/bin/mongod(_ZN5mongo13updateObjectsEPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEbRKNS_24QueryPlanSelectionPolicyE+0xb7) [0xa93847]
/usr/bin/mongod(_ZN5mongo14receivedUpdateERNS_7MessageERNS_5CurOpE+0x4d8) [0x9f6b78]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xac8) [0x9fc0f8]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x98) [0x6e83a8]
/usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xdccbae]
/lib64/libpthread.so.0() [0x3a00207851]
/lib64/libc.so.6(clone+0x6d) [0x39ffee894d]
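
For anyone triaging a similar log, the fatal assertion line above packs several fields into one message. The following is a minimal sketch, not part of the original report, that pulls them out with a regular expression. The function name `parse_corruption_line` and the dictionary keys are hypothetical, and the reading of the invalid link as a decimal file number followed by a hexadecimal offset is an assumption about how the server prints disk locations.

```python
import re

# Matches the assertion message exactly as it appears in the log above.
# Assumption: the "invalid link" is printed as <file number>:<hex offset>.
ASSERT_RE = re.compile(
    r"Deleted record list corrupted in bucket (?P<bucket>\d+), "
    r"link number (?P<link>\d+), "
    r"invalid link is (?P<file>-?\d+):(?P<offset>[0-9a-fA-F]+)"
)


def parse_corruption_line(line):
    """Return the fields of a corruption message, or None if the line does not match."""
    m = ASSERT_RE.search(line)
    if m is None:
        return None
    return {
        "bucket": int(m.group("bucket")),
        "link_number": int(m.group("link")),
        "file": int(m.group("file")),
        "offset": int(m.group("offset"), 16),  # offset assumed to be hexadecimal
    }


# The first line of the Description, copied verbatim.
LOG_LINE = (
    "Mon Nov 25 22:33:06.384 [conn1705] ihm.encounter Deleted record list "
    "corrupted in bucket 11, link number 28, invalid link is "
    "1220333744:3f2218b0, throwing Fatal Assertion"
)

info = parse_corruption_line(LOG_LINE)
print(info)
```

Under that assumption, a file number of 1220333744 is far outside any plausible range of data files, which would be consistent with the free list holding garbage rather than a valid pointer.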



 Comments   
Comment by Stennie Steneker (Inactive) [ 16/Dec/13 ]

Hi Igor,

Thanks for confirming the repair was successful. I'm now closing this issue.

Regards,
Stephen

Comment by igor lasic [ 16/Dec/13 ]

The repair action took care of the issues.

Please close.

Comment by Stennie Steneker (Inactive) [ 16/Dec/13 ]

Hi Igor,

Do you have any update on this issue? Were you able to successfully continue operations after repairing the shards?

Regards,
Stephen

Comment by Eliot Horowitz (Inactive) [ 27/Nov/13 ]

It's hard to check for all possible disk corruption without scanning the entire database, so I'm not sure what the best solution is.

Do you have backups from before the SAN problem?

Comment by igor lasic [ 27/Nov/13 ]

Yes, there is a definite possibility the Mongo files were corrupted.

That being said, the db should have detected the corruption on load and not
left me in the lurch.

Comment by Eliot Horowitz (Inactive) [ 27/Nov/13 ]

What kind of SAN failure?
Is it possible the SAN corrupted the mongo files?

Comment by igor lasic [ 27/Nov/13 ]

The log contains a lot of query logging that is too sensitive to share. If I strip that
out, will the log still be useful?

BTW, this started after a SAN failure.

I ran a MongoDB repair on my shards and am rerunning my processing to see if
the crash happens again.

Comment by Eliot Horowitz (Inactive) [ 26/Nov/13 ]

Can you send the mongod log?
Also, can you check the system log for io errors?
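
As an illustration of the system log check requested above, here is a minimal sketch that filters log lines for common storage error messages. It is not from the original thread: the function name `find_io_errors`, the pattern list, and the sample lines are all hypothetical, and the patterns are only a rough starting point, since the exact kernel messages vary by driver and distribution.

```python
import re

# Illustrative patterns only; real kernel and driver messages vary.
IO_ERROR_RE = re.compile(
    r"I/O error|medium error|remounting filesystem read-only",
    re.IGNORECASE,
)


def find_io_errors(lines):
    """Return the lines that look like storage I/O errors, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if IO_ERROR_RE.search(line)]


# Hypothetical sample lines, made up for this example.
sample = [
    "Nov 25 22:10:01 host kernel: end_request: I/O error, dev sdb, sector 12345\n",
    "Nov 25 22:10:02 host crond[123]: (root) CMD (run-parts /etc/cron.hourly)\n",
    "Nov 25 22:10:05 host kernel: EXT4-fs (sdb1): Remounting filesystem read-only\n",
]

hits = find_io_errors(sample)
for hit in hits:
    print(hit)
```

To run it against a real file, pass an open file object instead of the sample list, for example `find_io_errors(open(path))`, where `path` is wherever the distribution keeps its system log.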
