[SERVER-18757] Any attempt to access a few records crashes Mongodb
Created: 31/May/15  Updated: 04/Aug/15  Resolved: 01/Jul/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Homam Hosseini
Assignee: Ramon Fernandez Marina
Resolution: Incomplete
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 14.04
db version v3.0.3

  mongodb.conf:
    dbpath=/one/data
    logpath=/four/log/mongodb/mongodb.log
    logappend=true
    storageEngine=wiredTiger
    journal=true

Operating System: ALL
Participants:

 Description   

There are 129 documents that match this query (the _id range spans 1 second):

db.events.count({_id: {$gt: ObjectId("556a13fa0000000000000000"), $lt: ObjectId("556a13fb0000000000000000")}})

Counting them works, but any other attempt to access or remove these records causes MongoDB to crash. For example:

db.events.find({_id: {$gt: ObjectId("556a13fa0000000000000000"), $lt: ObjectId("556a13fb0000000000000000")}}, {_id: 1})

or

db.events.remove({_id: {$gt: ObjectId("556a13fa0000000000000000"), $lt: ObjectId("556a13fb0000000000000000")}})
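As a side note on the bounds used in the queries above: the first 4 bytes of an ObjectId encode a creation time in seconds since the Unix epoch, which is why two bounds that differ by one in the eighth hex digit cover exactly one second. The following is a minimal Python sketch that decodes the two bounds. The helper name `objectid_timestamp` is made up for illustration and is not part of any MongoDB driver.

```python
from datetime import datetime, timezone


def objectid_timestamp(hex_id: str) -> datetime:
    """Decode the UTC timestamp stored in the first 4 bytes of an ObjectId.

    Hypothetical helper for illustration; drivers ship their own equivalent.
    """
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is written as 24 hex characters")
    seconds = int(hex_id[:8], 16)  # first 8 hex digits = 4-byte timestamp
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The two bounds from the queries in this ticket.
lower = objectid_timestamp("556a13fa0000000000000000")
upper = objectid_timestamp("556a13fb0000000000000000")

print("lower bound:", lower.isoformat())
print("upper bound:", upper.isoformat())
print("window in seconds:", (upper - lower).total_seconds())
```

Running this shows when the 129 affected documents were inserted, which may help when correlating them with entries in the server or system logs.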

And here's the log file:

2015-05-31T10:38:05.570+0400 I CONTROL  [initandlisten] db version v3.0.3
2015-05-31T10:38:05.570+0400 I CONTROL  [initandlisten] git version: b40106b36eecd1b4407eb1ad1af6bc60593c6105
2015-05-31T10:38:05.570+0400 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
2015-05-31T10:38:05.570+0400 I CONTROL  [initandlisten] build info: Linux ip-10-225-179-153 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2015-05-31T10:38:05.570+0400 I CONTROL  [initandlisten] allocator: tcmalloc
2015-05-31T10:38:05.570+0400 I CONTROL  [initandlisten] options: { config: "/etc/mongodb.conf", processManagement: { fork: true }, storage: { dbPath: "/one/data", engine: "wiredTiger", journal: { enabled: true }, wiredTiger: { engineConfig: { configString: "hazard_max=10000" } } }, systemLog: { destination: "file", logAppend: true, path: "/four/log/mongodb/mongodb.log" } }
2015-05-31T10:38:05.571+0400 I NETWORK  [initandlisten] waiting for connections on port 27017
2015-05-31T10:38:22.828+0400 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:37183 #1 (1 connection now open)
2015-05-31T10:39:20.255+0400 E STORAGE  [conn1] WiredTiger (0) [1433054360:255425][918:0x7fe4eb241700], file:collection-7--1927339124377144891.wt, cursor.search: read checksum error [8192B @ 581437317120, 3160180044 != 2605186324]
2015-05-31T10:39:20.255+0400 E STORAGE  [conn1] WiredTiger (0) [1433054360:255469][918:0x7fe4eb241700], file:collection-7--1927339124377144891.wt, cursor.search: collection-7--1927339124377144891.wt: encountered an illegal file format or internal value
2015-05-31T10:39:20.255+0400 E STORAGE  [conn1] WiredTiger (-31804) [1433054360:255476][918:0x7fe4eb241700], file:collection-7--1927339124377144891.wt, cursor.search: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-05-31T10:39:20.255+0400 I -        [conn1] Fatal Assertion 28558
2015-05-31T10:39:20.264+0400 I CONTROL  [conn1] 
 0xf51949 0xef1671 0xed6261 0xd7b2ba 0x13816c9 0x1381885 0x1381d24 0x12d7aa2 0x12f08ee 0x12f52f5 0x12f25f3 0x130cf29 0x12e4808 0x1323c13 0xd68f29 0x9132a0 0xa0507b 0xa01ad0 0xbd26a4 0xbd2a54 0xbd308d 0x9b3f27 0x9b45fb 0x9b4cab 0x9b764d 0x9dadb4 0x9dbd3d 0x9dca4b 0xba0d96 0xab72b0 0x80e88d 0xf04a6b 0x7fe4f2880182 0x7fe4f134847d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B51949"},{"b":"400000","o":"AF1671"},{"b":"400000","o":"AD6261"},{"b":"400000","o":"97B2BA"},{"b":"400000","o":"F816C9"},{"b":"400000","o":"F81885"},{"b":"400000","o":"F81D24"},{"b":"400000","o":"ED7AA2"},{"b":"400000","o":"EF08EE"},{"b":"400000","o":"EF52F5"},{"b":"400000","o":"EF25F3"},{"b":"400000","o":"F0CF29"},{"b":"400000","o":"EE4808"},{"b":"400000","o":"F23C13"},{"b":"400000","o":"968F29"},{"b":"400000","o":"5132A0"},{"b":"400000","o":"60507B"},{"b":"400000","o":"601AD0"},{"b":"400000","o":"7D26A4"},{"b":"400000","o":"7D2A54"},{"b":"400000","o":"7D308D"},{"b":"400000","o":"5B3F27"},{"b":"400000","o":"5B45FB"},{"b":"400000","o":"5B4CAB"},{"b":"400000","o":"5B764D"},{"b":"400000","o":"5DADB4"},{"b":"400000","o":"5DBD3D"},{"b":"400000","o":"5DCA4B"},{"b":"400000","o":"7A0D96"},{"b":"400000","o":"6B72B0"},{"b":"400000","o":"40E88D"},{"b":"400000","o":"B04A6B"},{"b":"7FE4F2878000","o":"8182"},{"b":"7FE4F124E000","o":"FA47D"}],"processInfo":{ "mongodbVersion" : "3.0.3", "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105", "uname" : { "sysname" : "Linux", "release" : "3.16.0-30-generic", "version" : "#40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "F56F80CB96B4DBFC070BEB0ADAC7D6B274BFC6B1" }, { "b" : "7FFFE0BFC000", "elfType" : 3, "buildId" : "C8BA9F3BA421CFBAE75F7E57F357B1B5431DE838" }, { "b" : "7FE4F2878000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE444731BB0461202EF57F7C39542" }, { "b" : "7FE4F261A000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "FF43D0947510134A8A494063A3C1CF3CEBB27791" }, { "b" : "7FE4F223F000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "B927879B878D90DD9FF4B15B00E7799AA8E0272F" }, { "b" : "7FE4F2037000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : 
"92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7FE4F1E33000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7FE4F1B2F000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "19EFDDAB11B3BF5C71570078C59F91CF6592CE9E" }, { "b" : "7FE4F1829000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7FE4F1613000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7FE4F124E000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2B89E576D5AE213C" }, { "b" : "7FE4F2A96000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf51949]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef1671]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xed6261]
 mongod(+0x97B2BA) [0xd7b2ba]
 mongod(__wt_eventv+0x489) [0x13816c9]
 mongod(__wt_err+0x95) [0x1381885]
 mongod(__wt_panic+0x24) [0x1381d24]
 mongod(__wt_bm_read+0x72) [0x12d7aa2]
 mongod(__wt_bt_read+0x7E) [0x12f08ee]
 mongod(__wt_cache_read+0x1C5) [0x12f52f5]
 mongod(__wt_page_in_func+0x403) [0x12f25f3]
 mongod(__wt_row_search+0xA59) [0x130cf29]
 mongod(__wt_btcur_search+0x678) [0x12e4808]
 mongod(+0xF23C13) [0x1323c13]
 mongod(_ZNK5mongo21WiredTigerRecordStore7dataForEPNS_16OperationContextERKNS_8RecordIdE+0x69) [0xd68f29]
 mongod(_ZNK5mongo10Collection6docForEPNS_16OperationContextERKNS_8RecordIdE+0x20) [0x9132a0]
 mongod(_ZN5mongo10FetchStage4workEPm+0x2DB) [0xa0507b]
 mongod(_ZN5mongo11DeleteStage4workEPm+0x70) [0xa01ad0]
 mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbd26a4]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbd2a54]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbd308d]
 mongod(_ZN5mongo18WriteBatchExecutor10execRemoveERKNS_12BatchItemRefEPPNS_16WriteErrorDetailE+0x4A7) [0x9b3f27]
 mongod(_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestERKNS_19WriteConcernOptionsEPSt6vectorIPNS_19BatchedUpsertDetailESaIS9_EEPS7_IPNS_16WriteErrorDetailESaISE_EE+0xCB) [0x9b45fb]
 mongod(_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x37B) [0x9b4cab]
 mongod(_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x15D) [0x9b764d]
 mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x34) [0x9dadb4]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC1D) [0x9dbd3d]
 mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9dca4b]
 mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0x746) [0xba0d96]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xB10) [0xab72b0]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xDD) [0x80e88d]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x34B) [0xf04a6b]
 libpthread.so.0(+0x8182) [0x7fe4f2880182]
 libc.so.6(clone+0x6D) [0x7fe4f134847d]
-----  END BACKTRACE  -----
2015-05-31T10:39:20.264+0400 I -        [conn1] 
 
***aborting after fassert() failure
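
The checksum error in the log above names the affected file, the block size, the byte offset, and the two checksums that did not match. The following is a rough Python sketch for pulling those fields out of such a line, for example to see whether repeated crashes hit the same block. The regular expression is written against the message exactly as it appears in this ticket and is an assumption about its general format; the log does not say which checksum is the stored one and which is the computed one, so the fields are simply called `left` and `right`.

```python
import re

# Pattern matched against the "read checksum error" message as it appears in
# this ticket's log; other WiredTiger versions may format it differently.
CHECKSUM_ERROR = re.compile(
    r"file:(?P<file>[^,]+), .*?read checksum error "
    r"\[(?P<size>\d+)B @ (?P<offset>\d+), (?P<left>\d+) != (?P<right>\d+)\]"
)


def parse_checksum_error(line):
    """Return the fields of a checksum error line as a dict, or None."""
    match = CHECKSUM_ERROR.search(line)
    if match is None:
        return None
    return {
        "file": match.group("file"),
        "size": int(match.group("size")),      # block size in bytes
        "offset": int(match.group("offset")),  # byte offset within the file
        "left": int(match.group("left")),      # first checksum in the message
        "right": int(match.group("right")),    # second checksum in the message
    }


# The error line copied from the log in this ticket.
LOG_LINE = (
    "2015-05-31T10:39:20.255+0400 E STORAGE  [conn1] WiredTiger (0) "
    "[1433054360:255425][918:0x7fe4eb241700], "
    "file:collection-7--1927339124377144891.wt, cursor.search: "
    "read checksum error [8192B @ 581437317120, 3160180044 != 2605186324]"
)

info = parse_checksum_error(LOG_LINE)
print(info)
print("offset in GiB:", info["offset"] / 2**30)
print("offset is a multiple of the block size:", info["offset"] % info["size"] == 0)
```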



 Comments   
Comment by Ramon Fernandez Marina [ 15/Jun/15 ]

homam, we haven't heard back from you for a while. If this is still an issue for you, can you please upload the full server logs? You may also want to back up your database files and then run repairDatabase to see if that clears the specific issue you're seeing.

Thanks,
Ramón.
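
The backup step mentioned above could be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not an official procedure: the function name `backup_dbpath` and the backup location are hypothetical, mongod is assumed to be stopped before copying, and the non-empty `mongod.lock` check is only a heuristic for "mongod is running or was not shut down cleanly".

```python
import shutil
import time
from pathlib import Path


def backup_dbpath(dbpath, backup_root):
    """Copy the whole dbpath into a timestamped directory under backup_root.

    Hypothetical helper: assumes mongod has been stopped beforehand.
    """
    src = Path(dbpath)
    lock = src / "mongod.lock"
    # Heuristic only: a non-empty lock file suggests mongod is still running
    # or was not shut down cleanly, so refuse to copy in that state.
    if lock.exists() and lock.stat().st_size > 0:
        raise RuntimeError("mongod.lock is not empty; stop mongod first")
    dest = Path(backup_root) / f"{src.name}-{time.strftime('%Y%m%dT%H%M%S')}"
    shutil.copytree(src, dest)  # fails if dest already exists
    return dest


# Example call (paths are assumptions; /one/data is the dbpath in this ticket):
# backup_dbpath("/one/data", "/some/backup/volume")
```

Only after the copy has completed and been checked would the repair be attempted against the original files.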

Comment by Ramon Fernandez Marina [ 01/Jun/15 ]

homam, can you also please upload the full logs for the node where you experienced this issue? We're especially interested in knowing whether mongod may not have been shut down cleanly at any point. If the logs are sensitive you can upload them via scp as described above.

Thanks,
Ramón.

Comment by Keith Bostic (Inactive) [ 31/May/15 ]

homam, can you tell us a little bit about the history of this collection?

With what version of MongoDB was it created, and is there anything special about its history?
Specifically, is it compressed in any way?

Thanks,
Keith.

Comment by Ramon Fernandez Marina [ 31/May/15 ]

homam, I've moved your ticket to the SERVER project, as this is an issue with MongoDB and not with stand-alone WiredTiger.

Looks like one of your files got corrupted, and accessing the corrupt records prompts mongod to shut down to avoid any further damage to your data. We'll need some of the affected files to troubleshoot this issue. Can you please upload the following files?

  1. collection-7--1927339124377144891.wt
  2. WiredTiger.turtle
  3. WiredTiger.wt

To upload these files privately you can use scp as follows:

tar czf files.tgz collection-7--1927339124377144891.wt WiredTiger.turtle WiredTiger.wt
scp -P 722 -r files.tgz SERVER-18757@www.mongodb.com:

When prompted for a password just press enter.

This issue may also have been caused by a flaky disk, so please check your system logs and/or disks' SMART status (if applicable) for error messages about the storage layer. We'll take a look at the files you upload to determine if there's a bug in mongod that may have triggered this issue.

Thanks,
Ramón.
