[SERVER-26889] Incorrect memory access on 3.0.13 triggers segmentation fault Created: 02/Nov/16  Updated: 20/Jan/17  Resolved: 03/Nov/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.13
Fix Version/s: 3.0.14

Type: Bug Priority: Critical - P2
Reporter: Burak Ozyurt Assignee: Keith Bostic (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-26890 Mongodb 3.0.13 crashed by Segmentatio... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:
Case:

 Description   
Issue Status as of Nov 03, 2016

ISSUE DESCRIPTION AND IMPACT

MongoDB 3.0.13 introduced a race condition in the WiredTiger storage engine that, under specific circumstances, may cause an invalid memory access (segmentation fault). This generally manifests as an invalid memory access during a checkpoint operation, with __wt_checkpoint appearing in the stack trace.
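
As a rough aid for triage, the signature described above can be checked for in a mongod log. The following is a minimal sketch, not an official diagnostic: the function name is hypothetical, and a match only suggests this bug, since other crashes could in principle show the same two markers.

```python
def looks_like_server_26889(log_lines):
    """Heuristic: a segfault was logged and __wt_checkpoint is in the trace.

    Hypothetical helper; the two marker strings are taken from this ticket.
    """
    saw_segfault = False
    saw_checkpoint = False
    for line in log_lines:
        if "Got signal: 11 (Segmentation fault)" in line:
            saw_segfault = True
        if "__wt_checkpoint" in line:
            saw_checkpoint = True
    return saw_segfault and saw_checkpoint


# Sample lines copied from the stack trace in this ticket.
sample_log = [
    "2016-11-02T21:07:02.960+0000 F -        Got signal: 11 (Segmentation fault).",
    " mongod(__wt_reconcile+0x6D2) [0x1404542]",
    " mongod(__wt_checkpoint+0x992) [0x14268b2]",
]
print(looks_like_server_26889(sample_log))
```

In practice the lines would come from the file named by systemLog.path in mongod.conf, for example by iterating over the open file object.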

This bug is most likely to be triggered by workloads that perform concurrent, out-of-order insert operations. In particular, when multiple clients concurrently insert documents into a collection that has an index on a field whose values do not arrive in index order, the race may be triggered in that index.

DIAGNOSIS AND AFFECTED VERSIONS

This bug is present in 3.0.13 only, and it affects only users running MongoDB with the WiredTiger storage engine. No other versions are affected, and MMAPv1 users are not affected by this bug.
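
The rule above reduces to a two-part check. The sketch below is illustrative only: the function name and argument shapes are assumptions, and only the affected version (3.0.13) and engine (WiredTiger) come from this ticket. The version can be read from the "mongodbVersion" field of the backtrace's processInfo, and the engine from storage.engine in mongod.conf.

```python
# Values taken from this ticket; everything else here is a hypothetical helper.
AFFECTED_VERSION = "3.0.13"
AFFECTED_ENGINE = "wiredtiger"


def is_affected(mongodb_version: str, storage_engine: str) -> bool:
    """Return True only for 3.0.13 running on the WiredTiger engine."""
    same_version = mongodb_version.strip() == AFFECTED_VERSION
    # Compare case-insensitively because the config spells it "wiredTiger".
    same_engine = storage_engine.strip().lower() == AFFECTED_ENGINE
    return same_version and same_engine


print(is_affected("3.0.13", "wiredTiger"))
print(is_affected("3.0.13", "mmapv1"))
```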

REMEDIATION AND WORKAROUNDS

There are no known workarounds for this issue. The only way to be sure to avoid this bug is to upgrade to 3.0.14.

Original description

Server version 3.0.13 running on an Amazon AMI EC2 node. Started seeing "Got signal: 11 (Segmentation fault)." errors (three so far in the last 24 hours).
Have not seen this error in previous versions. The server ran without a crash for about a year on previous versions and handles long-running (many hours, sometimes days) batch processes.

# mongod.conf
 
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
 
# Where and how to store data.
storage:
   dbPath: /data/foundry-mongo
   engine: "wiredTiger"
 
# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile

Stack Trace:

2016-11-02T21:07:02.954+0000 F -        Invalid access at address: 0xd7
2016-11-02T21:07:02.960+0000 F -        Got signal: 11 (Segmentation fault).
 
 0xfc6b02 0xfc63b3 0xfc6714 0x7f7d78a80100 0x7f7d77885d99 0x14033f0 0x1404542 0x13949f9 0x14268b2 0x14256d8 0x1427365 0x141a04c 0x13ae4dc 0x7f7d78a78dc5 0x7f7d7782ec9d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"BC6B02","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"BC63B3"},{"b":"400000","o":"BC6714"},{"b":"7F7D78A71000","o":"F100"},{"b":"7F7D77738000","o":"14DD99"},{"b":"400000","o":"10033F0"},{"b":"400000","o":"1004542","s":"__wt_reconcile"},{"b":"400000","o":"F949F9","s":"__wt_cache_op"},{"b":"400000","o":"10268B2","s":"__wt_checkpoint"},{"b":"400000","o":"10256D8"},{"b":"400000","o":"1027365","s":"__wt_txn_checkpoint"},{"b":"400000","o":"101A04C"},{"b":"400000","o":"FAE4DC"},{"b":"7F7D78A71000","o":"7DC5"},{"b":"7F7D77738000","o":"F6C9D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.0.13", "gitVersion" : "44ae815066cdee7127ddeff34d3a04d75378fd61", "uname" : { "sysname" : "Linux", "release" : "4.4.19-29.55.amzn1.x86_64", "version" : "#1 SMP Mon Aug 29 23:29:40 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "2DFFFF43E9109891916A1167D058F32FF816CD02" }, { "b" : "7FFE2B376000", "elfType" : 3, "buildId" : "7EB0FBEF551697268DACDFC3E53A63ADDF93D154" }, { "b" : "7F7D78A71000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "0836319AA81CDFE97DA2666963F62DE6A2A61346" }, { "b" : "7F7D78804000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "6AF827B6FD200DFDFE70B2BC8D66BBC9881E8817" }, { "b" : "7F7D7841E000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "83F15DBCD0653F417E98354BC1EED6F96A758367" }, { "b" : "7F7D78216000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "2B3151901240D9E854E18E6D0B181C4D580ABA9C" }, { "b" : "7F7D78012000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "6335077ACD51527BE9F2F18451A88E2B7350C5B6" }, { "b" : "7F7D77D10000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "6E343508D15886FE83C438DF4560CE40BEB64B56" }, { "b" : "7F7D77AFA000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "3FD5F89DE59E124AB1419A0BD16775B4096E84FD" }, { "b" : "7F7D77738000", "path" : 
"/lib64/libc.so.6", "elfType" : 3, "buildId" : "5D38A77E8D79E98D717281031C39B9A341323BD1" }, { "b" : "7F7D78C8D000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "93D931BA041229929E5F099514B20E36A70BD651" }, { "b" : "7F7D774EC000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "E203354E7F907ACC8C3028FE465541B666DCFBA0" }, { "b" : "7F7D77207000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "D769C8FFAF8772FDA55031ABF2E167DF2207E378" }, { "b" : "7F7D77004000", "path" : "/usr/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "5C01209C5AE1B1714F19B07EB58F2A1274B69DC8" }, { "b" : "7F7D76DD2000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "6C2243D37143F7FD1E16ED1F6CE4D7F16C2D7EF1" }, { "b" : "7F7D76BBC000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "89C6AF118B6B4FB6A73AE1813E2C8BDD722956D1" }, { "b" : "7F7D769AD000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "744272FEAAABCE629AB9E11FAA4A96AEBE8BC2B4" }, { "b" : "7F7D767AA000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "37A58210FA50C91E09387765408A92909468D25B" }, { "b" : "7F7D76590000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "1285F9516FFCF13FC00BD135C5634AF2EB16C80B" }, { "b" : "7F7D7636F000", "path" : "/usr/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "F5054DC94443326819FBF3065CFDF5E4726F57EE" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0xfc6b02]
 mongod(+0xBC63B3) [0xfc63b3]
 mongod(+0xBC6714) [0xfc6714]
 libpthread.so.0(+0xF100) [0x7f7d78a80100]
 libc.so.6(+0x14DD99) [0x7f7d77885d99]
 mongod(+0x10033F0) [0x14033f0]
 mongod(__wt_reconcile+0x6D2) [0x1404542]
 mongod(__wt_cache_op+0x4E9) [0x13949f9]
 mongod(__wt_checkpoint+0x992) [0x14268b2]
 mongod(+0x10256D8) [0x14256d8]
 mongod(__wt_txn_checkpoint+0x8B5) [0x1427365]
 mongod(+0x101A04C) [0x141a04c]
 mongod(+0xFAE4DC) [0x13ae4dc]
 libpthread.so.0(+0x7DC5) [0x7f7d78a78dc5]
 libc.so.6(clone+0x6D) [0x7f7d7782ec9d]
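
For readers matching the JSON backtrace against the bracketed addresses in the frame list, the "b" and "o" fields of each frame appear to be a hexadecimal base address and offset whose sum is the absolute address. The sketch below checks that arithmetic on three frames copied from the trace above; the function name is hypothetical, and the interpretation of "b" and "o" is an assumption inferred from the numbers in this ticket.

```python
import json


def frame_addresses(backtrace_json: str):
    """Return (absolute_address, symbol_or_None) for each backtrace frame.

    Assumes "b" is a hex base address and "o" a hex offset, as the trace suggests.
    """
    frames = json.loads(backtrace_json)["backtrace"]
    return [(int(f["b"], 16) + int(f["o"], 16), f.get("s")) for f in frames]


# Three frames copied verbatim from the backtrace in this ticket.
sample = (
    '{"backtrace":['
    '{"b":"400000","o":"BC6B02","s":"_ZN5mongo15printStackTraceERSo"},'
    '{"b":"7F7D78A71000","o":"F100"},'
    '{"b":"400000","o":"10268B2","s":"__wt_checkpoint"}]}'
)

for address, symbol in frame_addresses(sample):
    print(hex(address), symbol or "(no symbol)")
```

The three sums are 0xfc6b02, 0x7f7d78a80100, and 0x14268b2, which match the bracketed addresses of the corresponding frames in the list above.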



 Comments   
Comment by Burak Ozyurt [ 03/Nov/16 ]

I have upgraded to 3.2.10 and have not encountered any issue so far.
Thanks for the quick reply.


I. Burak Ozyurt PhD
Project Scientist
University of California, San Diego
9500 Gilman Drive, M/C 0608
La Jolla, CA 92093-0608

Comment by Ramon Fernandez Marina [ 03/Nov/16 ]

bozyurt, we've identified the root cause for this bug and checked in a fix that will be available very soon.

As I mentioned above, the workarounds are to either downgrade to 3.0.12, or to upgrade to 3.2.10, neither of which contains this bug.

Sorry that you were impacted by this, and thanks for reporting it.

Regards,
Ramón.

Comment by Githook User [ 03/Nov/16 ]

Author: Ramon Fernandez <ramon@mongodb.com>

Message: Import wiredtiger: b1aab8db7d80e165d5da80aab0c0403772450997 from branch mongodb-3.0

ref: a5c67bd..b1aab8db7d
for: 3.0.14

SERVER-26889 Incorrect memory access on 3.0.13 triggers segmentation fault
WT-2711 Change statistics log configuration options
Branch: v3.0
https://github.com/mongodb/mongo/commit/08352afcca24bfc145240a0fac9d28b978ab77f3

Comment by Githook User [ 03/Nov/16 ]

Author: Keith Bostic (keithbostic) <keith.bostic@mongodb.com>

Message: SERVER-26889 Got signal: 11 (Segmentation fault) in 3.0.13 (#3125)

Fix 552a33b (cherry-picked from 521270d). When the commit was merged,
a line was dropped.
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/b1aab8db7d80e165d5da80aab0c0403772450997

Comment by Keith Bostic (Inactive) [ 03/Nov/16 ]

bozyurt, I think we understand this problem and likely won't need your database files. Thank you!

Comment by Ramon Fernandez Marina [ 03/Nov/16 ]

bozyurt, I just realized we may need to see the files in your dbpath to troubleshoot this issue, so I've created a private, secure upload portal for you to upload them. They will remain accessible only to MongoDB engineers for the purpose of investigating this bug, and deleted afterwards.

Thanks,
Ramón.

Comment by Ramon Fernandez Marina [ 02/Nov/16 ]

Sorry to hear you're running into this, bozyurt; we're investigating. As a workaround, you may want to consider downgrading to 3.0.12 if the problem doesn't manifest there. The alternative is to upgrade to 3.2.10.

thanks,
Ramón.
