[SERVER-26837] invalid access / segfault Created: 31/Oct/16  Updated: 29/Nov/16  Resolved: 02/Nov/16

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: 3.2.9
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: rfc2822 Assignee: ADAM Martin (Inactive)
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Platforms 2016-11-21
Participants:

 Description   

I have a NodeBB forum (https://nodebb.org/) running with MongoDB. Recently I upgraded the server from FreeBSD 10.3 to FreeBSD 11, which required rebuilding all ports, including the MongoDB port. Since updating MongoDB to the latest version, it has crashed after running for a while (it is only used for the forum).

Logs:

2016-10-31T01:56:26.915+0100 F -        [thread1] Invalid access at address: 0x3c
2016-10-31T01:56:26.942+0100 F -        [thread1] Got signal: 11 (Segmentation fault).
 
 0x1260e8b 0x12609bc 0x8036a979d 0x8036a8d6f
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"E60E8B","s":"_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE"},{"b":"400000","o":"E609BC","s":"_ZN5mongo29reportOutOfMemoryErrorAndExitEv"},{"b":"80369B000","o":"E79D","s":"pthread_sigmask"},{"b":"80369B000","o":"DD6F","s":"pthread_getspecific"}],"processInfo":{ "mongodbVersion" : "3.2.9", "gitVersion" : "22ec9e93b40c85fc7cae7d56e7d6a02fd811088c", "compiledModules" : [], "uname" : { "sysname" : "FreeBSD", "release" : "11.0-RELEASE-p2", "version" : "FreeBSD 11.0-RELEASE-p2 #0: Mon Oct 24 06:55:27 UTC 2016     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC", "machine" : "amd64" } }}
 mongod(_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE+0x2B) [0x1260e8b]
 mongod(_ZN5mongo29reportOutOfMemoryErrorAndExitEv+0x30C) [0x12609bc]
 libthr.so.3(pthread_sigmask+0x50D) [0x8036a979d]
 libthr.so.3(pthread_getspecific+0xE9F) [0x8036a8d6f]
-----  END BACKTRACE  -----

# mongod --version
db version v3.2.9
git version: 22ec9e93b40c85fc7cae7d56e7d6a02fd811088c
allocator: system
modules: none
build environment:
    distarch: x86_64
    target_arch: x86_64

Is there anything I can do?



 Comments   
Comment by rfc2822 [ 09/Nov/16 ]

Just wanted to note that we have detected faulty RAM on this server today. Maybe the crash was related to that.

Comment by ADAM Martin (Inactive) [ 02/Nov/16 ]

rfc2822, unfortunately, without a core dump we cannot pursue this issue further. Although FreeBSD is not an officially supported platform, we are willing to investigate further, provided we can get more information.

In order to get a core dump, please make sure that the `kern.coredump` sysctl is set to 1. You may also want to make sure that `kern.corefile` is set to something reasonable (such as %N.core at a minimum). Also make sure that `ulimit -c` prints "unlimited" in a shell on the user account in which you run `mongod`.
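For example, on a typical FreeBSD 11 system the following should do it (a sketch; run as root where needed, and adjust if mongod is started through an rc.d service rather than from your own shell):

# enable kernel core dumps and set a sensible core file name
sysctl kern.coredump=1
sysctl kern.corefile="%N.core"

# persist the settings across reboots
echo 'kern.coredump=1' >> /etc/sysctl.conf
echo 'kern.corefile=%N.core' >> /etc/sysctl.conf

# in the shell (or rc environment) that starts mongod, lift the core size limit
ulimit -c unlimited
ulimit -c        # should print "unlimited"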

Thank you for reporting this to us. We will be happy to reopen our investigation when we have more data to study.

Comment by rfc2822 [ 01/Nov/16 ]

Unfortunately, I don't have a core dump. If this happens again, I will have a look at how to enable it.

Comment by ADAM Martin (Inactive) [ 31/Oct/16 ]

rfc2822, do you have a core file for the crashing mongod instance? That might give us some more information.

Comment by rfc2822 [ 31/Oct/16 ]

mongod.log since startup:

2016-10-29T16:21:34.155+0200 I CONTROL  [main] ***** SERVER RESTARTED *****
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten] MongoDB starting : pid=87143 port=27017 dbpath=/var/db/mongodb 64-bit host=tyr.dev001.net
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten] db version v3.2.9
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten] git version: 22ec9e93b40c85fc7cae7d56e7d6a02fd811088c
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten] allocator: system
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten] modules: none
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten] build environment:
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten]     distarch: x86_64
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten]     target_arch: x86_64
2016-10-29T16:21:34.173+0200 I CONTROL  [initandlisten] options: { config: "/usr/local/etc/mongodb.conf", net: { bindIp: "127.0.0.1", port: 27017 }, processManagement: { fork: true, pidFilePath: "/var/db/mongodb/mongod.lock" }, storage: { dbPath: "/var/db/mongodb", engine: "wiredTiger", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/db/mongodb/mongod.log" } }
2016-10-29T16:21:34.180+0200 W -        [initandlisten] Detected unclean shutdown - /var/db/mongodb/mongod.lock is not empty.
2016-10-29T16:21:34.180+0200 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2016-10-29T16:21:34.180+0200 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2016-10-29T16:21:36.021+0200 I CONTROL  [initandlisten] 
2016-10-29T16:21:36.021+0200 I CONTROL  [initandlisten] ** WARNING: soft rlimits too low. rlimits set to 34174 processes, 938628 files. Number of processes should be at least 469314 : 0.5 times number of files.
2016-10-29T16:21:36.022+0200 I STORAGE  [initandlisten] Scanning admin db for SERVER-23299 eligibility
2016-10-29T16:21:36.022+0200 I STORAGE  [initandlisten] Done scanning admin for SERVER-23299 eligibility
2016-10-29T16:21:36.022+0200 I STORAGE  [initandlisten] Scanning local db for SERVER-23299 eligibility
2016-10-29T16:21:36.022+0200 I STORAGE  [initandlisten] Done scanning local for SERVER-23299 eligibility
2016-10-29T16:21:36.024+0200 I STORAGE  [initandlisten] Scanning nodebb db for SERVER-23299 eligibility
2016-10-29T16:21:36.024+0200 I STORAGE  [initandlisten] Done scanning nodebb for SERVER-23299 eligibility
2016-10-29T16:21:36.025+0200 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/var/db/mongodb/diagnostic.data'
2016-10-29T16:21:36.026+0200 I NETWORK  [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
2016-10-29T16:21:36.038+0200 I NETWORK  [initandlisten] waiting for connections on port 27017
2016-10-29T16:22:39.749+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64732 #1 (1 connection now open)
2016-10-29T16:22:39.757+0200 I NETWORK  [conn1] end connection 127.0.0.1:64732 (0 connections now open)
2016-10-29T16:22:39.760+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64733 #2 (1 connection now open)
2016-10-29T16:22:39.829+0200 I ACCESS   [conn2] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:40.118+0200 I ACCESS   [conn2] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:40.121+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64734 #3 (2 connections now open)
2016-10-29T16:22:40.122+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64735 #4 (3 connections now open)
2016-10-29T16:22:42.701+0200 I ACCESS   [conn4] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:42.701+0200 I ACCESS   [conn3] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:45.703+0200 I COMMAND  [conn2] command nodebb.objects command: find { find: "objects", filter: { _key: "plugins:active" }, sort: { score: 1 }, projection: { _id: 0, value: 1 } } planSummary: IXSCAN { _key: 1, score: -1 } keysExamined:13 docsExamined:13 fromMultiPlanner:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:4 nreturned:13 reslen:700 locks:{ Global: { acquireCount: { r: 10 } }, Database: { acquireCount: { r: 5 } }, Collection: { acquireCount: { r: 5 } } } protocol:op_query 399ms
2016-10-29T16:22:46.103+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64745 #5 (4 connections now open)
2016-10-29T16:22:46.190+0200 I ACCESS   [conn5] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:53.864+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64764 #6 (5 connections now open)
2016-10-29T16:22:53.914+0200 I ACCESS   [conn6] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:54.430+0200 I COMMAND  [conn6] command nodebb.sessions command: find { find: "sessions", filter: { _id: "xxxxxxxxxxxxxxxxxxxxxxxxxx", $or: [ { expires: { $exists: false } }, { expires: { $gt: new Date(1477750974320) } } ] }, limit: 1, batchSize: 1, singleBatch: true } planSummary: IXSCAN { _id: 1 } keysExamined:1 docsExamined:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:1 reslen:396 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 109ms
2016-10-29T16:22:56.419+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64780 #7 (6 connections now open)
2016-10-29T16:22:56.420+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64781 #8 (7 connections now open)
2016-10-29T16:22:56.420+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64782 #9 (8 connections now open)
2016-10-29T16:22:56.420+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64783 #10 (9 connections now open)
2016-10-29T16:22:56.420+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64784 #11 (10 connections now open)
2016-10-29T16:22:56.533+0200 I ACCESS   [conn8] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:56.533+0200 I ACCESS   [conn9] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:56.533+0200 I ACCESS   [conn7] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:56.534+0200 I ACCESS   [conn10] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:56.537+0200 I ACCESS   [conn11] Successfully authenticated as principal nodebb on nodebb
2016-10-29T16:22:56.679+0200 I COMMAND  [conn8] command nodebb.objects command: find { find: "objects", filter: { _key: "cid:4:children" }, sort: { score: 1 }, projection: { _id: 0, value: 1 } } planSummary: IXSCAN { _key: 1, score: -1 } keysExamined:5 docsExamined:5 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:3 nreturned:5 reslen:226 locks:{ Global: { acquireCount: { r: 8 } }, Database: { acquireCount: { r: 4 } }, Collection: { acquireCount: { r: 4 } } } protocol:op_query 140ms
2016-10-29T16:22:56.691+0200 I COMMAND  [conn3] command nodebb.objects command: find { find: "objects", filter: { _key: { $in: [ "cid:4:read_by_uid", "cid:5:read_by_uid", "cid:13:read_by_uid", "cid:3:read_by_uid", "cid:2:read_by_uid" ] }, members: "1" }, projection: { _id: 0, members: 0 } } planSummary: IXSCAN { _key: 1, score: -1 } keysExamined:10 docsExamined:5 fromMultiPlanner:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:4 nreturned:5 reslen:301 locks:{ Global: { acquireCount: { r: 10 } }, Database: { acquireCount: { r: 5 } }, Collection: { acquireCount: { r: 5 } } } protocol:op_query 148ms
2016-10-29T16:24:33.037+0200 I COMMAND  [conn11] command nodebb.objects command: find { find: "objects", filter: { _key: "cid:5:tids" }, sort: { score: -1 }, projection: { _id: 0, value: 1 }, limit: 26 } planSummary: IXSCAN { _key: 1, score: -1 } keysExamined:26 docsExamined:26 fromMultiPlanner:1 replanned:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:10 nreturned:26 reslen:748 locks:{ Global: { acquireCount: { r: 22 } }, Database: { acquireCount: { r: 11 } }, Collection: { acquireCount: { r: 11 } } } protocol:op_query 223ms
2016-10-29T17:02:12.302+0200 I COMMAND  [conn6] command nodebb.sessions command: find { find: "sessions", filter: { _id: "xxxxxxxxxxxxxxxxxxxxxxxxx", $or: [ { expires: { $exists: false } }, { expires: { $gt: new Date(1477753332180) } } ] }, limit: 1, batchSize: 1, singleBatch: true } planSummary: IXSCAN { _id: 1 } keysExamined:1 docsExamined:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:1 reslen:320 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 121ms
2016-10-29T17:08:16.476+0200 I COMMAND  [conn4] command nodebb.objects command: find { find: "objects", filter: { _key: "tid:24:posts" }, sort: { score: 1 }, projection: { _id: 0, value: 1 }, limit: 20 } planSummary: IXSCAN { _key: 1, score: -1 } keysExamined:3 docsExamined:3 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:3 reslen:189 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } }, Collection: { acquireCount: { r: 2 } } } protocol:op_query 124ms
2016-10-29T18:58:55.446+0200 I COMMAND  [conn5] command nodebb.searchpost command: aggregate { aggregate: "searchpost", pipeline: [ { $match: { $text: { $search: "reminders" } } }, { $sort: { score: { $meta: "textScore" } } }, { $limit: 50 }, { $project: { _id: 0, id: 1 } } ], cursor: { batchSize: 1000 } } keyUpdates:0 writeConflicts:0 numYields:4 reslen:521 locks:{ Global: { acquireCount: { r: 16 } }, Database: { acquireCount: { r: 8 } }, Collection: { acquireCount: { r: 8 } } } protocol:op_query 137ms
2016-10-29T19:11:13.186+0200 I WRITE    [conn9] update nodebb.searchpost query: { id: 7438 } update: { $set: { id: 7438, content: "Hi,
 
I have an issue with latest DAVdroid but I'm a bit unsure when it started. I just switched phone and thus did a new DAVdroid install on the new o...", cid: "4", uid: "1415" } } keysExamined:0 docsExamined:0 nMatched:1 nModified:1 upsert:1 keyUpdates:0 writeConflicts:0 numYields:1 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } 571ms
2016-10-29T19:11:13.186+0200 I COMMAND  [conn9] command nodebb.$cmd command: update { update: "searchpost", updates: [ { q: { id: 7438 }, u: { $set: { id: 7438, content: "Hi,
 
I have an issue with latest DAVdroid but I'm a bit unsure when it started. I just switched phone and thus did a new DAVdroid install on the new o...", cid: "4", uid: "1415" } }, multi: false, upsert: true } ], ordered: false, writeConcern: {} } keyUpdates:0 writeConflicts:0 numYields:0 reslen:106 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } protocol:op_query 571ms
2016-10-30T01:08:52.516+0200 I COMMAND  [conn6] command nodebb.searchpost command: aggregate { aggregate: "searchpost", pipeline: [ { $match: { $text: { $search: "android.content.OperationApplicationException: App op not allowed" }, cid: { $in: [ "6", "7", "8", "9", "11", "4" ] } } }, { $sort: { score: { $meta: "textScore" } } }, { $limit: 50 }, { $project: { _id: 0, id: 1 } } ], cursor: { batchSize: 1000 } } keyUpdates:0 writeConflicts:0 numYields:47 reslen:963 locks:{ Global: { acquireCount: { r: 102 } }, Database: { acquireCount: { r: 51 } }, Collection: { acquireCount: { r: 51 } } } protocol:op_query 117ms
2016-10-31T01:56:26.915+0100 F -        [thread1] Invalid access at address: 0x3c
2016-10-31T01:56:26.942+0100 F -        [thread1] Got signal: 11 (Segmentation fault).
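
For reference, the options line near the top of that log corresponds to roughly the following /usr/local/etc/mongodb.conf (reconstructed from the logged options; the real file may contain additional settings, e.g. for authentication):

net:
  bindIp: 127.0.0.1
  port: 27017
processManagement:
  fork: true
  pidFilePath: /var/db/mongodb/mongod.lock
storage:
  dbPath: /var/db/mongodb
  engine: wiredTiger
  journal:
    enabled: true
systemLog:
  destination: file
  logAppend: true
  path: /var/db/mongodb/mongod.log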

You can find the port here:
https://svnweb.freebsd.org/ports/head/databases/mongodb32/
(especially the Makefile)

I think it uses the Clang compiler, which is the default on FreeBSD:

$ cc --version
FreeBSD clang version 3.8.0 (tags/RELEASE_380/final 262564) (based on LLVM 3.8.0)
Target: x86_64-unknown-freebsd11.0
Thread model: posix
InstalledDir: /usr/bin

Make flags:

CPUTYPE=haswell
 
CFLAGS+=	-pipe -O2 -msse4
CPPFLAGS+=	-pipe -O2 -msse4
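
If these are set via /etc/make.conf for the port build, the relevant section would look roughly like this (a sketch, not a copy of the actual file):

CPUTYPE?=haswell
CFLAGS+=-pipe -O2 -msse4
CPPFLAGS+=-pipe -O2 -msse4

(The values the port actually used can be checked with `make -V CC -V CFLAGS -V CXXFLAGS` in /usr/ports/databases/mongodb32.)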

Comment by Ramon Fernandez Marina [ 31/Oct/16 ]

rfc2822, can you please send the logs from startup until you see the stack trace above?

Also, I'm not familiar with FreeBSD 11, so can you provide details about the toolchain used to build this binary (compiler version, etc.)?

Thanks,
Ramón.
