[SERVER-5497] mongod: double free or corruption Created: 04/Apr/12  Updated: 10/Dec/14  Resolved: 25/Jul/13

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Grégoire Seux Assignee: Daniel Pasette (Inactive)
Resolution: Incomplete Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

linux x86_64 2.6.18-274.3.1.el5.centos.plus


Attachments: mongo-clean.log, mongod-shard4.extract.log, mongod-shard4.log
Operating System: Linux
Participants:

 Description   

mongod outputs *** glibc detected *** mongod: double free or corruption (!prev): 0x00002ac5c40af3b0 ***
in the logs.

Replication in the replica set appears to have been stopped for a long time (that may be a separate issue).
The primary started logging "double free or corruption" errors. The process stayed alive (until we ran kill -9 on it) but refused all connections, both from the shell and from other replica set members.
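
For triaging logs like this one, the glibc abort line can be picked apart mechanically. The following is a minimal Python sketch, assuming the line follows the format quoted above; the pattern and the function name `parse_glibc_abort` are illustrative only and not part of any MongoDB tooling. The tag in parentheses (`!prev` here, `top` in another log on this ticket) is captured separately so occurrences can be grouped.

```python
import re

# Matches the message quoted in the description. The leading "***" is not
# required, so lines whose first asterisks were lost still match.
GLIBC_ABORT = re.compile(
    r"glibc detected \*\*\* (?P<program>[^:\s]+): "
    r"(?P<message>.+?) \((?P<tag>[^)]+)\): "
    r"(?P<address>0x[0-9a-fA-F]+)"
)

def parse_glibc_abort(line):
    """Return program, message, tag and address from a glibc abort line, or None."""
    match = GLIBC_ABORT.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["address"] = int(fields["address"], 16)  # pointer as an integer
    return fields

line = ("*** glibc detected *** mongod: double free or corruption "
        "(!prev): 0x00002ac5c40af3b0 ***")
print(parse_glibc_abort(line))
```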



 Comments   
Comment by Daniel Pasette (Inactive) [ 09/Apr/13 ]

This issue got lost in the shuffle. Is this still an active problem for any watchers?

Comment by Grégoire Seux [ 10/Jul/12 ]

My setup is not the same as Theo's, Andy's or Petr's.
At that time we were running the regular build of 2.0.2.

Comment by Andy Schwerin [ 09/Jul/12 ]

So, your logs are from the 2.0.1 regular version? I'll take another look at the symbols.

Comment by Grégoire Seux [ 09/Jul/12 ]

We do run on CentOS 5, but we are not using the legacy-static version. Could the problem come from somewhere else?

Comment by Andy Schwerin [ 06/Jul/12 ]

Did some more digging. If you can, you should stop running the legacy-static version of mongo. It looks like you're on CentOS 5, which supports the regular build of mongo. The legacy-static version tries to statically link against a number of libraries that aren't really safe to link statically.

I was hoping to use the debug symbols to provide a more detailed analysis, but it seems that we don't have debugging symbols available for the 2.0.2 legacy-static build, due to some kind of gaffe during the release process for versions 2.0.0 to 2.0.4.

Comment by Theo [ 15/Jun/12 ]

Also wondering if there is any progress on this. We see this issue fairly regularly.

Comment by Grégoire Seux [ 15/Jun/12 ]

Is there any progress on this issue?

Comment by Theo [ 02/May/12 ]

Here is the end of the log from another node that crashed:

Wed May 2 04:02:01 [initandlisten] connection accepted from 10.199.144.35:33373 #40442
Wed May 2 04:02:10 [rsSync] repl: old cursor isDead, will initiate a new one
Wed May 2 04:02:13 [rsSync] replSet syncing to: ahost.adomain.com:27017
Wed May 2 04:02:14 Invalid access at address: 0

Wed May 2 04:02:15 Got signal: 11 (Segmentation fault).

Wed May 2 04:02:16 Backtrace:
0x986fb9 0x987590 0x9a3810 0x3ec666afa2 0x2aaaaaab933f 0x2aaaac08b138 0xb86029
[0x986fb9]
[0x987590]
[0x9a3810]
/lib64/libc.so.6(fgets_unlocked+0x22) [0x3ec666afa2]
/lib64/libnss_files.so.2 [0x2aaaaaab933f]
[0x2aaaac08b138]
[0xb86029]

*** glibc detected *** ./mongod: double free or corruption (!prev): 0x00002aacfc060a90 ***
======= Backtrace: =========
[0xa71c1b]
[0xa75cc6]
[0x594641]
[0x59604f]
[0x590dc2]
[0x591021]
[0x592785]
[0x4768d2]
[0x477744]
[0xa09f69]
[0x99f9ad]
[0xa8f549]
======= Memory map: ========
00400000-00c97000 r-xp 00000000 08:01 2261002 /u00/mongo1/bin/mongod
00e97000-00eb8000 rw-p 00897000 08:01 2261002 /u00/mongo1/bin/mongod
00eb8000-018d9000 rw-p 00eb8000 00:00 0
0dd12000-0de8b000 rw-p 0dd12000 00:00 0 [heap]
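
Without debug symbols, the raw frame addresses can at least be matched against the memory map that glibc prints. The sketch below is a minimal, hypothetical Python helper (the names `parse_memory_map` and `locate` are illustrative, not part of any MongoDB tooling). It uses only the four map lines visible in this excerpt, so any address outside them is reported as not found.

```python
# Memory-map lines copied from the (truncated) log excerpt above.
MEMORY_MAP = """\
00400000-00c97000 r-xp 00000000 08:01 2261002 /u00/mongo1/bin/mongod
00e97000-00eb8000 rw-p 00897000 08:01 2261002 /u00/mongo1/bin/mongod
00eb8000-018d9000 rw-p 00eb8000 00:00 0
0dd12000-0de8b000 rw-p 0dd12000 00:00 0 [heap]
"""

# Frame addresses from the glibc backtrace in the same excerpt.
FRAMES = [
    0xa71c1b, 0xa75cc6, 0x594641, 0x59604f, 0x590dc2, 0x591021,
    0x592785, 0x4768d2, 0x477744, 0xa09f69, 0x99f9ad, 0xa8f549,
]

def parse_memory_map(text):
    """Parse /proc/<pid>/maps-style lines into (start, end, perms, name) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a mapping line
        start, end = (int(part, 16) for part in fields[0].split("-"))
        name = fields[5] if len(fields) > 5 else "(anonymous)"
        regions.append((start, end, fields[1], name))
    return regions

def locate(address, regions):
    """Return (perms, name) of the region containing address, or None."""
    for start, end, perms, name in regions:
        if start <= address < end:
            return perms, name
    return None

REGIONS = parse_memory_map(MEMORY_MAP)
for frame in FRAMES:
    print(hex(frame), locate(frame, REGIONS))
```

With the excerpt above, every frame in the glibc backtrace falls inside the executable `r-xp` mapping of /u00/mongo1/bin/mongod, while the reported pointer 0x00002aacfc060a90 lies outside the part of the map shown here.
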
Comment by Theo [ 02/May/12 ]

Also seeing the same issue frequently.

Here is the tail end of the log right before crash:

Mon Apr 30 17:41:56 [initandlisten] connection accepted from 10.199.144.35:47758 #36268

*** glibc detected *** ./mongod: double free or corruption (top): 0x00002aad000008c0 ***
Mon Apr 30 17:41:58 Invalid access at address: 0

Mon Apr 30 17:41:58 Got signal: 11 (Segmentation fault).

Mon Apr 30 17:41:58 Backtrace:
0x986fb9 0x987590 0x9a3810
[0x986fb9]
[0x987590]
[0x9a3810]

Here is the top of log after start:

***** SERVER RESTARTED *****

Tue Apr 24 11:23:17 [initandlisten] MongoDB starting : pid=14551 port=27017 dbpath=/Data/u01/mongo1/data/ 64-bit host=ahost.adomain.com
Tue Apr 24 11:23:17 [initandlisten] db version v2.0.2, pdfile version 4.5
Tue Apr 24 11:23:17 [initandlisten] git version: 514b122d308928517f5841888ceaa4246a7f18e3
Tue Apr 24 11:23:17 [initandlisten] build info: Linux domU-12-31-39-16-30-A2 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 BOOST_LIB_VERSION=1_45
Tue Apr 24 11:23:17 [initandlisten] options:

{ config: "mongodb.cnf", dbpath: "/Data/u01/mongo1/data/", directoryperdb: "true", fork: "true", logappend: "true", logpath: "/u00/mongo1/log/mongodb.log", maxConns: 20000, port: 27017, profile: 1, replSet: "replicaset_1", rest: "true", shardsvr: "true" }

Tue Apr 24 11:23:17 [initandlisten] journal dir=/Data/u01/mongo1/data/journal

Comment by Petr Masopust [ 02/May/12 ]

I have a similar issue: mongo is crashing with a segfault.
Server: Linux server48.company.com 2.6.18-194.11.4.el5 #1 SMP Fri Sep 17 04:57:05 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
Mongo is the 2.0.2 64-bit legacy-static build (i.e. statically linked with all libraries included), running as a 3-server replica set.

See the attached mongo-clean.log for the stack trace.

Comment by Grégoire Seux [ 13/Apr/12 ]

I am sorry, but we don't have them anymore. The process was started a long time ago and we only keep these logs for 7 days.

Comment by Andy Schwerin [ 04/Apr/12 ]

I'm looking for the startup messages. The log you sent does not contain them. Any chance you can start up the same binary again, and send the first 100 lines or so of the log?

Comment by Grégoire Seux [ 04/Apr/12 ]

Here is a more complete log.

Comment by Andy Schwerin [ 04/Apr/12 ]

Could you send us the first hundred lines or so of the log file from the crashing mongo instance? Or start up mongod again and send us the first 100 lines of log output? I want to make sure I grab the right set of debugging symbols to correlate with that stack trace.
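
One way to collect those lines is sketched below in Python. It assumes the log may have been appended to across restarts and that each start is preceded by a line containing "SERVER RESTARTED", as in the logs posted elsewhere on this ticket; if no such line exists it simply returns the head of the file. The function name and the example path are hypothetical.

```python
def startup_lines(log_path, count=100, marker="SERVER RESTARTED"):
    """Return up to `count` lines starting at the most recent restart marker.

    If the marker never appears, the first `count` lines of the file are
    returned instead.
    """
    collected = []
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        for line in log_file:
            if marker in line:
                collected = []  # a newer restart was found: start over
            if len(collected) < count:
                collected.append(line.rstrip("\n"))
    return collected

# Example (hypothetical path):
#   for line in startup_lines("/var/log/mongodb/mongod.log"):
#       print(line)
```

The file is read line by line and at most `count` lines are held in memory, so this stays cheap even on a large log.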

Comment by Grégoire Seux [ 04/Apr/12 ]

Correction: we use mongo 2.0.1.
