[SERVER-11666] open/create failed in createPrivateMap on SLES11 during rs.initiate() Created: 12/Nov/13  Updated: 20/Nov/13  Resolved: 12/Nov/13

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 2.5.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Michael Grundy Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: 26qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: Linux
Steps To Reproduce:

On a 64-bit SLES-11 system with 32 GB of free RAM and 800 GB of XFS filesystem space, using 2.5.3 or the latest build, run:

cluster-40:~ # bin/mongod --replSet cluster-40 --sslMode sslOnly --sslPEMKeyFile testServer10.pem --logpath /root/mongod-fail.log --dbpath /data/0 --fork

cluster-40:~ # bin/mongo admin --ssl --host $(hostname) --port 27017 --eval "printjson(rs.initiate())"

Participants:

 Description   

Start mongod, then run rs.initiate(); it fails:

cluster-40:~ # bin/mongod --replSet cluster-40 --sslMode sslOnly --sslPEMKeyFile testServer10.pem --logpath /root/mongod-fail.log --dbpath /data/0 --fork
about to fork child process, waiting until server is ready for connections.
forked process: 20922
child process started successfully, parent exiting
cluster-40:~ # bin/mongo admin --ssl --host $(hostname) --port 27017 --eval "printjson(rs.initiate())"
MongoDB shell version: 2.5.3
connecting to: cluster-40.knuckleboys.com:27017/admin
{
	"info2" : "no configuration explicitly specified -- making one",
	"me" : "cluster-40.knuckleboys.com:27017",
	"ok" : 0,
	"errmsg" : "couldn't initiate : file /data/0/local.7 open/create failed in createPrivateMap (look in log for more information)"
}

2013-11-12T19:10:14.977+0000 [conn1] replSet replSetInitiate admin command received from client
2013-11-12T19:10:14.978+0000 [conn1] replSet info initiate : no configuration specified.  Using a default configuration for the set
2013-11-12T19:10:14.978+0000 [conn1] replSet created this configuration for initiation : { _id: "cluster-40", members: [ { _id: 0, host: "cluster-40.knuckleboys.com:27017" } ] }
2013-11-12T19:10:14.978+0000 [conn1] replSet replSetInitiate config object parses ok, 1 members specified
2013-11-12T19:10:14.979+0000 [conn1] replSet replSetInitiate all members seem up
2013-11-12T19:10:14.979+0000 [conn1] ******
2013-11-12T19:10:14.979+0000 [conn1] creating replication oplog of size: 42970MB...
2013-11-12T19:10:14.980+0000 [FileAllocator] allocating new datafile /data/0/local.1, filling with zeroes...
2013-11-12T19:10:14.980+0000 [FileAllocator] done allocating datafile /data/0/local.1, size: 2047MB,  took 0 secs
2013-11-12T19:10:14.981+0000 [FileAllocator] allocating new datafile /data/0/local.2, filling with zeroes...
2013-11-12T19:10:14.981+0000 [FileAllocator] done allocating datafile /data/0/local.2, size: 2047MB,  took 0 secs
2013-11-12T19:10:14.982+0000 [FileAllocator] allocating new datafile /data/0/local.3, filling with zeroes...
2013-11-12T19:10:14.982+0000 [FileAllocator] done allocating datafile /data/0/local.3, size: 2047MB,  took 0 secs
2013-11-12T19:10:14.982+0000 [FileAllocator] allocating new datafile /data/0/local.4, filling with zeroes...
2013-11-12T19:10:14.983+0000 [FileAllocator] done allocating datafile /data/0/local.4, size: 2047MB,  took 0 secs
2013-11-12T19:10:14.983+0000 [FileAllocator] allocating new datafile /data/0/local.5, filling with zeroes...
2013-11-12T19:10:14.984+0000 [FileAllocator] done allocating datafile /data/0/local.5, size: 2047MB,  took 0 secs
2013-11-12T19:10:14.984+0000 [FileAllocator] allocating new datafile /data/0/local.6, filling with zeroes...
2013-11-12T19:10:14.985+0000 [FileAllocator] done allocating datafile /data/0/local.6, size: 2047MB,  took 0 secs
2013-11-12T19:10:14.985+0000 [FileAllocator] allocating new datafile /data/0/local.7, filling with zeroes...
2013-11-12T19:10:14.986+0000 [FileAllocator] done allocating datafile /data/0/local.7, size: 2047MB,  took 0 secs
2013-11-12T19:10:14.986+0000 [conn1] ERROR: mmap private failed with out of memory. (64 bit build)
2013-11-12T19:10:14.986+0000 [conn1] Assertion: 13636:file /data/0/local.7 open/create failed in createPrivateMap (look in log for more information)
2013-11-12T19:10:14.997+0000 [conn1] local.oplog.rs 0xf4e8f9 0xf0094a 0xee90b6 0xee91ec 0xda3c70 0xda48e6 0xda24fa 0xda9c33 0xdaa15f 0xdaa287 0xdb0361 0xbd6a91 0xbd7260 0xd3035e 0xd63886 0x9bfe00 0x9c1594 0x9c218f 0xbb97aa 0xbbff3f 
 bin/mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf4e8f9]
 bin/mongod(_ZN5mongo10logContextEPKc+0x1fa) [0xf0094a]
 bin/mongod(_ZN5mongo11msgassertedEiPKc+0xe6) [0xee90b6]
 bin/mongod() [0xee91ec]
 bin/mongod(_ZN5mongo17DurableMappedFile13finishOpeningEv+0x2f0) [0xda3c70]
 bin/mongod(_ZN5mongo17DurableMappedFile6createERKSsRyb+0xd6) [0xda48e6]
 bin/mongod(_ZN5mongo8DataFile4openEPKcib+0x19a) [0xda24fa]
 bin/mongod(_ZN5mongo13ExtentManager7getFileEiib+0x173) [0xda9c33]
 bin/mongod(_ZN5mongo13ExtentManager8addAFileEib+0x2f) [0xdaa15f]
 bin/mongod(_ZN5mongo13ExtentManager12createExtentEii+0xe7) [0xdaa287]
 bin/mongod(_ZN5mongo10Collection19increaseStorageSizeEib+0x4a1) [0xdb0361]
 bin/mongod(_ZN5mongo13_userCreateNSEPKcRKNS_7BSONObjERSsPb+0x4a1) [0xbd6a91]
 bin/mongod(_ZN5mongo12userCreateNSEPKcNS_7BSONObjERSsbPb+0x1b0) [0xbd7260]
 bin/mongod(_ZN5mongo11createOplogEv+0x7ae) [0xd3035e]
 bin/mongod(_ZN5mongo18CmdReplSetInitiate3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x1256) [0xd63886]
 bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x30) [0x9bfe00]
 bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x9a4) [0x9c1594]
 bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x64f) [0x9c218f]
 bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x3a) [0xbb97aa]
 bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x181f) [0xbbff3f]
2013-11-12T19:10:14.999+0000 [conn1] replSet replSetInitiate exception: file /data/0/local.7 open/create failed in createPrivateMap (look in log for more information)



 Comments   
Comment by Michael Grundy [ 12/Nov/13 ]

Yes, of course. That solves it. It's probably a difference in how the heuristic overcommit check works between the 2.6 and 3.0 kernels.

Comment by Eric Milkie [ 12/Nov/13 ]

It's trying to create an oplog of about 42 GB (it uses a percentage of the total disk when you don't pass --oplogSize). Without overcommit enabled, it can't map all of that unless you have enough RAM + swap to back it.
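
The sizing argument in this comment can be turned into a rough sanity check. The sketch below is not mongod's code: the function names are hypothetical, and the 5% fraction with a 990 MB floor and 50 GB cap is an assumption based on MongoDB's documented default for 64-bit Linux, not something stated in this ticket. Under that assumption, the 42970MB in the log corresponds to roughly 840 GB of disk, which is in line with the 800g filesystem reported. The RAM + swap comparison only mirrors the rule of thumb given in the comment; the kernel's real overcommit heuristic is more involved.

```python
def default_oplog_mb(disk_mb, percent=5, floor_mb=990, cap_mb=50 * 1024):
    """Assumed default oplog size: a percentage of disk, clamped to [floor, cap]."""
    return min(max(disk_mb * percent // 100, floor_mb), cap_mb)


def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a {field: value} dict (sizes are in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def fits_in_ram_plus_swap(oplog_mb, meminfo):
    """True if the oplog is no larger than total RAM + swap (rule of thumb only)."""
    budget_mb = (meminfo["MemTotal"] + meminfo.get("SwapTotal", 0)) // 1024
    return oplog_mb <= budget_mb


if __name__ == "__main__":
    oplog_mb = 42970  # size reported in the mongod log for this ticket
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        print("oplog fits in RAM + swap:", fits_in_ram_plus_swap(oplog_mb, info))
    except OSError:
        print("/proc/meminfo is not available on this system")
```

If the check comes back false, the options implied by the comment are to pass a smaller value to --oplogSize (given in MB), add swap, or relax the kernel's overcommit policy.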
