[SERVER-17085] Support Veritas Backed Storage (vxfs) for MongoDB Created: 28/Jan/15  Updated: 06/Dec/22  Resolved: 15/Apr/19

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 2.6.7, 3.0.0-rc6
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Daniel Nauck Assignee: Backlog - Storage Execution Team
Resolution: Incomplete Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 3

VRTSaslapm-6.1.0.000-GA_SLES11
VRTSob-3.4.678-0
VRTSobgui-3.4.30.0-0
VRTSvcs-6.1.0.000-GA_SLES11
VRTSvcsag-6.1.0.100-GA_SLES11
VRTSvlic-3.02.61.010-0
VRTSperl-5.16.1.6-SLES11
VRTSvxfen-6.1.0.000-GA_SLES11
VRTSgab-6.1.0.000-GA_SLES11
VRTSamf-6.1.0.000-GA_SLES11
VRTSvxvm-6.1.0.100-GA_SLES11
VRTSspt-6.1.0.000-GA
VRTSllt-6.1.0.200-GA_SLES11
VRTSvxfs-6.1.0.100-SLES11


Issue Links:
Related
Assigned Teams:
Storage Execution
Participants:
Case:

 Description   

(Original Title: map file alloc failed - local.startup_log Assertion failure a() != -1)

Hello,

i'm using MonfoDB now for 1,5 years in SUSE Linux 11 with Veritas Cluster without any problem.

Now i've the problem that the server does not start correctly because of "map file alloc failed" exceptions.

I've found the following forum post withoput any solution but with similar problems:
https://groups.google.com/d/topic/mongodb-user/UvxTz-GzWkg/discussion

The problem only occurs in vxfs filesystems on this system.

Log from MongoDB 2.6.7:

2015-01-28T10:27:07.736+0100 [DataFileSync] BackgroundJob starting: DataFileSync
2015-01-28T10:27:07.736+0100 shardKeyTest passed
2015-01-28T10:27:07.736+0100 isInRangeTest passed
2015-01-28T10:27:07.736+0100 shardObjTest passed
2015-01-28T10:27:07.738+0100 [initandlisten] MongoDB starting : pid=8011 port=27017 dbpath=/usd/as67154a/work/db/ 64-bit host=D100STUL0293
2015-01-28T10:27:07.739+0100 [initandlisten] 
2015-01-28T10:27:07.739+0100 [initandlisten] ** WARNING: You are running on a NUMA machine.
2015-01-28T10:27:07.739+0100 [initandlisten] **          We suggest launching mongod like this to avoid performance problems:
2015-01-28T10:27:07.739+0100 [initandlisten] **              numactl --interleave=all mongod [other options]
2015-01-28T10:27:07.739+0100 [initandlisten] 
2015-01-28T10:27:07.739+0100 [initandlisten] db version v2.6.7
2015-01-28T10:27:07.739+0100 [initandlisten] git version: a7d57ad27c382de82e9cb93bf983a80fd9ac9899
2015-01-28T10:27:07.739+0100 [initandlisten] build info: Linux build7.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2015-01-28T10:27:07.739+0100 [initandlisten] allocator: tcmalloc
2015-01-28T10:27:07.739+0100 [initandlisten] options: { storage: { dbPath: "/usd/as67154a/work/db/" }, systemLog: { verbosity: 5 } }
2015-01-28T10:27:07.739+0100 [initandlisten] 
2015-01-28T10:27:07.739+0100 [initandlisten] ** WARNING: Readahead for /usd/as67154a/work/db/ is set to 512KB
2015-01-28T10:27:07.739+0100 [initandlisten] **          We suggest setting it to 256KB (512 sectors) or less
2015-01-28T10:27:07.739+0100 [initandlisten] **          http://dochub.mongodb.org/core/readahead
2015-01-28T10:27:07.741+0100 [initandlisten] flushing directory /usd/as67154a/work/db
2015-01-28T10:27:07.741+0100 [initandlisten] journal dir=/usd/as67154a/work/db/journal
2015-01-28T10:27:07.741+0100 [initandlisten] recover : no journal files present, no recovery needed
2015-01-28T10:27:07.742+0100 [initandlisten] flushing directory /usd/as67154a/work/db/journal
2015-01-28T10:27:07.824+0100 [initandlisten] flushing directory /usd/as67154a/work/db/journal
2015-01-28T10:27:07.868+0100 [initandlisten] flushing directory /usd/as67154a/work/db/journal
2015-01-28T10:27:07.872+0100 [initandlisten] opening db:  local
2015-01-28T10:27:07.872+0100 [initandlisten] enter repairDatabases (to check pdfile version #)
2015-01-28T10:27:07.872+0100 [initandlisten] done repairDatabases
2015-01-28T10:27:07.873+0100 [initandlisten] opening db:  admin
2015-01-28T10:27:07.882+0100 [initandlisten] Running query: query: {} sort: {} projection: {} skip: 0 limit: 0
2015-01-28T10:27:07.882+0100 [initandlisten] query admin.system.roles planSummary: EOF ntoreturn:0 ntoskip:0 nscanned:0 nscannedObjects:0 keyUpdates:0 numYields:0 locks(micros) W:100 r:9475 nreturned:0 reslen:20 9ms
2015-01-28T10:27:07.882+0100 [ClientCursorMonitor] BackgroundJob starting: ClientCursorMonitor
2015-01-28T10:27:07.882+0100 [PeriodicTaskRunner] BackgroundJob starting: PeriodicTaskRunner
2015-01-28T10:27:07.882+0100 [TTLMonitor] BackgroundJob starting: TTLMonitor
2015-01-28T10:27:07.882+0100 [initandlisten] Collection admin.system.version does not exist. Using EOF runner: query: { _id: "authSchema" } sort: {} projection: {} skip: 0 limit: 0
2015-01-28T10:27:07.883+0100 [initandlisten] Collection admin.system.users does not exist. Using EOF runner: query: {} sort: {} projection: {} skip: 0 limit: 0
2015-01-28T10:27:07.883+0100 [initandlisten] fd limit hard:131072 soft:131072 max conn: 104857
2015-01-28T10:27:07.883+0100 [IndexRebuilder] BackgroundJob starting: IndexRebuilder
2015-01-28T10:27:07.883+0100 [IndexRebuilder] checking complete
2015-01-28T10:27:07.883+0100 [initandlisten] run command local.$cmd { create: "startup_log", size: 10485760, capped: true }
2015-01-28T10:27:07.883+0100 [initandlisten] create collection local.startup_log { size: 10485760, capped: true }
2015-01-28T10:27:07.883+0100 [initandlisten] allocating new ns file /usd/as67154a/work/db/local.ns, filling with zeroes...
2015-01-28T10:27:07.965+0100 [initandlisten] mmf create /usd/as67154a/work/db/local.ns
2015-01-28T10:27:07.965+0100 [initandlisten] mmf finishOpening 0x7ffff0c9d000 /usd/as67154a/work/db/local.ns len:16777216
2015-01-28T10:27:07.996+0100 [initandlisten] Database::_addNamespaceToCatalog ns: local.startup_log
2015-01-28T10:27:08.040+0100 [initandlisten] Database::_addNamespaceToCatalog ns: local.system.namespaces
2015-01-28T10:27:08.042+0100 [initandlisten] mmf create /usd/as67154a/work/db/local.0
2015-01-28T10:27:08.042+0100 [FileAllocator] allocating new datafile /usd/as67154a/work/db/local.0, filling with zeroes...
2015-01-28T10:27:08.042+0100 [FileAllocator] creating directory /usd/as67154a/work/db/_tmp
2015-01-28T10:27:08.042+0100 [FileAllocator] flushing directory /usd/as67154a/work/db
2015-01-28T10:27:08.054+0100 [FileAllocator] flushing directory /usd/as67154a/work/db
2015-01-28T10:27:08.054+0100 [FileAllocator] done allocating datafile /usd/as67154a/work/db/local.0, size: 64MB,  took 0.003 secs
2015-01-28T10:27:08.054+0100 [initandlisten] User Assertion: 10447:map file alloc failed, wanted: 67108864 filelen: 0 8
2015-01-28T10:27:08.054+0100 [initandlisten] mmf close /usd/as67154a/work/db/local.0
2015-01-28T10:27:08.054+0100 [initandlisten] command local.$cmd command: create { create: "startup_log", size: 10485760, capped: true } ntoreturn:1 keyUpdates:0 numYields:0  reslen:123 171ms
2015-01-28T10:27:08.055+0100 [initandlisten] info PageFaultRetryableSection will not yield, already locked upon reaching
2015-01-28T10:27:08.055+0100 [initandlisten] local.startup_log Assertion failure a() != -1 src/mongo/db/storage/record.cpp 538
2015-01-28T10:27:08.068+0100 [initandlisten] local.startup_log 0x11fd1b1 0x119efa9 0x11832be 0xf00f1d 0xf2dc55 0xf3347a 0xf334df 0xf3d2ff 0xf3cb20 0x8b8bcf 0x8bac2a 0xb9cdf0 0xb9e2de 0xba4050 0xba4a5f 0x7c2921 0x767c12 0x76b81f 0x76e0a9 0x76eb6b 
 ./mongodb/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11fd1b1]
 ./mongodb/bin/mongod(_ZN5mongo10logContextEPKc+0x159) [0x119efa9]
 ./mongodb/bin/mongod(_ZN5mongo12verifyFailedEPKcS1_j+0x17e) [0x11832be]
 ./mongodb/bin/mongod(_ZNK5mongo7DiskLoc3extEv+0x5d) [0xf00f1d]
 ./mongodb/bin/mongod(_ZN5mongo16NamespaceDetails11cappedAllocEPNS_10CollectionERKNS_10StringDataEi+0x35) [0xf2dc55]
 ./mongodb/bin/mongod(_ZN5mongo16NamespaceDetails6_allocEPNS_10CollectionERKNS_10StringDataEi+0x2a) [0xf3347a]
 ./mongodb/bin/mongod(_ZN5mongo16NamespaceDetails5allocEPNS_10CollectionERKNS_10StringDataEi+0x5f) [0xf334df]
 ./mongodb/bin/mongod(_ZN5mongo19CappedRecordStoreV111allocRecordEii+0x3f) [0xf3d2ff]
 ./mongodb/bin/mongod(_ZN5mongo17RecordStoreV1Base12insertRecordEPKcii+0x60) [0xf3cb20]
 ./mongodb/bin/mongod(_ZN5mongo10Collection15_insertDocumentERKNS_7BSONObjEbPKNS_16PregeneratedKeysE+0x7f) [0x8b8bcf]
 ./mongodb/bin/mongod(_ZN5mongo10Collection14insertDocumentERKNS_7BSONObjEbPKNS_16PregeneratedKeysE+0x22a) [0x8bac2a]
 ./mongodb/bin/mongod(_ZN5mongo14checkAndInsertERNS_6Client7ContextEPKcRNS_7BSONObjEPNS_16PregeneratedKeysE+0x270) [0xb9cdf0]
 ./mongodb/bin/mongod(_ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE+0x95e) [0xb9e2de]
 ./mongodb/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xe50) [0xba4050]
 ./mongodb/bin/mongod(_ZN5mongo14DBDirectClient3sayERNS_7MessageEbPSs+0xaf) [0xba4a5f]
 ./mongodb/bin/mongod(_ZN5mongo12DBClientBase6insertERKSsNS_7BSONObjEi+0x251) [0x7c2921]
 ./mongodb/bin/mongod(_ZN5mongo10logStartupEv+0x882) [0x767c12]
 ./mongodb/bin/mongod(_ZN5mongo6listenEi+0xbf) [0x76b81f]
 ./mongodb/bin/mongod(_ZN5mongo14_initAndListenEi+0x7a9) [0x76e0a9]
 ./mongodb/bin/mongod(_ZN5mongo13initAndListenEi+0x1b) [0x76eb6b]
2015-01-28T10:27:08.069+0100 [initandlisten] insert local.startup_log keyUpdates:0 exception: assertion src/mongo/db/storage/record.cpp:538 numYields:0  14ms
2015-01-28T10:27:08.069+0100 [initandlisten] waiting for connections on port 27017
2015-01-28T10:27:08.083+0100 [journal] journal WRITETODATAFILES 1
2015-01-28T10:27:08.083+0100 [journal] journal WRITETODATAFILES 2
2015-01-28T10:27:08.083+0100 [journal] journal WRITETODATAFILES 0.125ms
2015-01-28T10:27:09.004+0100 [journal] _groupCommit 
2015-01-28T10:27:09.004+0100 [journal] _groupCommit upgrade
2015-01-28T10:27:09.004+0100 [journal] journal REMAPPRIVATEVIEW
2015-01-28T10:27:09.005+0100 [journal] journal REMAPPRIVATEVIEW done startedAt: 0 n:2 0ms
2015-01-28T10:27:09.005+0100 [journal] groupCommit end
2015-01-28T10:27:09.432+0100 [initandlisten] connection accepted from 4.239.93.11:32813 #1 (1 connection now open)

Log from MongoDB 3.0.0-rc6:

2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] MongoDB starting : pid=18888 port=27017 dbpath=/usd/as67154a/work/db/ 64-bit host=D100STUL0293
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] 
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] ** WARNING: You are running on a NUMA machine.
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] **          We suggest launching mongod like this to avoid performance problems:
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] **              numactl --interleave=all mongod [other options]
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] 
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] 
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] db version v3.0.0-rc6
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] git version: 8dd41ae58040dfa09c633c05ce5752d46402f9c5
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] build info: Linux build18.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] allocator: tcmalloc
2015-01-28T10:35:34.009+0100 I CONTROL  [initandlisten] options: { storage: { dbPath: "/usd/as67154a/work/db/" } }
2015-01-28T10:35:34.020+0100 I STORAGE  [initandlisten] 
2015-01-28T10:35:34.020+0100 I STORAGE  [initandlisten] ** WARNING: Readahead for /usd/as67154a/work/db/ is set to 512KB
2015-01-28T10:35:34.020+0100 I STORAGE  [initandlisten] **          We suggest setting it to 256KB (512 sectors) or less
2015-01-28T10:35:34.020+0100 I STORAGE  [initandlisten] **          http://dochub.mongodb.org/core/readahead
2015-01-28T10:35:34.020+0100 I JOURNAL  [initandlisten] journal dir=/usd/as67154a/work/db/journal
2015-01-28T10:35:34.021+0100 I JOURNAL  [initandlisten] recover : no journal files present, no recovery needed
2015-01-28T10:35:34.141+0100 I JOURNAL  [durability] Durability thread started
2015-01-28T10:35:34.142+0100 I JOURNAL  [journal writer] Journal writer thread started
2015-01-28T10:35:34.145+0100 I INDEX    [initandlisten] allocating new ns file /usd/as67154a/work/db/local.ns, filling with zeroes...
2015-01-28T10:35:34.232+0100 I STORAGE  [FileAllocator] allocating new datafile /usd/as67154a/work/db/local.0, filling with zeroes...
2015-01-28T10:35:34.232+0100 I STORAGE  [FileAllocator] creating directory /usd/as67154a/work/db/_tmp
2015-01-28T10:35:34.235+0100 I STORAGE  [FileAllocator] done allocating datafile /usd/as67154a/work/db/local.0, size: 64MB,  took 0.001 secs
2015-01-28T10:35:34.236+0100 W STORAGE  [initandlisten] database /usd/as67154a/work/db/ local could not be opened due to DBException 10447: map file alloc failed, wanted: 67108864 filelen: 0 8
2015-01-28T10:35:34.236+0100 I STORAGE  [initandlisten] exception in initAndListen: 10447 map file alloc failed, wanted: 67108864 filelen: 0 8, terminating
2015-01-28T10:35:34.236+0100 I CONTROL  [initandlisten] now exiting
2015-01-28T10:35:34.236+0100 I NETWORK  [initandlisten] shutdown: going to close listening sockets...
2015-01-28T10:35:34.236+0100 I NETWORK  [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2015-01-28T10:35:34.236+0100 I NETWORK  [initandlisten] shutdown: going to flush diaglog...
2015-01-28T10:35:34.236+0100 I NETWORK  [initandlisten] shutdown: going to close sockets...
2015-01-28T10:35:34.236+0100 I STORAGE  [initandlisten] shutdown: waiting for fs preallocator...
2015-01-28T10:35:34.236+0100 I STORAGE  [initandlisten] shutdown: final commit...
2015-01-28T10:35:34.236+0100 I JOURNAL  [initandlisten] Terminating durability thread ...
2015-01-28T10:35:34.339+0100 I JOURNAL  [journal writer] Journal writer thread stopped
2015-01-28T10:35:34.339+0100 I JOURNAL  [durability] Durability thread stopped
2015-01-28T10:35:34.339+0100 I STORAGE  [initandlisten] shutdown: closing all files...
2015-01-28T10:35:34.339+0100 I STORAGE  [initandlisten] closeAllFiles() finished
2015-01-28T10:35:34.339+0100 I JOURNAL  [initandlisten] journalCleanup...
2015-01-28T10:35:34.339+0100 I JOURNAL  [initandlisten] removeJournalFiles
2015-01-28T10:35:34.341+0100 I STORAGE  [initandlisten] shutdown: removing fs lock...
2015-01-28T10:35:34.341+0100 I CONTROL  [initandlisten] dbexit:  rc: 100



 Comments   
Comment by Eric Milkie [ 15/Apr/19 ]

The errors originally described in this ticket were from the mmapv1 storage engine. We do not believe there are problems running vxfs on WiredTiger today, so I will resolve this ticket.

Comment by Fran Rodríguez [ 06/Apr/15 ]

@samkleinman sorry for the delay, i fix the problem, i formatted the partitions again, and the problem seems disappeared. The problem came out when i added new volumens to the sharding.

Comment by Daniel Nauck [ 02/Apr/15 ]

@samk: please do so

Comment by Sam Kleinman (Inactive) [ 18/Mar/15 ]

dnauck Thanks for the information. Because we don't test MongoDB with Veritas and don't have access to a test environment it's difficult for us to reproduce this case. We can add a feature request to support Vertias-backed storage (or convert this issue to that purpose,) if you're interested.

It may also be worth while to test and see if the WiredTiger storage engine is more compatible with Veritas with MongoDB 3.0.1.

frodriguez, we think that the issue you're seeing is separate from the veritas compatibility issue that this ticket addresses. If you're still having problems, I think it makes sense to separate that case into a second issue that we can handle in parallel.

Cheers,
sam

Comment by Daniel Nauck [ 18/Mar/15 ]

Following the requested information:

1.) the dbpath is an empty directory; but this problem also occur when a new data file is created in an existing database
2.) sure, 5 GB of 100 GB in use; --smallfiles had also no effect
3.) yes, tested this too

Comment by Sam Kleinman (Inactive) [ 16/Mar/15 ]

Sorry about this issue, I just wanted to check in to see if this issue is still unresolved. I think we need additional information to continue to nail down the underlying cause.

  • First, does the dbpath configured already have data? Can you provide the output of ls -laR for your datapath? If you do have an existing data set, can you verify that you can start a mongod instance on new file system volume.
  • Can you verify that there is enough free space in the volume to support allocation of new database files. MongoDB always allocates a local database, so you must have at least enough space for this file in addition to space for the journal. Can you reproduce this problem when running with --smallfiles
  • Can you ensure that the underlying storage layer can create files. Use a command such as: dd if=/dev/zero of=2g.img bs=4096 count=0 seek=512M to verify that you can create at least one 2 gigabyte file.

frodriguez, could you provide the same verification? We want to make sure that there's no underlying effect of the network storage system, in the general case. We couldn't reproduce your case using our own systems, but getting more information about your own deployment (i.e. in the form of these questions. and the specific logging messages that you're seeing) that might help us to identify the cause of the behavior you're seeing.

Regards,
sam

Comment by Fran Rodríguez [ 10/Mar/15 ]

Hello,

I have the same problem with mongodb 2.6.5 on Debian7 but my dbs are mounted on ext4 SSD hard drives. My mount options are this:

UUID=2372f3a3-322e-4d7f-9414-d09404690195 /var/lib/mongodb/ad3 ext4 defaults,noatime,nodiratime,discard 0 0

Cheers

Comment by Daniel Nauck [ 24/Feb/15 ]

Hello,

we mount the VxFS with the following options: rw,delaylog,largefiles,ioerror=mwdisable

Comment by Ramon Fernandez Marina [ 24/Feb/15 ]

The failure is around a lseek() call:

unsigned long long filelen = lseek(fd, 0, SEEK_END);
uassert(10447,  str::stream() << "map file alloc failed, wanted: " << length << " filelen: " << filelen << ' ' << sizeof(size_t), filelen == length );

Looks like 2.6 keeps going, but in 3.0 this assertion triggers an ordered shutdown. What options are you using to mount your VxFS filesystem?

Generated at Thu Feb 08 03:43:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.