[SERVER-3404] MongoDB doesn't start on either zfs-fuse or zfs on linux. Created: 10/Jul/11  Updated: 29/Aug/11  Resolved: 04/Aug/11

Status: Closed
Project: Core Server
Component/s: Storage, Usability
Affects Version/s: 1.8.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Mage Assignee: Mathias Stearn
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Gentoo amd64, desktop profile, Linux version 2.6.38-gentoo-r6 (root@vylie) (gcc version 4.4.5 (Gentoo 4.4.5 p1.2, pie-0.4.5) ) #4 SMP Thu Jun 2 11:09:40 CEST 2011, native zfs from github (fresh build)
Gentoo amd64 hardened profile, Linux version 2.6.38-hardened-r6 (root@xena) (gcc version 4.4.5 (Gentoo Hardened 4.4.5 p1.2, pie-0.4.5) ) #3 SMP Tue Jun 21 10:32:26 CEST 2011, zfs-fuse 0.6.9


Issue Links:
Duplicate
duplicates SERVER-2906 Durability should work on filesystems... Closed
Operating System: Linux
Participants:

 Description   

zfs-fuse:

Sun Jul 10 23:38:34 [initandlisten] MongoDB starting : pid=14609
port=27017 dbpath=/var/lib9/mongodb 64-bit
Sun Jul 10 23:38:34 [initandlisten] db version v1.8.2, pdfile version
4.5
Sun Jul 10 23:38:34 [initandlisten] git version: nogitversion
Sun Jul 10 23:38:34 [initandlisten] build sys info: Linux xena 2.6.38-
hardened-r6 #3 SMP Tue Jun 21 10:32:26 CEST 2011 x86_64
BOOST_LIB_VERSION=1_42
Sun Jul 10 23:38:34 [initandlisten] journal dir=/var/lib9/mongodb/
journal
Sun Jul 10 23:38:34 [initandlisten] recover : no journal files
present, no recovery needed
Sun Jul 10 23:38:34 [initandlisten] info preallocateIsFaster couldn't
run; returning false
Sun Jul 10 23:38:34 [initandlisten] exception in initAndListen
std::exception: couldn't open file /var/lib9/mongodb/journal/j._0 for
writing errno:22 Invalid argument, terminating
Sun Jul 10 23:38:34 dbexit:
Sun Jul 10 23:38:34 [initandlisten] shutdown: going to close listening
sockets...
Sun Jul 10 23:38:34 [initandlisten] shutdown: going to flush
diaglog...
Sun Jul 10 23:38:34 [initandlisten] shutdown: going to close
sockets...
Sun Jul 10 23:38:34 [initandlisten] shutdown: waiting for fs
preallocator...
Sun Jul 10 23:38:34 [initandlisten] shutdown: lock for final commit...
Sun Jul 10 23:38:34 [initandlisten] shutdown: final commit...
Sun Jul 10 23:38:34 [initandlisten] shutdown: closing all files...
Sun Jul 10 23:38:34 closeAllFiles() finished
Sun Jul 10 23:38:34 [initandlisten] shutdown: journalCleanup...
Sun Jul 10 23:38:34 [initandlisten] removeJournalFiles
Sun Jul 10 23:38:34 [initandlisten] shutdown: removing fs lock...
Sun Jul 10 23:38:34 dbexit: really exiting now

zfs:
Sun Jul 10 23:53:04 [initandlisten] MongoDB starting : pid=22362
port=27017 dbpath=/home/mage/temp/mongodb 64-bit
Sun Jul 10 23:53:04 [initandlisten] db version v1.8.2, pdfile version
4.5
Sun Jul 10 23:53:04 [initandlisten] git version: nogitversion
Sun Jul 10 23:53:04 [initandlisten] build sys info: Linux vylie 2.6.38-
gentoo-r6 #4 SMP Thu Jun 2 11:09:40 CEST 2011 x86_64
BOOST_LIB_VERSION=1_42
Sun Jul 10 23:53:04 [initandlisten] journal dir=/home/mage/temp/
mongodb/journal
Sun Jul 10 23:53:04 [initandlisten] recover : no journal files
present, no recovery needed
Sun Jul 10 23:53:04 [initandlisten] info preallocateIsFaster couldn't
run; returning false
Sun Jul 10 23:53:04 [initandlisten] exception in initAndListen
std::exception: couldn't open file /home/mage/temp/mongodb/journal/
j._0 for writing errno:22 Invalid argument, terminating
Sun Jul 10 23:53:04 dbexit:
Sun Jul 10 23:53:04 [initandlisten] shutdown: going to close listening
sockets...
Sun Jul 10 23:53:04 [initandlisten] shutdown: going to flush
diaglog...
Sun Jul 10 23:53:04 [initandlisten] shutdown: going to close
sockets...
Sun Jul 10 23:53:04 [initandlisten] shutdown: waiting for fs
preallocator...
Sun Jul 10 23:53:04 [initandlisten] shutdown: lock for final commit...
Sun Jul 10 23:53:04 [initandlisten] shutdown: final commit...
Sun Jul 10 23:53:04 [initandlisten] shutdown: closing all files...
Sun Jul 10 23:53:04 closeAllFiles() finished
Sun Jul 10 23:53:04 [initandlisten] shutdown: journalCleanup...
Sun Jul 10 23:53:04 [initandlisten] removeJournalFiles
Sun Jul 10 23:53:04 [initandlisten] shutdown: removing fs lock...
Sun Jul 10 23:53:04 dbexit: really exiting now

**************

These are two different computers, different kernels and different zfs forks (for linux).

The only common thing is the zfs and the errno:22



 Comments   
Comment by Mathias Stearn [ 04/Aug/11 ]

I'm going to close this as a dup of the DIRECT_IO case since it seems to work correctly in 1.9.1

Comment by Mage [ 29/Jul/11 ]

Sure. I downloaded the latest build from: http://downloads.mongodb.org/linux/mongodb-linux-x86_64-latest.tgz

/home/mage is native zfs
/tmp is xfs

Command lines:

mongod --bind_ip 192.168.1.100 --port 27018 --dbpath /home/mage/temp/mongodata/ --unixSocketPrefix /var/run/mongodb --logappend --logpath /home/mage/temp/mongodata/mongodb.log --journal -vv
mongod --bind_ip 192.168.1.100 --port 27018 --dbpath /tmp/mongodata/ --unixSocketPrefix /var/run/mongodb --logappend --logpath /home/mage/temp/mongodata/mongodb.log --journal -vv
./mongod --bind_ip 192.168.1.100 --port 27018 --dbpath /home/mage/temp/mongodata/ --unixSocketPrefix /var/run/mongodb --logappend --logpath /home/mage/temp/mongodata/mongodb.log --journal -vv

The third one is version 1.9.1

1.8.2 zfs:

Fri Jul 29 18:23:11 BackgroundJob starting: DataFileSync
Fri Jul 29 18:23:11 isInRangeTest passed
Fri Jul 29 18:23:11 shardKeyTest passed
Fri Jul 29 18:23:11 shardObjTest passed
Fri Jul 29 18:23:11 must specify both start and end of balancing window:

{ start: 1 }


Fri Jul 29 18:23:11 must specify both start and end of balancing window:

{ stop: 1 }


Fri Jul 29 18:23:11 cannot parse active window (use hh:mm 24hs format):

{ start: "21:30", stop: "28:35" }


Fri Jul 29 18:23:11 BalancingWidowObjTest passed
Fri Jul 29 18:23:11 versionCmpTest passed
Fri Jul 29 18:23:11 [initandlisten] MongoDB starting : pid=11890 port=27018 dbpath=/home/mage/temp/mongodata/ 64-bit
Fri Jul 29 18:23:11 [initandlisten] db version v1.8.2, pdfile version 4.5
Fri Jul 29 18:23:11 [initandlisten] git version: nogitversion
Fri Jul 29 18:23:11 [initandlisten] build sys info: Linux vylie 2.6.38-gentoo-r6 #4 SMP Thu Jun 2 11:09:40 CEST 2011 x86_64 BOOST_LIB_VERSION=1_42
Fri Jul 29 18:23:11 [initandlisten] journal dir=/home/mage/temp/mongodata/journal
Fri Jul 29 18:23:11 [initandlisten] recover : no journal files present, no recovery needed
Fri Jul 29 18:23:11 [initandlisten] User Assertion: 13516:couldn't open file /home/mage/temp/mongodata/journal/tempLatencyTest for writing errno:22 Invalid argument
Fri Jul 29 18:23:11 [initandlisten] info preallocateIsFaster couldn't run; returning false
Fri Jul 29 18:23:11 [initandlisten] User Assertion: 13516:couldn't open file /home/mage/temp/mongodata/journal/j._0 for writing errno:22 Invalid argument
Fri Jul 29 18:23:11 [initandlisten] exception in initAndListen std::exception: couldn't open file /home/mage/temp/mongodata/journal/j._0 for writing errno:22 Invalid argument, terminating
Fri Jul 29 18:23:11 dbexit:
Fri Jul 29 18:23:11 [initandlisten] shutdown: going to close listening sockets...
Fri Jul 29 18:23:11 [initandlisten] shutdown: going to flush diaglog...
Fri Jul 29 18:23:11 [initandlisten] shutdown: going to close sockets...
Fri Jul 29 18:23:11 [initandlisten] shutdown: waiting for fs preallocator...
Fri Jul 29 18:23:11 [initandlisten] shutdown: lock for final commit...
Fri Jul 29 18:23:11 [initandlisten] shutdown: final commit...
Fri Jul 29 18:23:11 [initandlisten] shutdown: closing all files...
Fri Jul 29 18:23:11 closeAllFiles() finished
Fri Jul 29 18:23:11 [initandlisten] shutdown: journalCleanup...
Fri Jul 29 18:23:11 [initandlisten] removeJournalFiles
Fri Jul 29 18:23:11 [initandlisten] removeJournalFiles end
Fri Jul 29 18:23:11 [initandlisten] shutdown: removing fs lock...
Fri Jul 29 18:23:11 dbexit: really exiting now

1.8.2 xfs:

Fri Jul 29 18:24:41 BackgroundJob starting: DataFileSync
Fri Jul 29 18:24:41 isInRangeTest passed
Fri Jul 29 18:24:41 shardKeyTest passed
Fri Jul 29 18:24:41 shardObjTest passed
Fri Jul 29 18:24:41 must specify both start and end of balancing window:

{ start: 1 }

Fri Jul 29 18:24:41 must specify both start and end of balancing window:

{ stop: 1 }

Fri Jul 29 18:24:41 cannot parse active window (use hh:mm 24hs format):

{ start: "21:30", stop: "28:35" }

Fri Jul 29 18:24:41 BalancingWidowObjTest passed
Fri Jul 29 18:24:41 versionCmpTest passed
Fri Jul 29 18:24:41 [initandlisten] MongoDB starting : pid=11954 port=27018 dbpath=/tmp/mongodata/ 64-bit
Fri Jul 29 18:24:41 [initandlisten] db version v1.8.2, pdfile version 4.5
Fri Jul 29 18:24:41 [initandlisten] git version: nogitversion
Fri Jul 29 18:24:41 [initandlisten] build sys info: Linux vylie 2.6.38-gentoo-r6 #4 SMP Thu Jun 2 11:09:40 CEST 2011 x86_64 BOOST_LIB_VERSION=1_42
Fri Jul 29 18:24:41 [initandlisten] journal dir=/tmp/mongodata/journal
Fri Jul 29 18:24:41 [initandlisten] recover : no journal files present, no recovery needed
Fri Jul 29 18:24:41 [initandlisten] query: local.system.namespaces

{ name: /^local.temp./ }

Fri Jul 29 18:24:41 [initandlisten] Accessing: local for the first time
Fri Jul 29 18:24:41 [initandlisten] query local.system.namespaces reslen:36 nreturned:0 0ms
Fri Jul 29 18:24:41 [initandlisten] enter repairDatabases (to check pdfile version #)
Fri Jul 29 18:24:41 [initandlisten] done repairDatabases
Fri Jul 29 18:24:41 [initandlisten] waiting for connections on port 27018
Fri Jul 29 18:24:41 BackgroundJob starting: snapshot
Fri Jul 29 18:24:41 [initandlisten] fd limit hard:1024 soft:1024 max conn: 819
Fri Jul 29 18:24:41 [websvr] web admin interface listening on port 28018
Fri Jul 29 18:24:41 [websvr] fd limit hard:1024 soft:1024 max conn: 819
Fri Jul 29 18:24:41 BackgroundJob starting: ClientCursorMonitor

1.9.1 nightly zfs:

Fri Jul 29 18:31:50 isInRangeTest passed
Fri Jul 29 18:31:50 shardKeyTest passed
Fri Jul 29 18:31:50 shardObjTest passed
Fri Jul 29 18:31:50 BackgroundJob starting: DataFileSync
Fri Jul 29 18:31:50 versionCmpTest passed
Fri Jul 29 18:31:50 versionArrayTest passed
Fri Jul 29 18:31:50 [initandlisten] MongoDB starting : pid=12210 port=27018 dbpath=/home/mage/temp/mongodata/ 64-bit host=vylie

    • NOTE: This is a development version (1.9.2-pre-) of MongoDB.
    • Not recommended for production.

Fri Jul 29 18:31:50 [initandlisten] db version v1.9.2-pre-, pdfile version 4.5
Fri Jul 29 18:31:50 [initandlisten] git version: e78187243d36bfd7d1372f762e67bf71cb962c25
Fri Jul 29 18:31:50 [initandlisten] build info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Fri Jul 29 18:31:50 [initandlisten] options:

{ bind_ip: "192.168.1.100", dbpath: "/home/mage/temp/mongodata/", journal: true, logappend: true, logpath: "/home/mage/temp/mongodata/mongodb.log", port: 27018, unixSocketPrefix: "/var/run/mongodb", vv: true }

Fri Jul 29 18:31:50 [initandlisten] flushing directory /home/mage/temp/mongodata
Fri Jul 29 18:31:50 [initandlisten] journal dir=/home/mage/temp/mongodata/journal
Fri Jul 29 18:31:50 [initandlisten] recover : no journal files present, no recovery needed
Fri Jul 29 18:31:50 [initandlisten] flushing directory /home/mage/temp/mongodata/journal
Fri Jul 29 18:31:50 [initandlisten] flushing directory /home/mage/temp/mongodata/journal
Fri Jul 29 18:31:50 [initandlisten] flushing directory /home/mage/temp/mongodata/journal
Fri Jul 29 18:31:50 [initandlisten] runQuery called local.system.namespaces

{ name: /^local.temp./ }

Fri Jul 29 18:31:50 [initandlisten] Accessing: local for the first time
Fri Jul 29 18:31:50 [initandlisten] query local.system.namespaces reslen:20 1ms
Fri Jul 29 18:31:50 [initandlisten] enter repairDatabases (to check pdfile version #)
Fri Jul 29 18:31:50 [initandlisten] done repairDatabases
Fri Jul 29 18:31:50 BackgroundJob starting: snapshot
Fri Jul 29 18:31:50 [initandlisten] fd limit hard:1024 soft:1024 max conn: 819
Fri Jul 29 18:31:50 [websvr] fd limit hard:1024 soft:1024 max conn: 819
Fri Jul 29 18:31:50 [initandlisten] waiting for connections on port 27018
Fri Jul 29 18:31:50 [websvr] admin web console waiting for connections on port 28018
Fri Jul 29 18:31:50 BackgroundJob starting: ClientCursorMonitor
Fri Jul 29 18:31:50 BackgroundJob starting: PeriodicTask::Runner
Fri Jul 29 18:32:50 [DataFileSync] flushing mmap took 0ms for 1 files
Fri Jul 29 18:32:50 [journal] lsn set 59090
Fri Jul 29 18:32:50 [PeriodicTask::Runner] task: WriteBackManager::cleaner took: 0ms
Fri Jul 29 18:32:50 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Fri Jul 29 18:32:50 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Fri Jul 29 18:32:50 [clientcursormon] mem (MB) res:15 virt:164 mapped:0
Fri Jul 29 18:33:12 [initandlisten] connection accepted from 192.168.1.100:56071 #1
Fri Jul 29 18:33:12 [conn2] runQuery called admin.$cmd

{ whatsmyuri: 1 }

Fri Jul 29 18:33:12 [conn2] run command admin.$cmd

{ whatsmyuri: 1 }

Fri Jul 29 18:33:12 [conn2] command admin.$cmd command:

{ whatsmyuri: 1 }

ntoreturn:1 reslen:66 0ms
Fri Jul 29 18:33:12 [conn2] runQuery called admin.$cmd

{ replSetGetStatus: 1, forShell: 1 }

Fri Jul 29 18:33:12 [conn2] run command admin.$cmd

{ replSetGetStatus: 1, forShell: 1 }

Fri Jul 29 18:33:12 [conn2] command:

{ replSetGetStatus: 1, forShell: 1 }

Fri Jul 29 18:33:12 [conn2] command admin.$cmd command:

{ replSetGetStatus: 1, forShell: 1 }

ntoreturn:1 reslen:76 0ms
Fri Jul 29 18:33:50 [DataFileSync] flushing mmap took 0ms for 1 files
Fri Jul 29 18:33:50 [journal] lsn set 118549
Fri Jul 29 18:33:50 [PeriodicTask::Runner] task: WriteBackManager::cleaner took: 0ms
Fri Jul 29 18:33:50 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Fri Jul 29 18:33:50 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Fri Jul 29 18:34:47 [conn2] SocketException: remote: 192.168.1.100:56071 error: 9001 socket exception [0] server [192.168.1.100:56071]
Fri Jul 29 18:34:47 [conn2] end connection 192.168.1.100:56071
Fri Jul 29 18:34:50 [DataFileSync] flushing mmap took 1ms for 1 files
Fri Jul 29 18:34:50 [journal] lsn set 178009
Fri Jul 29 18:34:50 [PeriodicTask::Runner] task: WriteBackManager::cleaner took: 0ms
Fri Jul 29 18:34:50 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Fri Jul 29 18:34:50 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms

$ mongo --host 192.168.1.100 --port 27018
MongoDB shell version: 1.8.2
connecting to: 192.168.1.100:27018/test

Seems fine for first look.

The /home/mage/temp/mongodata just had been created, so it was empty before running mongod. I also re-created it before running 1.9.1.

I will try using the nightly version on zfs from now and report any issue I find.

Comment by Mathias Stearn [ 29/Jul/11 ]

Could you test with 1.9.1? It has some changes to make it work on filesystems that don't support direct I/O

Comment by Mage [ 11/Jul/11 ]

It works without --journal flag (tested on the zfs computer). After startup without journal it recovered the replication data to the new folder successfully.

Comment by Eliot Horowitz (Inactive) [ 10/Jul/11 ]

That will happen depending on direct io support.
Will look into it.

Comment by Mage [ 10/Jul/11 ]

Of course iit runs fine on XFS on both computers.

Generated at Thu Feb 08 03:02:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.