[SERVER-3749] Replication set failed Created: 01/Sep/11  Updated: 11/Jul/16  Resolved: 02/Sep/11

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 1.8.2
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Leonid Mikityanskiy Assignee: Kristina Chodorow (Inactive)
Resolution: Done Votes: 0
Labels: replication
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux 2.6.18-238.12.1.el5 x86_64


Attachments: etc_master_mongod.conf.txt (Text File), etc_slave_mongod.conf.txt (Text File), master_mongod.log.gz (GZip Archive), slave_mongod.log.gz (GZip Archive)
Participants:

 Description   

I am trying to set up MongoDB replication.
To do this, I made the following config changes.
On the master (sd2qvq10vl.saksdirect.com):
~~
master = true
source = sd2qvq09vl.saksdirect.com
~~
On the slave (sd2qvq09vl.saksdirect.com):
~~
slave = true
source = sd2qvq10vl.saksdirect.com
~~
After that, startup of all the databases failed with:
~~
[mongodb@SD2QVQ09VL ~]$ sudo /etc/init.d/mongod start
Starting mongod: all output going to: /var/log/mongo/mongod.log
forked process: 29626
[ OK ]
~~
[mongodb@SD2QVQ10VL ~]$ sudo /etc/init.d/mongod start
Starting mongod: all output going to: /var/log/mongo/mongod.log
forked process: 31616
[ OK ]
~~
I commented out all the changes in the config file, deleted the 'mongod.lock' file, and ran:
~~~
[mongodb@SD2QVQ09VL ~]$ mongod --repair
Thu Sep 1 14:22:11 [initandlisten] MongoDB starting : pid=29441 port=27017 dbpath=/data/db/ 64-bit
Thu Sep 1 14:22:11 [initandlisten] db version v1.8.3, pdfile version 4.5
Thu Sep 1 14:22:11 [initandlisten] git version: c206d77e94bc3b65c76681df5a6b605f68a2de05
Thu Sep 1 14:22:11 [initandlisten] build sys info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Thu Sep 1 14:22:11 [initandlisten] exception in initAndListen std::exception: dbpath (/data/db/) does not exist, terminating
Thu Sep 1 14:22:11 dbexit:
Thu Sep 1 14:22:11 [initandlisten] shutdown: going to close listening sockets...
Thu Sep 1 14:22:11 [initandlisten] shutdown: going to flush diaglog...
Thu Sep 1 14:22:11 [initandlisten] shutdown: going to close sockets...
Thu Sep 1 14:22:11 [initandlisten] shutdown: waiting for fs preallocator...
Thu Sep 1 14:22:11 [initandlisten] shutdown: closing all files...
Thu Sep 1 14:22:11 closeAllFiles() finished
Thu Sep 1 14:22:11 dbexit: really exiting now
~~~
Database startup failed again, and I found this error in the log file:
~~~
Thu Sep 1 14:22:02 [replslave] repl: AssertionException trying to slave off of a non-master
repl: sleep 2sec before next pass
Thu Sep 1 14:22:04 [replslave] repl: from host:sd2qvq10vl.saksdirect.com
Thu Sep 1 14:22:04 [replslave] trying to slave off of a non-master
Assertion: 13344:trying to slave off of a non-master
0x55f39a 0x6adde7 0x6ae2cd 0x6ae58f 0x6aea4e 0x6af2e5 0x8c1020 0x2b8c2355d73d 0x2b8c23fd84bd
/usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x12a) [0x55f39a]
/usr/bin/mongod(_ZN5mongo10ReplSource14sync_pullOpLogERi+0x4087) [0x6adde7]
/usr/bin/mongod(_ZN5mongo10ReplSource4syncERi+0x4cd) [0x6ae2cd]
/usr/bin/mongod(_ZN5mongo9_replMainERSt6vectorIN5boost10shared_ptrINS_10ReplSourceEEESaIS4_EERi+0x1ff) [0x6ae58f]
/usr/bin/mongod(_ZN5mongo8replMainEv+0xce) [0x6aea4e]
/usr/bin/mongod(_ZN5mongo15replSlaveThreadEv+0x275) [0x6af2e5]
/usr/bin/mongod(thread_proxy+0x80) [0x8c1020]
/lib64/libpthread.so.0 [0x2b8c2355d73d]
/lib64/libc.so.6(clone+0x6d) [0x2b8c23fd84bd]
~~~
Can you help me fix the databases?
Thanks.



 Comments   
Comment by Kristina Chodorow (Inactive) [ 02/Sep/11 ]

Yes, that would do it. Replication preallocates 5% of your disk space for the replication log. You can adjust this using --oplogSize, but it's generally a good idea to stick with the default.
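
As a rough illustration of the sizing point above, the sketch below compares the free space on a volume with a 5% oplog estimate. The 5% figure is taken from this comment at face value; whether it applies to total or free space, and whether any minimum or maximum applies, may depend on the MongoDB version and platform. The helper names and the arithmetic are assumptions for illustration, not mongod's actual logic.

```python
import shutil


def estimate_oplog_bytes(disk_bytes, percent=5):
    """Hypothetical helper: bytes an oplog would take at `percent` of `disk_bytes`."""
    # Integer arithmetic avoids floating-point rounding on large byte counts.
    return disk_bytes * percent // 100


def has_room(free_bytes, needed_bytes):
    """Return True if `free_bytes` can hold a file of `needed_bytes`."""
    return free_bytes >= needed_bytes


if __name__ == "__main__":
    # "/" is an assumed path; point it at the volume that holds your dbpath.
    usage = shutil.disk_usage("/")
    needed = estimate_oplog_bytes(usage.total)
    print("estimated oplog bytes:", needed)
    print("free bytes:", usage.free)
    print("enough room:", has_room(usage.free, needed))
```

With the numbers from this ticket, `has_room(400 * 1024**2, 1073741824)` is false, which matches the failed 1 GiB allocation of `local.2` when only 400 MB was free.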

Comment by Leonid Mikityanskiy [ 02/Sep/11 ]

Kristina,
I had 400 MB free on the /var filesystem,
and I did not expect that to be too little for MongoDB.
Now I have 1.8 GB free and both databases started successfully.
Leonid M.

Comment by Kristina Chodorow (Inactive) [ 01/Sep/11 ]

Thanks! It looks like you're out of disk space on the master. If you look at the logs near the beginning, it tried to allocate the oplog on the master and errored out with:

Thu Sep  1 09:57:55 [FileAllocator] FileAllocator: posix_fallocate failed: errno:28 No space left on device falling back
Thu Sep  1 09:57:55 [FileAllocator] error failed to allocate new file: /var/lib/mongo/local.2 size: 1073741824 errno:28 No space left on device
Thu Sep  1 09:57:55 [initandlisten] Assertion: 12520:new file allocation failure
Thu Sep  1 09:57:55 [initandlisten] exception in initAndListen std::exception: new file allocation failure, terminating
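
To spot this kind of failure without reading the whole log, a small script can pull out the allocation errors. This is a minimal sketch: the regular expression is written against the exact wording of the lines quoted above, and the function name is hypothetical. Other MongoDB versions may word these messages differently.

```python
import re

# Pattern based on the "failed to allocate new file" line quoted in this ticket.
ALLOC_FAILURE = re.compile(
    r"failed to allocate new file: (?P<path>\S+) "
    r"size: (?P<size>\d+) errno:(?P<errno>\d+)"
)


def find_allocation_failures(log_lines):
    """Return (path, size_bytes, errno) for each allocation failure found."""
    failures = []
    for line in log_lines:
        match = ALLOC_FAILURE.search(line)
        if match:
            failures.append(
                (match.group("path"), int(match.group("size")), int(match.group("errno")))
            )
    return failures


if __name__ == "__main__":
    sample = [
        "Thu Sep  1 09:57:55 [FileAllocator] error failed to allocate new file: "
        "/var/lib/mongo/local.2 size: 1073741824 errno:28 No space left on device",
    ]
    for path, size, err in find_allocation_failures(sample):
        print(path, size, err)
```

To scan a real log, pass an open file object instead of the sample list; the init script output in the description reports the log at /var/log/mongo/mongod.log.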

Comment by Leonid Mikityanskiy [ 01/Sep/11 ]

Kristina,
I attached the log and config files for your review.
Thanks.
Leonid M.

Comment by Kristina Chodorow (Inactive) [ 01/Sep/11 ]

Can you send the log from the master? It looks like it does not think it's the master. Are you sure it's looking at the right config file?

A couple of other things:

  • It looks like the repair did not happen because you did not specify the correct data path for it to repair.
  • You should not use the "source" option on the master. "source" is for slaves only.
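
The second point can be checked mechanically. The sketch below parses a `key = value` config like the fragments in the description and flags `source` on a node that is only a master. The function names are hypothetical, and the parsing assumes the simple `key = value` format with `#` comments; it is not a full mongod config parser.

```python
def parse_conf(text):
    """Parse 'key = value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_master_source(settings):
    """Return a list of problems; flags 'source' on a master-only node."""
    problems = []
    is_master = settings.get("master") == "true"
    is_slave = settings.get("slave") == "true"
    if is_master and not is_slave and "source" in settings:
        problems.append('"source" is set on a master; it is for slaves only')
    return problems


if __name__ == "__main__":
    master_conf = "master = true\nsource = sd2qvq09vl.saksdirect.com\n"
    print(check_master_source(parse_conf(master_conf)))
```

Run against the master config from the description, the check reports one problem; the slave config passes.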