- Type: Bug
- Resolution: Incomplete
- Priority: Major - P3
- None
- Affects Version/s: 2.0.5
- Component/s: Replication
- Linux
We had replication problems in our production environment with MongoDB 1.8.2 after a server crash.
More than 12 TB of data had been synced and the sync was nearly finished when the slave crashed.
We then tested replication with MongoDB 2.0.5-rc0. The database size is about a hundred DB.
The testing nodes were started with the following commands:
/usr/local/mongodb/bin/mongod --port 27020 --replSet jingoal --dbpath=/mongo_data/igoal1/ --logpath=/usr/local/mongodb/log/mongodb1.log --logappend --oplogSize 20000 --journal --fork
/usr/local/mongodb/bin/mongod --port 27021 --replSet jingoal --dbpath=/mongo_data/igoal4/ --logpath=/usr/local/mongodb/log/mongodb4.log --logappend --oplogSize 20000 --journal --fork
/usr/local/mongodb/bin/mongod --port 27022 --replSet jingoal --dbpath=/data2/igoal5/ --logpath=/usr/local/mongodb/log/mongodb5.log --logappend --oplogSize 20000 --journal --fork
Here is the error log from testing (2.0.5-rc0):
Fri May 11 20:21:52 [conn514] end connection 192.168.0.64:43413
Fri May 11 20:21:52 [initandlisten] connection accepted from 192.168.0.64:43417 #516
Fri May 11 20:22:19 Invalid access at address: 0x7fd8119ac000
Fri May 11 20:22:19 Got signal: 7 (Bus error).
Fri May 11 20:22:20 Backtrace:
0xa95df9 0xa9651c 0x7fecd0c21a80 0x7fecd0196feb 0x8b4239 0x8b576b 0x85aa85 0x5c183a 0x850885 0x852995 0x8542e9 0x82e76a 0x830323 0x828381 0x828438 0x8288d0 0xaabd90 0x7fecd0c19fc7 0x7fecd01e959d
/usr/local/mongodb/bin/mongod(_ZN5mongo10abruptQuitEi+0x3a9) [0xa95df9]
/usr/local/mongodb/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x22c) [0xa9651c]
/lib/libpthread.so.0 [0x7fecd0c21a80]
/lib/libc.so.6(memcpy+0x15b) [0x7fecd0196feb]
/usr/local/mongodb/bin/mongod(_ZN5mongo11DataFileMgr6insertEPKcPKvibbPb+0x679) [0x8b4239]
/usr/local/mongodb/bin/mongod(_ZN5mongo11DataFileMgr16insertWithObjModEPKcRNS_7BSONObjEb+0x4b) [0x8b576b]
/usr/local/mongodb/bin/mongod(_ZN5mongo6Cloner3FunclERNS_27DBClientCursorBatchIteratorE+0x545) [0x85aa85]
/usr/local/mongodb/bin/mongod(_ZN5mongo18DBClientConnection5queryEN5boost8functionIFvRNS_27DBClientCursorBatchIteratorEEEERKSsNS_5QueryEPKNS_7BSONObjEi+0x1aa) [0x5c183a]
/usr/local/mongodb/bin/mongod(_ZN5mongo6Cloner4copyEPKcS2_bbbbbbNS_5QueryE+0x3c5) [0x850885]
/usr/local/mongodb/bin/mongod(_ZN5mongo6Cloner2goEPKcRSsRKSsbbbbbbPi+0x1665) [0x852995]
/usr/local/mongodb/bin/mongod(_ZN5mongo9cloneFromEPKcRSsRKSsbbbbbbPi+0x59) [0x8542e9]
/usr/local/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl18_syncDoInitialSyncEv+0xe6a) [0x82e76a]
/usr/local/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl17syncDoInitialSyncEv+0x23) [0x830323]
/usr/local/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl11_syncThreadEv+0x61) [0x828381]
/usr/local/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl10syncThreadEv+0x48) [0x828438]
/usr/local/mongodb/bin/mongod(_ZN5mongo15startSyncThreadEv+0xa0) [0x8288d0]
/usr/local/mongodb/bin/mongod(thread_proxy+0x80) [0xaabd90]
/lib/libpthread.so.0 [0x7fecd0c19fc7]
/lib/libc.so.6(clone+0x6d) [0x7fecd01e959d]
Fri May 11 20:22:20 [initandlisten] connection accepted from 192.168.0.64:43421 #517
Fri May 11 20:22:20 [conn515] end connection 192.168.0.64:43415
Logstream::get called in uninitialized state
Fri May 11 20:22:20 ERROR: Client::~Client _context should be null but is not; client:rsSync
Logstream::get called in uninitialized state
Fri May 11 20:22:20 ERROR: Client::shutdown not called: rsSync
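To make the backtrace above easier to read, the mangled C++ frame names can be pulled out of the log and passed to a demangler. The following is only a rough sketch, not part of MongoDB: `FRAME_RE` and `parse_frames` are hypothetical helper names, and the pattern assumes frames look exactly like the `binary(symbol+offset) [address]` lines in this log.

```python
import re

# Hypothetical helper, not a MongoDB tool. Matches frame lines such as:
#   /usr/local/mongodb/bin/mongod(_ZN5mongo10abruptQuitEi+0x3a9) [0xa95df9]
# Frames without a symbol (e.g. "/lib/libpthread.so.0 [0x7fecd0c21a80]")
# do not match and are skipped.
FRAME_RE = re.compile(
    r"^(?P<binary>\S+)"
    r"\((?P<symbol>[^+()]+)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\s+\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(log_text):
    """Return one dict per symbolized frame, in the order they appear."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


if __name__ == "__main__":
    sample = (
        "/usr/local/mongodb/bin/mongod(_ZN5mongo10abruptQuitEi+0x3a9) [0xa95df9]\n"
        "/lib/libpthread.so.0 [0x7fecd0c21a80]\n"
        "/lib/libc.so.6(memcpy+0x15b) [0x7fecd0196feb]\n"
    )
    # Print the raw symbols, one per line.
    for frame in parse_frames(sample):
        print(frame["symbol"])
```

The printed symbols can then be piped through a demangler such as `c++filt`, assuming binutils is installed on the machine.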
We have searched Google, the MongoDB JIRA, and the forums and found some similar reports, but none of them explain how to fix this problem.
We would appreciate your help. Thanks.