|
Hi, I am also hitting this issue. It has happened four times this month.
The error message is:
setShardVersion failed host: mongo4.mcloud.139.com:20004
{ oldVersion: Timestamp 0|0, oldVersionEpoch: ObjectId('000000000000000000000000'), ns: "mcloud.m_iosyncdetaillog", version: Timestamp 560000|381, versionEpoch: ObjectId('51d37afa0b8c2dd3569f5ed6'), globalVersion: Timestamp 561000|0, globalVersionEpoch: ObjectId('51d37afa0b8c2dd3569f5ed6'), reloadConfig: true, errmsg: "shard global version for collection is higher than trying to set to 'mcloud.m_iosyncdetaillog'", ok: 0.0 }
Is this caused by the scenario described in this ticket: "One mongos crashed when moveChunk jobs were running. The moveChunk jobs were created one by one on the other mongos node manually"? Could someone confirm?
Thank you.
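For anyone comparing the two versions in the error document, here is a minimal Python sketch. It assumes that a `Timestamp a|b` value is ordered first by the part before the `|` and then by the part after it. `ERR` is abbreviated from the message above, and `parse_version` is a hypothetical helper, not part of any MongoDB tool.

```python
import re

# Abbreviated from the setShardVersion error document above.
ERR = ('version: Timestamp 560000|381, '
       'globalVersion: Timestamp 561000|0')

def parse_version(doc, field):
    """Return the (high, low) integer pair of 'field: Timestamp high|low'."""
    m = re.search(r'\b' + re.escape(field) + r': Timestamp (\d+)\|(\d+)', doc)
    if m is None:
        raise ValueError('field not found: ' + field)
    return int(m.group(1)), int(m.group(2))

# Version the mongos tried to set vs. the version the shard already holds.
requested = parse_version(ERR, 'version')
shard_global = parse_version(ERR, 'globalVersion')

# Assumption: tuple ordering mirrors how the two versions are compared.
stale = requested < shard_global
print(requested, shard_global, stale)
```

Under that assumption the requested version (560000|381) is behind the shard's global version (561000|0), which matches the errmsg text.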
|
|
@Eliot, yes, the mongos crashed. Here is some additional information about this issue:
1. Cluster topology: 2 mongos nodes and several replica sets.
2. One mongos crashed while moveChunk jobs were running. The moveChunk jobs were created manually, one by one, on the other mongos node.
3. It has happened several times and should be reproducible.
4. chensi has attached the mongos log and the gdb backtrace. We can also provide the core dump file if needed. Let me know if you want any other details.
We appreciate your reply. Thanks a lot.
|
|
gdb bt output:
#0 0x000000302c805f4f in ?? () from /lib64/libgcc_s.so.1
#1 0x000000302c806df7 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
#2 0x000000302afd73cf in backtrace () from /lib64/tls/libc.so.6
#3 0x000000000074d290 in formattedBacktrace (signalNum=11) at src/mongo/util/signal_handlers.cpp:93
#4 mongo::printStackAndExit (signalNum=11) at src/mongo/util/signal_handlers.cpp:115
#5 <signal handler called>
#6 0x000000302c805f4f in ?? () from /lib64/libgcc_s.so.1
#7 0x000000302c806df7 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
#8 0x000000302afd73cf in backtrace () from /lib64/tls/libc.so.6
#9 0x000000000074e663 in mongo::printStackTrace (os=...) at src/mongo/util/stacktrace.cpp:38
#10 0x000000000071af94 in mongo::msgasserted (msgid=10429,
msg=0x7eff4c816a18 "setShardVersion failed host: 10.38.171.25:7111 { oldVersion: Timestamp 29362000|0, oldVersionEpoch: ObjectId('522ad499e1814e603d11be30'), ns: \"appid250528.meta_infos0\", version: Timestamp 29362000|0, "...)
at src/mongo/util/assert_util.cpp:153
#11 0x000000000071b01c in mongo::msgasserted (msgid=<value optimized out>, msg=<value optimized out>)
at src/mongo/util/assert_util.cpp:145
#12 0x00000000006cd579 in mongo::checkShardVersion (conn_in=0x7eff59195b80, ns=..., refManager=...,
authoritative=true, tryNumber=7) at src/mongo/s/shard_version.cpp:285
#13 0x00000000006ccd04 in mongo::checkShardVersion (conn_in=0x7eff59195b80, ns=..., refManager=...,
authoritative=true, tryNumber=6) at src/mongo/s/shard_version.cpp:279
#14 0x00000000006ccd04 in mongo::checkShardVersion (conn_in=0x7eff59195b80, ns=..., refManager=...,
authoritative=true, tryNumber=5) at src/mongo/s/shard_version.cpp:279
#15 0x00000000006ccd04 in mongo::checkShardVersion (conn_in=0x7eff59195b80, ns=..., refManager=...,
authoritative=true, tryNumber=4) at src/mongo/s/shard_version.cpp:279
#16 0x00000000006ccd04 in mongo::checkShardVersion (conn_in=0x7eff59195b80, ns=..., refManager=...,
authoritative=true, tryNumber=3) at src/mongo/s/shard_version.cpp:279
#17 0x00000000006ccd04 in mongo::checkShardVersion (conn_in=0x7eff59195b80, ns=..., refManager=...,
authoritative=true, tryNumber=2) at src/mongo/s/shard_version.cpp:279
#18 0x00000000006ccbdb in mongo::checkShardVersion (conn_in=0x7eff59195b80, ns=..., refManager=...,
authoritative=false, tryNumber=1) at src/mongo/s/shard_version.cpp:253
#19 0x00000000006cd6bc in mongo::VersionManager::checkShardVersionCB (this=<value optimized out>,
conn_in=0x7eff54b81900, authoritative=false, tryNumber=1) at src/mongo/s/shard_version.cpp:294
#20 0x00000000006ced23 in mongo::ShardConnection::_finishInit (this=0x7eff54b81900)
at src/mongo/s/shardconnection.cpp:336
#21 0x000000000058c48b in setVersion (this=0x7eff50acc380, state=..., shard=<value optimized out>,
primary=<value optimized out>, ns=<value optimized out>, vinfo=..., manager=...)
at src/mongo/client/../s/shard.h:266
#22 mongo::ParallelSortClusteredCursor::setupVersionAndHandleSlaveOk (this=0x7eff50acc380, state=...,
shard=<value optimized out>, primary=<value optimized out>, ns=<value optimized out>, vinfo=..., manager=...)
at src/mongo/client/parallel.cpp:736
#23 0x000000000059a15c in mongo::ParallelSortClusteredCursor::startInit (this=0x7eff50acc380)
at src/mongo/client/parallel.cpp:894
#24 0x000000000059e2c9 in mongo::ParallelSortClusteredCursor::fullInit (this=0x7eff50acc380)
at src/mongo/client/parallel.cpp:654
#25 0x00000000006def9c in mongo::ShardStrategy::queryOp(mongo::Request&) ()
#26 0x00000000006c3918 in mongo::Request::process (this=0x674cae20, attempt=0) at src/mongo/s/request.cpp:140
#27 0x000000000053ed92 in mongo::ShardedMessageHandler::process (this=<value optimized out>, m=..., p=0x7eff55e6c700,
le=0x7eff55cc3380) at src/mongo/s/server.cpp:104
#28 0x000000000073cc31 in mongo::pms::threadRun (inPort=0x7eff55e6c700)
at src/mongo/util/net/message_server_port.cpp:85
#29 0x000000302b80610a in start_thread () from /lib64/tls/libpthread.so.0
#30 0x000000302afc6003 in clone () from /lib64/tls/libc.so.6
#31 0x0000000000000000 in ?? ()
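The backtrace shows checkShardVersion calling itself with tryNumber going from 1 to 7 before msgasserted fires. A rough Python sketch for pulling those retry attempts out of a saved backtrace is below. `FRAMES` holds three continuation lines copied from the trace above, and `retry_attempts` is a hypothetical helper name.

```python
import re

# Continuation lines of frames #12, #17 and #18, copied from the backtrace above.
FRAMES = """\
authoritative=true, tryNumber=7) at src/mongo/s/shard_version.cpp:285
authoritative=true, tryNumber=2) at src/mongo/s/shard_version.cpp:279
authoritative=false, tryNumber=1) at src/mongo/s/shard_version.cpp:253
"""

PATTERN = re.compile(r'authoritative=(true|false), tryNumber=(\d+)')

def retry_attempts(text):
    """Return (tryNumber, authoritative) pairs, lowest attempt first."""
    pairs = [(int(n), auth == 'true') for auth, n in PATTERN.findall(text)]
    return sorted(pairs)

attempts = retry_attempts(FRAMES)
deepest = max(n for n, _ in attempts)
print(attempts, deepest)
```

On the full trace, note that frame #19 (checkShardVersionCB) also carries a tryNumber argument, so filter on the function name if only the recursive calls are wanted.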
|