-
Type:
Bug
-
Resolution: Done
-
Priority:
Major - P3
-
None
-
Affects Version/s: 1.8.0-rc2
-
Component/s: None
-
None
-
ALL
-
None
-
3
-
None
-
None
-
None
-
None
-
None
-
None
Fri Mar 18 18:05:05 [LockPinger] dist_lock pinged successfully for: OBE017.local:1300448993:1804289383
Fri Mar 18 18:06:06 [mongosMain] connection accepted from 62.212.84.220:60590 #34
Fri Mar 18 18:10:05 [LockPinger] dist_lock pinged successfully for: OBE017.local:1300448993:1804289383
Fri Mar 18 18:15:05 [LockPinger] dist_lock pinged successfully for: OBE017.local:1300448993:1804289383
Fri Mar 18 18:18:09 [conn24] autosplitted crawler.ad shard: ns:crawler.ad at: partitionB:partitionB/flash,gordon:27018 lastmod: 138|0 min:
max:
{ _id: 662009883156 }on:
{ _id: 10000068480361 }(splitThreshold 209715200)
Fri Mar 18 18:18:09 [conn24] update failed b/c of StaleConfigException, retrying left:4 ns: crawler.ad query:
Fri Mar 18 18:18:11 [conn28] ns: crawler.ad ClusteredCursor::query ShardConnection had to change attempt: 0
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [flash:27019]
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [gordon:27019]
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [mara:27019]
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [flash:27019]
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [gordon:27019]
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [mara:27019]
Fri Mar 18 18:18:11 [conn28] setShardVersion failed:
Fri Mar 18 18:18:11 [conn28] Assertion: 10429:setShardVersion failed!
{ "errmsg" : "not master", "ok" : 0 }0x522c19 0x69ee23 0x69e90b 0x69e90b 0x69e90b 0x5799e6 0x57a2f6 0x57781d 0x6680b6 0x67ae11 0x5817a2 0x6a26f0 0x30e920673d 0x30e8ad3f6d
/opt/mongo/bin/mongos(_ZN5mongo11msgassertedEiPKc+0x129) [0x522c19]
/opt/mongo/bin/mongos [0x69ee23]
/opt/mongo/bin/mongos [0x69e90b]
/opt/mongo/bin/mongos [0x69e90b]
/opt/mongo/bin/mongos [0x69e90b]
/opt/mongo/bin/mongos(_ZN5boost6detail8function17function_invoker4IPFbRN5mongo12DBClientBaseERKSsbiEbS5_S7_biE6invokeERNS1_15function_bufferES5_S7_bi+0x16) [0x5799e6]
/opt/mongo/bin/mongos(_ZN5mongo17ClientConnections13checkVersionsERKSs+0x1c6) [0x57a2f6]
/opt/mongo/bin/mongos(_ZN5mongo15ShardConnection25checkMyConnectionVersionsERKSs+0x3d) [0x57781d]
/opt/mongo/bin/mongos(_ZN5mongo7Request7processEi+0x5e6) [0x6680b6]
/opt/mongo/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortE+0x141) [0x67ae11]
/opt/mongo/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x272) [0x5817a2]
/opt/mongo/bin/mongos(thread_proxy+0x80) [0x6a26f0]
/lib64/libpthread.so.0 [0x30e920673d]
/lib64/libc.so.6(clone+0x6d) [0x30e8ad3f6d]
Fri Mar 18 18:18:11 [conn28] AssertionException in process: setShardVersion failed!
Fri Mar 18 18:18:11 [conn28] ns: crawler.ad ClusteredCursor::query ShardConnection had to change attempt: 0
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [flash:27019]
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [gordon:27019]
Fri Mar 18 18:18:11 [conn28] SyncClusterConnection connecting to [mara:27019]
Fri Mar 18 18:18:11 [conn28] setShardVersion failed:
Fri Mar 18 18:18:11 [conn28] Assertion: 10429:setShardVersion failed!
{ "errmsg" : "not master", "ok" : 0 }0x522c19 0x69ee23 0x69e90b 0x69e90b 0x69e90b 0x5799e6 0x57a2f6 0x57781d 0x6680b6 0x67ae11 0x5817a2 0x6a26f0 0x30e920673d 0x30e8ad3f6d
/opt/mongo/bin/mongos(_ZN5mongo11msgassertedEiPKc+0x129) [0x522c19]
/opt/mongo/bin/mongos [0x69ee23]
/opt/mongo/bin/mongos [0x69e90b]
/opt/mongo/bin/mongos [0x69e90b]
/opt/mongo/bin/mongos [0x69e90b]
/opt/mongo/bin/mongos(_ZN5boost6detail8function17function_invoker4IPFbRN5mongo12DBClientBaseERKSsbiEbS5_S7_biE6invokeERNS1_15function_bufferES5_S7_bi+0x16) [0x5799e6]
/opt/mongo/bin/mongos(_ZN5mongo17ClientConnections13checkVersionsERKSs+0x1c6) [0x57a2f6]
/opt/mongo/bin/mongos(_ZN5mongo15ShardConnection25checkMyConnectionVersionsERKSs+0x3d) [0x57781d]
/opt/mongo/bin/mongos(_ZN5mongo7Request7processEi+0x5e6) [0x6680b6]
... ad infinitum. Until restarted.
On partitionB mongod log I've found this around the same time:
...
Fri Mar 18 18:18:09 [conn1195] end connection 95.211.11.30:33282
Fri Mar 18 18:18:09 [conn1200] end connection 95.211.11.30:33292
Fri Mar 18 18:18:09 [conn1366] end connection 95.211.11.30:49865
Fri Mar 18 18:18:09 [initandlisten] connection accepted from 95.211.11.30:35538 #1498
Fri Mar 18 18:18:09 [conn1498] request split points lookup for chunk crawler.ad { : 662009843887 } -->> { : 10000068480361 }
Fri Mar 18 18:18:09 [initandlisten] connection accepted from 94.75.192.38:33921 #1499
Fri Mar 18 18:18:09 [conn1499] end connection 94.75.192.38:33921
Fri Mar 18 18:18:09 [conn1498] max number of requested split points reached (2) before the end of chunk crawler.ad { : 662009843887 } -->> { : 10000068480361 }
Fri Mar 18 18:18:09 [conn1498] warning: Finding the split vector for crawler.ad over
keyCount: 39243 numSplits: 2 lookedAt: 0 took 117ms
Fri Mar 18 18:18:09 [conn1498] query admin.$cmd ntoreturn:1 command: { splitVector: "crawler.ad", keyPattern:
, min:
{ _id: 662009843887 }, max:
{ _id: 10000068480361 }, maxChunkSizeBytes: 209715200, maxSplitPoints: 2, maxChunkObjects: 250000 } reslen:111 117ms
Fri Mar 18 18:18:09 [conn1498] SyncClusterConnection connecting to [flash:27019]
Fri Mar 18 18:18:09 [conn1498] SyncClusterConnection connecting to [gordon:27019]
Fri Mar 18 18:18:09 [conn1498] SyncClusterConnection connecting to [mara:27019]
Fri Mar 18 18:18:09 [conn1498] updated set (partitionB) to: partitionB/flash,gordon:27018
Fri Mar 18 18:18:09 [initandlisten] connection accepted from 95.211.11.30:35543 #1500
Fri Mar 18 18:18:09 [initandlisten] connection accepted from 95.211.11.30:35544 #1501
Fri Mar 18 18:18:09 [conn1498] received splitChunk request: { splitChunk: "crawler.ad", keyPattern:
, min:
{ _id: 662009843887 }, max:
{ _id: 10000068480361 }, from: "partitionB/flash,gordon:27018", splitKeys: [
{ _id: 662009883156 } ], shardId: "crawler.ad-_id_662009843887", configdb: "flash:27019,gordon:27019,mara:27019" }
Fri Mar 18 18:18:09 [LockPinger] creating dist lock ping thread for: flash:27019,gordon:27019,mara:27019
Fri Mar 18 18:18:09 [LockPinger] SyncClusterConnection connecting to [flash:27019]
Fri Mar 18 18:18:09 [LockPinger] SyncClusterConnection connecting to [gordon:27019]
Fri Mar 18 18:18:09 [LockPinger] SyncClusterConnection connecting to [mara:27019]
Fri Mar 18 18:18:09 [conn1498] splitChunk accepted at version 138|0
Fri Mar 18 18:18:09 [conn1498] about to log metadata event: { _id: "OBE017.local-2011-03-18T17:18:09-1", server: "OBE017.local", clientAddr: "95.211.11.30:35538", time: new Date(1300468689635), what: "split", ns: "crawler.ad", details: { before: { min:
, max:
{ _id: 10000068480361 }, lastmod: Timestamp 138000|0 }, left: { min:
{ _id: 662009843887 }, max:
{ _id: 662009883156 }, lastmod: Timestamp 138000|2 }, right: { min:
{ _id: 662009883156 }, max:
{ _id: 10000068480361 }, lastmod: Timestamp 138000|3 } } }
Fri Mar 18 18:18:09 [conn1498] query admin.$cmd ntoreturn:1 command: { splitChunk: "crawler.ad", keyPattern:
, min:
{ _id: 662009843887 }, max:
{ _id: 10000068480361 }, from: "partitionB/flash,gordon:27018", splitKeys: [
{ _id: 662009883156 } ], shardId: "crawler.ad-_id_662009843887", configdb: "flash:27019,gordon:27019,mara:27019" } reslen:53 487ms
Fri Mar 18 18:19:00 [conn1191] insert crawler.ad 390ms
Fri Mar 18 18:19:24 [conn159] query admin.$cmd ntoreturn:1 command:
reslen:60 300048ms
Fri Mar 18 18:19:24 [conn163] query admin.$cmd ntoreturn:1 command:
reslen:60 300048ms
Fri Mar 18 18:19:56 [conn1183] query admin.$cmd ntoreturn:1 command:
reslen:60 300048ms
Fri Mar 18 18:20:25 [initandlisten] connection accepted from 94.75.192.38:33966 #1502
Fri Mar 18 18:20:25 [conn1502] end connection 94.75.192.38:33966
Fri Mar 18 18:20:34 [initandlisten] connection accepted from 94.75.192.38:33980 #1503
Fri Mar 18 18:20:34 [conn1503] end connection 94.75.192.38:33980
Fri Mar 18 18:21:54 [conn1187] insert crawler.ad 278ms
Fri Mar 18 18:22:30 [conn5] query admin.$cmd ntoreturn:1 command:
reslen:60 300047ms
Fri Mar 18 18:22:39 [LockPinger] dist_lock pinged successfully for: OBE017.local:1300468689:1182095420
Fri Mar 18 18:22:52 [conn1190] insert crawler.ad 154ms
...
Then, for some time (I think until mongos restart, but not sure) keeps saying this mixed with normal logs:
Fri Mar 18 18:46:45 [conn1529] request split points lookup for chunk crawler.ad { : 10662019454956 } -->> { : MaxKey }
Fri Mar 18 18:46:47 [conn1529] request split points lookup for chunk crawler.ad { : 10662019454956 } -->> { : MaxKey }
Fri Mar 18 18:46:47 [conn1529] request split points lookup for chunk crawler.ad { : 10662019454956 } -->> { : MaxKey }
Fri Mar 18 18:46:49 [conn1529] request split points lookup for chunk crawler.ad { : 10662019454956 } -->> { : MaxKey }
...
But looks like no error appears anywhere on mongod logs (No error neither on config server logs)
I'll try to use 1.8.0 instead of RC2, let's see.
Post you if any change...