Details
Type: Bug
Resolution: Cannot Reproduce
Priority: Major - P3
Operating System: ALL
Description
I want to archive my data as described in Tiered Hardware for Varying SLA or SLO.
My sharded cluster looks like this:
db.getSiblingDB("config").shards.find({}, { tags: 1 }) |
{ "_id" : "shard_01", "tags" : ["recent"] } |
{ "_id" : "shard_02", "tags" : ["recent"] } |
{ "_id" : "shard_03", "tags" : ["recent"] } |
{ "_id" : "shard_04", "tags" : ["archive"] } |
|
|
db.getSiblingDB("config").collections.find({ _id: "data.sessions.20210412.zoned" }, { key: 1 }) |
{
|
"_id": "data.sessions.20210412.zoned", |
"key": { "tsi": 1.0, "si": 1.0 } |
}
|
|
|
db.getSiblingDB("data").getCollection("sessions.20210412.zoned").getShardDistribution() |
|
|
Shard shard_03 at shard_03/d-mipmdb-sh1-03.swi.srse.net:27018,d-mipmdb-sh2-03.swi.srse.net:27018 |
data : 63.18GiB docs : 16202743 chunks : 2701 |
estimated data per chunk : 23.95MiB |
estimated docs per chunk : 5998 |
|
|
Shard shard_02 at shard_02/d-mipmdb-sh1-02.swi.srse.net:27018,d-mipmdb-sh2-02.swi.srse.net:27018 |
data : 55.6GiB docs : 14259066 chunks : 2367 |
estimated data per chunk : 24.05MiB |
estimated docs per chunk : 6024 |
|
|
Shard shard_01 at shard_01/d-mipmdb-sh1-01.swi.srse.net:27018,d-mipmdb-sh2-01.swi.srse.net:27018 |
data : 68.92GiB docs : 23896624 chunks : 3034 |
estimated data per chunk : 23.26MiB |
estimated docs per chunk : 7876 |
|
|
Totals
|
data : 187.72GiB docs : 54358433 chunks : 8102 |
Shard shard_03 contains 33.66% data, 29.8% docs in cluster, avg obj size on shard : 4KiB |
Shard shard_02 contains 29.62% data, 26.23% docs in cluster, avg obj size on shard : 4KiB |
Shard shard_01 contains 36.71% data, 43.96% docs in cluster, avg obj size on shard : 3KiB |
|
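For context, the zones and the sharded collection were originally set up roughly like this (a sketch reconstructed from the output above; the exact commands may have differed):

// Shard on { tsi, si } and put three shards in the "recent" zone, one in "archive".
sh.enableSharding("data")
sh.shardCollection("data.sessions.20210412.zoned", { tsi: 1, si: 1 })
sh.addShardToZone("shard_01", "recent")
sh.addShardToZone("shard_02", "recent")
sh.addShardToZone("shard_03", "recent")
sh.addShardToZone("shard_04", "archive")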
To trigger the migration I use:

// Pause balancing for the collection while the zone range is swapped.
sh.disableBalancing('data.sessions.20210412.zoned')
// Only touch the zone range if no migration for this collection is currently in flight.
if (db.getSiblingDB("config").migrations.findOne({ ns: 'data.sessions.20210412.zoned' }) == null) {
    // Passing null removes any existing zone association for the full key range,
    // then the same range is assigned to the 'archive' zone.
    sh.updateZoneKeyRange('data.sessions.20210412.zoned', { "tsi": MinKey, "si": MinKey }, { "tsi": MaxKey, "si": MaxKey }, null)
    sh.updateZoneKeyRange('data.sessions.20210412.zoned', { "tsi": MinKey, "si": MinKey }, { "tsi": MaxKey, "si": MaxKey }, 'archive')
}
sh.enableBalancing('data.sessions.20210412.zoned')
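The resulting zone range assignment can be inspected on the config server (zone ranges are stored in the config.tags collection), for example:

// Show the zone ranges currently defined for the collection.
db.getSiblingDB("config").tags.find({ ns: "data.sessions.20210412.zoned" }).pretty()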
I don't get any error, and migration starts. However, in the config server logs I get thousands or even millions of these warnings:
{
  "t": { "$date": "2021-04-15T14:56:28.984+02:00" },
  "s": "W",
  "c": "SHARDING",
  "id": 21892,
  "ctx": "Balancer",
  "msg": "Chunk violates zone, but no appropriate recipient found",
  "attr": {
    "chunk": "{ ns: \"data.sessions.20210412.zoned\", min: { tsi: \"194.230.147.157\", si: \"10.38.15.1\" }, max: { tsi: \"194.230.147.157\", si: \"10.40.230.198\" }, shard: \"shard_03\", lastmod: Timestamp(189, 28), lastmodEpoch: ObjectId('60780e581ad069faafa363ba'), jumbo: false }",
    "zone": "archive"
  }
}
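The message says the balancer cannot find an appropriate shard in the 'archive' zone to receive the chunk. Zone membership can be re-checked directly against the config database, for example:

// List the shards assigned to the "archive" zone; "draining" only appears while a shard is being removed.
db.getSiblingDB("config").shards.find({ tags: "archive" }, { _id: 1, tags: 1, draining: 1 })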
Eventually the file system reached 100% and MongoDB stopped working.
How can this be? `MinKey` / `MaxKey` should cover all values.
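Since these are the structured log messages introduced in MongoDB 4.4, the balancer's own assessment of the collection can also be queried as a further check (a sketch, assuming MongoDB 4.4 or newer):

// Reports whether the collection's chunks are balanced and comply with the defined zones.
sh.balancerCollectionStatus("data.sessions.20210412.zoned")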