Details
-
Bug
-
Resolution: Done
-
Major - P3
-
None
-
5.0.12, 5.0.9
-
None
-
None
-
ALL
-
Description
When trying to reshard a collection we are getting an error message telling that the sort limit is exceeded.
The resarding command is being run as follows:
[direct: mongos] admin> sh.reshardCollection('database.messages', {_id: 'hashed'}) |
MongoServerError: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. |
The collection contains approx. 2.4B documents split across 10 shards.
Shard distribution looks like this:
[direct: mongos] database> db.messages.getShardDistribution()
|
Shard rs06 at rs06/mongo-pm-repl06-1:27018,mongo-pm06:27018 |
{
|
data: '78.1GiB', |
docs: 360057505, |
chunks: 5669, |
'estimated data per chunk': '14.1MiB', |
'estimated docs per chunk': 63513 |
}
|
---
|
Shard shard0001 at rs02/DS5033:27018,mongo-pm-repl02-1:27018 |
{
|
data: '49.25GiB', |
docs: 245035457, |
chunks: 5670, |
'estimated data per chunk': '8.89MiB', |
'estimated docs per chunk': 43216 |
}
|
---
|
Shard shard0000 at rs01/DS5032:27018,mongo-pm-repl01-1:27018 |
{
|
data: '43.17GiB', |
docs: 217760699, |
chunks: 5669, |
'estimated data per chunk': '7.79MiB', |
'estimated docs per chunk': 38412 |
}
|
---
|
Shard rs09 at rs09/mongo-pm-repl09-1:27018,mongo-pm09:27018 |
{
|
data: '47.01GiB', |
docs: 220116142, |
chunks: 5669, |
'estimated data per chunk': '8.49MiB', |
'estimated docs per chunk': 38828 |
}
|
---
|
Shard rs03 at rs03/mongo-pm-repl03-1:27018,mongo-pm03:27018 |
{
|
data: '37.81GiB', |
docs: 190704378, |
chunks: 5670, |
'estimated data per chunk': '6.82MiB', |
'estimated docs per chunk': 33633 |
}
|
---
|
Shard rs05 at rs05/DS5085:27018,mongo-pm-repl05-1:27018 |
{
|
data: '36.7GiB', |
docs: 185097348, |
chunks: 5669, |
'estimated data per chunk': '6.63MiB', |
'estimated docs per chunk': 32650 |
}
|
---
|
Shard rs07 at rs07/mongo-pm-repl07-1:27018,mongo-pm07:27018 |
{
|
data: '88.01GiB', |
docs: 413919369, |
chunks: 5669, |
'estimated data per chunk': '15.89MiB', |
'estimated docs per chunk': 73014 |
}
|
---
|
Shard rs04 at rs04/mongo-pm-repl04-1:27018,mongo-pm04:27018 |
{
|
data: '38.33GiB', |
docs: 193215609, |
chunks: 5677, |
'estimated data per chunk': '6.91MiB', |
'estimated docs per chunk': 34034 |
}
|
---
|
Shard rs08 at rs08/mongo-pm-repl08-1:27018,mongo-pm08:27018 |
{
|
data: '45.07GiB', |
docs: 207666324, |
chunks: 5668, |
'estimated data per chunk': '8.14MiB', |
'estimated docs per chunk': 36638 |
}
|
---
|
Shard rs10 at rs10/mongo-pm-repl10-1:27018,mongo-pm10:27018 |
{
|
data: '37.04GiB', |
docs: 178733333, |
chunks: 2332, |
'estimated data per chunk': '16.26MiB', |
'estimated docs per chunk': 76643 |
}
|
---
|
Totals
|
{
|
data: '7.81030509569951e+100GiB', |
docs: 2412306164, |
chunks: 53362, |
'Shard rs06': [ |
'0 % data', |
'14.92 % docs in cluster', |
'232B avg obj size on shard' |
],
|
'Shard shard0001': [ |
'0 % data', |
'10.15 % docs in cluster', |
'215B avg obj size on shard' |
],
|
'Shard shard0000': [ |
'0 % data', |
'9.02 % docs in cluster', |
'212B avg obj size on shard' |
],
|
'Shard rs09': [ |
'0 % data', |
'9.12 % docs in cluster', |
'229B avg obj size on shard' |
],
|
'Shard rs03': [ '0 % data', '7.9 % docs in cluster', '212B avg obj size on shard' ], |
'Shard rs05': [ |
'0 % data', |
'7.67 % docs in cluster', |
'212B avg obj size on shard' |
],
|
'Shard rs07': [ |
'0 % data', |
'17.15 % docs in cluster', |
'228B avg obj size on shard' |
],
|
'Shard rs04': [ '0 % data', '8 % docs in cluster', '213B avg obj size on shard' ], |
'Shard rs08': [ '0 % data', '8.6 % docs in cluster', '233B avg obj size on shard' ], |
'Shard rs10': [ '0 % data', '7.4 % docs in cluster', '222B avg obj size on shard' ] |
}
|
The only reason I see for such behavior is that reshardCollection command does some preliminary queries/aggregation involving sorting for further work, like counting documents to copy etc., and during this phase something sort memory limit gets exceeded due to a large number of documents or chunks in the collection.
Probably those sort operations need to be done with allowDiskUse option set to tru
Attachments
Issue Links
- is related to
-
SERVER-68139 Resharding command fails if the projection sort is bigger than 100MB
-
- Closed
-