Hello,
We noticed that when ShardedPartitioner sorts the chunks it reads from the config database, the query can fail with errors such as:
Executor error during find command :: caused by :: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting
We saw that SPARK-355 was raised for this and fixed some of the aggregation queries used in the other partitioners.
Starting with MongoDB 4.4, allowDiskUse can also be set on a find cursor, so this piece of code in ShardedPartitioner could be changed to use the allowDiskUse option from the read config:
client
.getDatabase(CONFIG_DATABASE)
.getCollection(CONFIG_CHUNKS, BsonDocument.class)
.find(chunksMatchPredicate)
.projection(CHUNKS_PROJECTIONS)
.sort(SORTS)
.into(new ArrayList<>()));
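A minimal sketch of the change we have in mind, not a tested patch. It assumes a Java driver version that exposes FindIterable.allowDiskUse. The class name, the method names and the constant values are placeholders for illustration (the real constants live in ShardedPartitioner), and the boolean parameter stands in for the allowDiskUse value taken from the read config.

```java
import com.mongodb.client.FindIterable;
import com.mongodb.client.MongoClient;
import com.mongodb.client.model.Projections;
import com.mongodb.client.model.Sorts;
import java.util.ArrayList;
import java.util.List;
import org.bson.BsonDocument;
import org.bson.conversions.Bson;

// Hypothetical helper class, only here to make the sketch self-contained.
final class ChunkQuerySketch {
  // Assumed values; the connector defines its own constants in ShardedPartitioner.
  static final String CONFIG_DATABASE = "config";
  static final String CONFIG_CHUNKS = "chunks";
  static final Bson CHUNKS_PROJECTIONS = Projections.include("min", "max", "shard");
  static final Bson SORTS = Sorts.ascending("min");

  // Builds the chunk query without executing it.
  static FindIterable<BsonDocument> chunksQuery(
      final MongoClient client, final Bson chunksMatchPredicate, final boolean allowDiskUse) {
    FindIterable<BsonDocument> query = client
        .getDatabase(CONFIG_DATABASE)
        .getCollection(CONFIG_CHUNKS, BsonDocument.class)
        .find(chunksMatchPredicate)
        .projection(CHUNKS_PROJECTIONS)
        .sort(SORTS);
    // Only send the option when it is enabled, so the command is unchanged
    // for deployments that do not set it.
    return allowDiskUse ? query.allowDiskUse(true) : query;
  }

  // Executes the query and collects the sorted chunk documents.
  static List<BsonDocument> loadChunks(
      final MongoClient client, final Bson chunksMatchPredicate, final boolean allowDiskUse) {
    return chunksQuery(client, chunksMatchPredicate, allowDiskUse).into(new ArrayList<>());
  }
}
```

Sending the option only when it is true keeps the find command identical to today's for users who leave allowDiskUse off, which matters because servers older than 4.4 may not accept it on find.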
Thanks for your work!
Related to: SPARK-355 ReadConfig.AllowDiskUse not applied to all pipelines (Closed)