Summarized from https://mongodb.slack.com/archives/C039SULHYGH/p1684257620795129
$search has custom logic for limit pushdown, and that logic misses an optimization when the $limit follows a $skip stage. A skip of 50 followed by a limit of 50 should be pushed down to each shard as a limit of 100 (skip + limit), with the skip and the final limit of 50 then applied on mongos. Instead, each shard has to return a full batch.
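To make the expected behavior concrete, here is a minimal sketch in plain JavaScript. It is not server code: the shard data and the helper names (makeShards, shardResults, mergeSkipLimit, runQuery) are hypothetical. It assumes each shard returns hits sorted by descending score, and that a correct pushdown asks each shard for skip + limit documents so that the router can still skip the first `skip` results after merging.

```javascript
// Hypothetical model of limit pushdown past a $skip in a sharded query.
// Assumption: each shard returns its hits sorted by descending score.

// Build numShards shards, each with hitsPerShard hits and distinct scores.
function makeShards(numShards, hitsPerShard) {
  const shards = [];
  for (let s = 0; s < numShards; s++) {
    const hits = [];
    for (let i = 0; i < hitsPerShard; i++) {
      hits.push({ _id: `s${s}-${i}`, score: hitsPerShard - i + s / 10 });
    }
    shards.push(hits);
  }
  return shards;
}

// shardLimit === null models "no limit pushed down": the shard returns everything.
function shardResults(shardHits, shardLimit) {
  return shardLimit === null ? shardHits.slice() : shardHits.slice(0, shardLimit);
}

// Router side (mongos in this ticket): merge by score, then apply skip and limit.
function mergeSkipLimit(perShard, skip, limit) {
  const merged = perShard.flat().sort((a, b) => b.score - a.score);
  return merged.slice(skip, skip + limit);
}

// Run the query with or without pushing skip + limit down to the shards.
function runQuery(shards, skip, limit, pushDown) {
  const shardLimit = pushDown ? skip + limit : null;
  const perShard = shards.map((hits) => shardResults(hits, shardLimit));
  const transferred = perShard.reduce((n, docs) => n + docs.length, 0);
  return { docs: mergeSkipLimit(perShard, skip, limit), transferred };
}

const shards = makeShards(3, 1000);
const withPushdown = runQuery(shards, 50, 50, true);
const withoutPushdown = runQuery(shards, 50, 50, false);

// Same 50 documents either way, but far fewer cross the network with pushdown.
console.log(withPushdown.transferred, withoutPushdown.transferred); // 300 3000
```

With skip = 0 the same rule degenerates to a plain per-shard limit of 50, which is why a {$skip: 0} stage should cost no more than omitting the stage entirely.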
Example 1: Sharded cluster comparing skip=0 to no skip
db.listings.aggregate([{ $search: { queryString: { defaultPath: "data.host_type", query: "h*" }} }, {$limit: 50}]): 249ms
db.listings.aggregate([{ $search: { queryString: { defaultPath: "data.host_type", query: "h*" }} }, {$skip: 0}, {$limit: 50}]): 752ms
Example 2: Sharded cluster, problem seems exacerbated by including a $project stage
db.listings.aggregate([{ $search: { queryString: { defaultPath: "data.host_type", query: "h*" }} }, { $project: { highlight: { "$meta": "searchScore" } }}, {$limit: 50}]): 495ms
db.listings.aggregate([{ $search: { queryString: { defaultPath: "data.host_type", query: "h*" }} }, { $project: { highlight: { "$meta": "searchScore" } }}, {$skip: 0}, {$limit: 50}]): 12,528ms
Explain gist for query with {$skip: 0}: https://gist.github.com/Edarke/c663b6f1acafa05af713781e4ce096eb
Explain gist for query without skip: https://gist.github.com/Edarke/5a423dc836a532d4b38228c421eac78c