When using the async driver, we have found that we cannot limit the number of concurrent queries under the async execution model.
We had assumed that limiting the number of connections in the connection pool would also limit the number of queries in flight, but this does not always hold. When a query returns more results than the batch size (20 by default), the async result retrieval returns the connection to the pool after each batch is fetched. Another query can then start on that connection, so the number of concurrent queries can exceed the size of the connection pool.
We need a way to cap the number of concurrent queries, to prevent memory issues when too many queries are processed at once. This is a common use case, and it would be very useful to support it directly in the mongo async driver.
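
To illustrate the kind of limit we have in mind, here is a minimal application-side sketch. It is not part of the driver: the class name QueryLimiter and its methods are hypothetical, and it assumes each query can be wrapped in a CompletableFuture that completes only once the last batch has been consumed. The permit is held for the whole query rather than per batch, so the cap holds even though connections go back to the pool between batches.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper (not a driver API): caps the number of async queries in flight. */
public final class QueryLimiter {
    private final Semaphore permits;

    public QueryLimiter(int maxConcurrentQueries) {
        this.permits = new Semaphore(maxConcurrentQueries);
    }

    /**
     * Blocks the caller until a permit is free, then starts the query.
     * The permit is released when the query's future completes (successfully
     * or with an error), not when an individual batch has been fetched.
     */
    public <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> query) {
        permits.acquireUninterruptibly();
        CompletableFuture<T> inFlight;
        try {
            inFlight = query.get();
        } catch (RuntimeException e) {
            // The query never started, so give the permit back immediately.
            permits.release();
            throw e;
        }
        return inFlight.whenComplete((value, error) -> permits.release());
    }

    /** Number of queries that could still be started without blocking. */
    public int availablePermits() {
        return permits.availablePermits();
    }
}
```

With a limit of N, the call to submit for query N+1 blocks until one of the running queries has finished. A non-blocking variant would have to queue the pending suppliers instead of blocking the caller, which is part of why support inside the driver would be preferable.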