Core Server / SERVER-37108

Validate $exchange's number of buffers and buffer size limit to avoid OOM

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.1.4
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Query 2018-09-24

      $exchange accepts any integer value for the number of consumers and the size of the $exchange buffers. However, we don't perform any checks to ensure that the values aren't ridiculously large. This can lead to an OOM crash when $exchange continuously fills its buffers: with a large enough limit, it will keep filling them until the entire collection is resident in memory inside the buffers.

      For the 4.2 release, we are not planning to expose a way for users to configure $exchange's maximum buffer size for "real" exchange plans, and for now we will always default to 16 MiB. I have some arbitrary limits to suggest:

      • Set the maximum buffer size to the default (i.e. 16 MiB), or perhaps somewhat larger; say, 1024 MiB.
      • Set the maximum number of consumers to 100 or 200, based on some real-world data of the sizes of typical sharded clusters.

      Limiting the number of consumers might be contentious – if a user had a really large (300+ node) sharded cluster and wanted to perform a $out, the cluster aggregation planner would have to either reject an $exchange plan or set up ranges that do not match the actual sharding distribution.

      We could also consider not limiting the number of consumers, but instead mandating that numConsumers x sizeOfEachBuffer not exceed some maximum limit (10 GiB?).
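The checks could be sketched as follows. The limit constants and the function name are illustrative only, drawn from the suggestions above; the real server would report these failures via uassert with proper error codes rather than throwing:

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical limits taken from the suggestions in this ticket; these are
// not the values the server actually ships with.
constexpr std::int64_t kMaxBufferSizeBytes = 1024LL * 1024 * 1024;      // 1024 MiB
constexpr std::int32_t kMaxNumConsumers = 200;
constexpr std::int64_t kMaxTotalBufferBytes = 10LL * 1024 * 1024 * 1024; // 10 GiB

// Sketch of validating a parsed $exchange spec. Throws on invalid input;
// the real implementation would use uassert instead.
void validateExchangeSpec(std::int32_t numConsumers, std::int64_t bufferSizeBytes) {
    if (numConsumers < 1 || numConsumers > kMaxNumConsumers)
        throw std::invalid_argument("numConsumers out of range");
    if (bufferSizeBytes < 1 || bufferSizeBytes > kMaxBufferSizeBytes)
        throw std::invalid_argument("bufferSize out of range");
    // The combined cap guards against the product blowing up even when each
    // value is individually acceptable.
    if (numConsumers * bufferSizeBytes > kMaxTotalBufferBytes)
        throw std::invalid_argument("numConsumers x bufferSize exceeds total limit");
}
```

With both per-value bounds and the combined cap in place, a spec like 200 consumers with 1 GiB buffers (200 GiB worst case) is rejected even though each value passes its individual check.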

            Assignee: Nicholas Zolnierz (nicholas.zolnierz@mongodb.com)
            Reporter: Kyle Suarez (kyle.suarez@mongodb.com)
            Votes: 0
            Watchers: 5

              Created:
              Updated:
              Resolved: