  1. Core Server
  2. SERVER-87735

Reconsider semantics of 'allowDiskUse: true by default' in context of a sharded cluster

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Execution

      In version 6.0, allowDiskUse: true became the default setting on mongod (that is, allowDiskUse effectively became opt-out). In a sharded context, this behavior did not change, because mongos cannot spill to disk. Intuitively this makes sense, but it can cause some surprising behavior.

      Suppose I run a $group query against mongos. When mongos builds a distributed plan, it can, somewhat surprisingly, pick itself as the node that executes the merging $group instead of always picking a shard to do this. As a result, if the $group executing on mongos exceeds the 100 MB memory limit, the query fails, whereas on a mongod the same stage would simply have spilled to disk.
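
      The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `run_merging_group`, and `MemoryLimitExceeded` are hypothetical names invented for illustration and do not correspond to MongoDB internals. The only fact taken from the ticket is the 100 MB limit and that mongos cannot spill.

```python
# Toy model only: every name here is hypothetical, not a MongoDB API.
from dataclasses import dataclass

# The 100 MB per-stage memory limit cited in the ticket.
MEMORY_LIMIT_BYTES = 100 * 1024 * 1024


class MemoryLimitExceeded(Exception):
    """Raised when a blocking stage exceeds the limit and cannot spill."""


@dataclass
class Node:
    name: str
    can_spill: bool  # True for a mongod, False for mongos


def run_merging_group(node: Node, group_state_bytes: int,
                      allow_disk_use: bool = True) -> str:
    """Return how the merging $group completes on `node`, or raise."""
    if group_state_bytes <= MEMORY_LIMIT_BYTES:
        return "in-memory"
    # Over the limit: only a node that can spill, with spilling allowed, survives.
    if node.can_spill and allow_disk_use:
        return "spilled"
    raise MemoryLimitExceeded(
        f"$group on {node.name} exceeded {MEMORY_LIMIT_BYTES} bytes"
    )
```

      In this model, the same oversized merge succeeds on `Node("shard0", can_spill=True)` and raises on `Node("mongos", can_spill=False)`, which is the asymmetry the ticket describes.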

      This ticket proposes:

      • Having a command-level default setting of allowDiskUse: true on mongos
      • Changing the meaning of allowDiskUse on mongos: instead of 'spill to disk', it would mean 'only consider query plans that can guarantee spilling to disk' (that is, the setting would inhibit the execution of spill-aware stages on mongos).
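
      One possible reading of the proposed semantics, sketched as a merge-location filter. This is an assumption-laden illustration, not the planner's actual logic: `eligible_merge_locations` and `SPILL_AWARE_STAGES` are hypothetical names, and the stage set is an illustrative subset rather than an authoritative list.

```python
# Toy model only: hypothetical names, not MongoDB's planner.
from typing import List

# Illustrative subset of stages that may need to spill (assumption).
SPILL_AWARE_STAGES = {"$group", "$sort"}


def eligible_merge_locations(merge_stages: List[str],
                             allow_disk_use: bool = True) -> List[str]:
    """Return the node kinds allowed to run the merging part of the plan."""
    has_spill_aware_stage = any(s in SPILL_AWARE_STAGES for s in merge_stages)
    # Proposed meaning: with allowDiskUse set, only plans that can guarantee
    # spilling are considered, so spill-aware stages are kept off mongos.
    if allow_disk_use and has_spill_aware_stage:
        return ["shard"]
    return ["mongos", "shard"]
```

      Under this sketch, a merging `$group` with the default `allow_disk_use=True` is only eligible to run on a shard, while an explicit opt-out leaves mongos as a candidate.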

            Assignee:
            backlog-query-execution [DO NOT USE] Backlog - Query Execution
            Reporter:
            mihai.andrei@mongodb.com Mihai Andrei
            Votes:
            0
            Watchers:
            18

              Created:
              Updated: