  1. Core Server
  2. SERVER-4716

reexamine scan and order memory limit handling

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Querying
    • Labels: None
    • ALL

      Background

      An in-memory sort is used to reorder documents when a query's order specification is inconsistent with the ordering provided by a cursor. The implementation uses an in-memory map that stores a BSON sort key and the BSON return document associated with that key. Keys and their associated documents may move in and out of the map in the course of determining a "top n" ordering. A memory footprint cap is enforced: if at any time the cumulative storage size of the BSON keys and BSON documents contained in the map exceeds a fixed threshold of 32 MB, a user assertion is triggered.
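
      The map and its footprint cap can be sketched roughly as follows. This is a minimal illustration, not the server's implementation: TopNSorter and approxSize are hypothetical names, JSON length stands in for BSON size, sort keys are plain numbers, and the cap is assumed to be 32 * 1024 * 1024 bytes. The error message and code are taken from the log in this ticket.

```javascript
// Hypothetical sketch of a "top n" in-memory sort with a footprint cap.
const MAX_BYTES = 32 * 1024 * 1024; // assumption: "32 MB" means 32 MiB

// Assumption: approximate storage size by JSON length (the server uses BSON size).
function approxSize(value) {
  return JSON.stringify(value).length;
}

class TopNSorter {
  constructor(limit, maxBytes = MAX_BYTES) {
    this.limit = limit;
    this.maxBytes = maxBytes;
    this.entries = []; // kept sorted ascending by key
    this.bytes = 0;    // cumulative size of keys and documents currently held
  }

  add(key, doc) {
    // Find the insertion point; equal keys keep arrival order.
    let i = this.entries.length;
    while (i > 0 && this.entries[i - 1].key > key) i--;

    // A full map ignores anything that sorts after its last entry.
    if (this.entries.length >= this.limit && i === this.entries.length) return;

    const size = approxSize(key) + approxSize(doc);
    this.entries.splice(i, 0, { key, doc, size });
    this.bytes += size;

    // Evict the entry that fell out of the top n and release its size.
    if (this.entries.length > this.limit) {
      this.bytes -= this.entries.pop().size;
    }

    if (this.bytes > this.maxBytes) {
      const err = new Error("too much data for sort() with no index");
      err.code = 10129;
      throw err;
    }
  }

  results() {
    return this.entries.map((e) => e.doc);
  }
}

// Usage: keep the two documents with the smallest value of a.
const sorter = new TopNSorter(2);
[3, 1, 2].forEach((a) => sorter.add(a, { a: a }));
console.log(JSON.stringify(sorter.results()));
```

      The key point of the sketch is that every entry's size is added when it enters the map and subtracted when it leaves, so the running total always reflects what the map currently holds.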

      When a query is executed, the results from all unordered plans are iterated in an interleaved fashion. These results are deduplicated and all are sent through the same in-memory sort map. If the in-memory sort map triggers a user assertion because the memory footprint cap has been reached, one of the following will occur:
      1) If all candidate query plans are unordered, the query will be aborted with an error message related to the memory footprint cap sent to the client.
      2) If some candidate query plans are ordered and some are not, but a single cached unordered query plan is running alone, the query plan cache will be cleared and the query will be retried using all candidate plans.
      3) If some candidate query plans are ordered and some are not, and all candidate plans are running (rather than a single cached plan), the unordered query plans will be aborted but the ordered ones will continue running. On completion of the query, one of the ordered query plans will be recorded in the plan cache for the query's query pattern.
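
      The three cases above can be summarized as a small decision function. This is only a sketch of the described behavior: onSortMemoryAssertion, the plan objects with an ordered flag, and the returned strings are hypothetical and do not correspond to server code.

```javascript
// Hypothetical summary of what happens when the in-memory sort map asserts.
// candidatePlans: array of { ordered: boolean } describing all candidate plans.
// cachedPlanRunningAlone: true if a single cached unordered plan is running by itself.
function onSortMemoryAssertion(candidatePlans, cachedPlanRunningAlone) {
  const hasOrderedPlan = candidatePlans.some((p) => p.ordered);

  if (!hasOrderedPlan) {
    // Case 1: nothing to fall back on.
    return "abort query and send memory cap error to client";
  }
  if (cachedPlanRunningAlone) {
    // Case 2: the cached unordered plan failed on its own.
    return "clear plan cache and retry with all candidate plans";
  }
  // Case 3: all candidates are already running.
  return "abort unordered plans and let ordered plans continue";
}

console.log(onSortMemoryAssertion([{ ordered: false }], false));
console.log(onSortMemoryAssertion([{ ordered: true }, { ordered: false }], true));
console.log(onSortMemoryAssertion([{ ordered: true }, { ordered: false }], false));
```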

      Potential Concerns

      • A) The behavior when running the same query against the same data and the same set of indexes is not deterministic. The query may or may not fail with an in-memory sort assertion depending on which indexes are used, which may in turn depend on plan caching, hints, or the order in which indexes were created. It may also depend on the order in which documents are read, which can vary based on disk storage layout. The same query may work when run once but fail with a memory assertion when run a second time. A query may work on a primary but fail with a memory assertion on a secondary.
      • B) A memory size assertion may cause the query optimizer to fall back on an ordered query plan that performs very poorly. This query plan may be recorded in the query plan cache and used in cases where the unordered plan could run without a memory assertion. See SERVER-1021.
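
      The dependence on read order in concern A can be illustrated with a small simulation. It is a sketch under simplifying assumptions: peakFootprint is a hypothetical helper, document sizes are supplied directly in bytes, and the footprint is sampled after each eviction.

```javascript
// Hypothetical simulation: peak bytes held by a "top n" map for a given read order.
// docs: array of { key: number, size: number }, in the order they are read.
function peakFootprint(docs, limit) {
  const kept = []; // sorted ascending by key
  let bytes = 0;
  let peak = 0;
  for (const doc of docs) {
    let i = kept.length;
    while (i > 0 && kept[i - 1].key > doc.key) i--;
    // A full map never stores a document that sorts after its last entry.
    if (kept.length >= limit && i === kept.length) continue;
    kept.splice(i, 0, doc);
    bytes += doc.size;
    if (kept.length > limit) bytes -= kept.pop().size;
    peak = Math.max(peak, bytes);
  }
  return peak;
}

const small = [{ key: 1, size: 10 }, { key: 1, size: 10 }];
const large = [{ key: 2, size: 1000 }, { key: 2, size: 1000 }];

// Same documents, same limit, different read order.
console.log(peakFootprint(small.concat(large), 2)); // small documents read first
console.log(peakFootprint(large.concat(small), 2)); // large documents read first
```

      With a cap anywhere between the two peaks, the first read order succeeds and the second asserts, even though the data and the final result are identical.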

      Aaron

      ----------------------------------------------------------------------

      This means that if later documents seen by scanandorder are larger than earlier ones, their sizes may not be included in the size calculation. This can lead to very large memory allocations in pathological cases.

      test

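      // Reproduction for the mongo shell.
      c = db.c;
      c.drop();
      
      // Insert 1000 small documents first.
      for( i = 0; i < 1000; ++i ) {
          c.save( {a:1} );
      }
      
      // Roughly 10 MB string consisting of commas.
      big = new Array( 10 * 1000 * 1000 ).toString();
      
      // Insert 1000 large documents that sort ahead of the small ones.
      for( i = 0; i < 1000; ++i ) {
          c.save( {a:0,big:big} );
      }
      
      // Sort without an index; the top 1000 results are all large documents.
      c.find().sort( {a:1} ).limit( 1000 ).itcount();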
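
      For scale, the sizes involved in this test can be worked out as below. This is back-of-the-envelope arithmetic only: BSON overhead and the small documents are ignored, and the cap is assumed to be 32 * 1024 * 1024 bytes.

```javascript
// Rough sizes for the test above (BSON overhead ignored).
const bigBytes = 10 * 1000 * 1000 - 1; // new Array(n).toString() yields n - 1 commas
const bigDocs = 1000;
const capBytes = 32 * 1024 * 1024;     // assumption: "32 MB" means 32 MiB

const totalBytes = bigBytes * bigDocs;
const docsToExceedCap = Math.floor(capBytes / bigBytes) + 1;

console.log("approx. bytes across all large documents:", totalBytes);
console.log("large documents needed to exceed the cap:", docsToExceedCap);
```

      If every document's size were counted as it entered the sort, the cap would be crossed after only a handful of large documents, rather than after minutes of growing memory use as in the log below.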
      

      log

      Wed Jan 18 17:17:06 [clientcursormon] mem (MB) res:1000 virt:27930 mapped:12235
      Wed Jan 18 17:19:06 [clientcursormon] mem (MB) res:1029 virt:31012 mapped:12235
      Wed Jan 18 17:19:06 [clientcursormon] warning virtual/mapped memory differential is large. journaling:1
      Wed Jan 18 17:21:06 [clientcursormon] mem (MB) res:902 virt:34369 mapped:12235
      Wed Jan 18 17:21:40 [conn1] assertion 10129 too much data for sort() with no index ns:test.c query:{ query: {}, orderby: { a: 1.0 } }
      Wed Jan 18 17:21:40 [conn1]  ntoskip:0 ntoreturn:1000
      Wed Jan 18 17:21:40 [conn1] { $err: "too much data for sort() with no index", code: 10129 }
      Wed Jan 18 17:21:40 [conn1] query test.c ntoreturn:1000 keyUpdates:0 exception: too much data for sort() with no index code:10129 reslen:84 331507ms
      

            Assignee: Aaron Staple
            Reporter: Aaron Staple
            Votes: 3
            Watchers: 8
