Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-20336

O(N^2) perf regression in listCollections and similar code paths [BLOCKING Mongo 3.0 Adoption]

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker - P1
    • Resolution: Duplicate
    • Affects Version/s: 3.0.5
    • Fix Version/s: None
    • Component/s: Storage
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide

      Run the following script using a mongo3 shell against a mongo3 mongod. I've reproduced against 3.0.4, 3.0.5, and 3.0.6.

      conn = new Mongo();
      db = conn.getDB("dummy");
      var N = 16000;
       
      print('Creating collections');
      for(var i = 0; i < N; i++) {
        db.createCollection('collection' + i);
      }
       
      print('Listing collections.');
      var start = Date.now();
      db.getCollectionNames();
      print('Time to list collections: ', Date.now() - start);
      

      You should see a time of ~15-30 seconds for the getCollectionNames() call, and it gets much worse as N increases, since it's O(N^2). If you run the same script on mongo 2.6, it will complete in < a second, even for large values of N.

      You'll also see a hang if you restart your mongod server, or do a mongodump, and probably many other operations.

      While db.getCollectionNames() is in progress, any writes will be blocked.

      Show
      Run the following script using a mongo3 shell against a mongo3 mongod. I've reproduced against 3.0.4, 3.0.5, and 3.0.6. conn = new Mongo(); db = conn.getDB("dummy"); var N = 16000;   print('Creating collections'); for(var i = 0; i < N; i++) { db.createCollection('collection' + i); }   print('Listing collections.'); var start = Date.now(); db.getCollectionNames(); print('Time to list collections: ', Date.now() - start); You should see a time of ~15-30 seconds for the getCollectionNames() call, and it gets much worse as N increases, since it's O(N^2). If you run the same script on mongo 2.6, it will complete in < a second, even for large values of N. You'll also see a hang if you restart your mongod server, or do a mongodump, and probably many other operations. While db.getCollectionNames() is in progress, any writes will be blocked.

      Description

      Mongo 3 has an O(N^2) perf issue where N is the number of collections in a database. This is a regression from 2.x. For our dataset this causes a ~15 minutes hang, making mongo 3 completely unusable.

      The hang can be hit in many ways, including:
      1. When calling db.getCollectionNames().
      2. When starting mongod.
      3. When doing a mongodump.
      4. When a secondary mongod server in a replica set transitions to be primary.

      The O(N^2) nature can be clearly seen by this chart showing measured time to perform a db.getCollectionNames() for a given number of collections. There's also an attached graph showing a quadratic best fit.

      Number Collections (in 1000s) list collections time (seconds)
      1 0.164
      2 0.464
      4 1.6
      8 6.1
      16 24
      32 95
      64 439

      Context:

      • We have a multi-tenant system, where each tenant is served by a new collection.
      • As a result, we have on the order of 100,000 collections in a database.
      • We started upgrading to mongo 3 in our production environment, but ran into this issue (fortunately before we upgraded the primary) and had to do an emergency rollback to 2.6.
      • We're now stuck on 2.6 for the moment, but extremely eager to get the benefits of mongo 3 to address pressing issues in production.

      Can you please acknowledge this bug and provide an estimate for when it can be fixed and released?

        Attachments

        1. listCollectionsGrowth.png
          25 kB
          Michael Lehenbauer

          Issue Links

            Activity

              People

              Assignee:
              ramon.fernandez Ramon Fernandez Marina
              Reporter:
              mikelehen@google.com Michael Lehenbauer
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: