Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-3177

Massive Performance hit on index creation when doing a db.copyDatabase or a mongorestore on a server with a replicated set configuration

    • Type: Icon: Bug Bug
    • Resolution: Cannot Reproduce
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 1.8.1
    • Environment:
      Ubuntu server 10.04.2 LTS
      12 GB ram
      2 x raid 1 volumes, data on one volume, journal on another.
      2 x 6 core 2.9 GHz cpu with hyperthreading (24 logical cpus)
    • Linux

      When restoring a large collection on a mongodb server with a replicated set configuration, index creation becomes incredibly slow. In our case, the collection index creation could take anywhere from between 20 minutes to over half an hour

      The exact same collection restored to a test mongodb server running inside a virtual machine with much smaller specs, but with no replication sets, the same indexes are created in 5 to 6 seconds.

      Examing the log files, we can see the following:

      // log file from big server, replication sets, secondary server on
      Wed Jun 1 03:37:31 [conn16] building new index on

      Unknown macro: { apps.aid}

      for cassia.snapshots_2011053100_01
      239000/326021 73%
      259900/326021 79%
      268900/326021 82%
      277600/326021 85%
      287400/326021 88%
      289600/326021 88%
      300000/326021 92%
      310000/326021 95%
      311900/326021 95%
      314200/326021 96%
      316500/326021 97%
      318400/326021 97%
      320500/326021 98%
      322600/326021 98%
      324900/326021 99%
      Wed Jun 1 03:40:28 [conn16] done building bottom layer, going to commit
      Wed Jun 1 03:40:37 [conn16] done for 1824 records 185.837secs
      Wed Jun 1 03:40:37 [conn16] insert cassia.system.indexes 185837ms
      Wed Jun 1 03:40:37 [dur] lsn set 79905899
      Wed Jun 1 03:40:37 [conn16] building new index on

      Unknown macro: { cnt}

      for cassia.snapshots_2011053100_02
      Wed Jun 1 03:40:39 [conn16] done for 1823 records 2.376secs
      Wed Jun 1 03:40:39 [conn16] insert cassia.system.indexes 2376ms
      Wed Jun 1 03:40:39 [conn16] building new index on

      Unknown macro: { apps.aid}

      for cassia.snapshots_2011053100_02
      100/1823 5%
      200/1823 10%
      300/1823 16%
      1000/1823 54%
      1100/1823 60%
      1200/1823 65%
      1300/1823 71%
      1800/1823 98%
      Wed Jun 1 03:54:09 [conn16] external sort used : 170 files in 809 secs
      100/325822 0%
      126600/325822 38%
      128200/325822 39%
      129500/325822 39%
      130200/325822 39%
      131300/325822 40%
      131900/325822 40%
      132900/325822 40%
      133600/325822 41%
      134100/325822 41%
      134900/325822 41%
      135900/325822 41%
      136800/325822 41%
      137900/325822 42%
      138800/325822 42%
      139900/325822 42%
      140700/325822 43%
      141400/325822 43%
      141900/325822 43%
      142900/325822 43%
      143800/325822 44%
      144900/325822 44%
      145800/325822 44%
      146700/325822 45%
      148300/325822 45%
      149800/325822 45%
      150200/325822 46%
      305200/325822 93%
      Wed Jun 1 03:59:40 [conn16] done building bottom layer, going to commit
      Wed Jun 1 03:59:40 [conn16] done for 1823 records 1140.97secs
      Wed Jun 1 03:59:40 [conn16] insert cassia.system.indexes 1140969ms
      Wed Jun 1 03:59:40 [conn113] getmore cassia.system.namespaces cid:6678415019539336828 getMore: {} bytes:77862 nreturned:1396 1355345ms
      Wed Jun 1 03:59:40 [conn99] query cassia.snapshots_2011060101_02 ntoreturn:1 reslen:44018 nscanned:1

      Unknown macro: { cnt}

      nreturned:1 1143466ms
      Wed Jun 1 03:59:40 [dur] lsn set 81033909

      The log files from the small server, single mongo, no replicated sets shows the following:

      Wed Jun 1 03:53:35 [conn2] building new index on

      Unknown macro: { apps.aid}

      for cassia.snapshots_2011053100_02
      Wed Jun 1 03:53:41 [conn2] done for 1823 records 5.612secs

      As a test, I shutdown the secondary server while the restore was happening and the index creation time on the primary server when down to a much more acceptable 5 to 6 seconds.

      More datapoints:

      • using the db.currentOp() command shows the index creation as "external sort used" on the primary server with the secondary server active
      • that also appears as "external sort used : 170 files in 809 secs" in the log file
      • after shutting down the secondary server, the "external sort used" goes away
      • "external sorts" never appear in the log files on the same data set in the test database.
      • the slow performance occurs whether I use the mongorestore command or the db.copyDatabase("old collection", "new collection", "old server") command

            Assignee:
            aaron Aaron Staple
            Reporter:
            pdwalker Paul D. Walker
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: