  1. Core Server
  2. SERVER-13320

foreground index builds are much slower than background on large collections

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 2.4.4, 2.6.0-rc1
    • Fix Version/s: 2.6.0
    • Component/s: Indexing
    • Labels: None
    • Operating System: ALL
    • Steps To Reproduce:

      Clone https://github.com/nelhage/mongod-tests and run "index.py". This starts a local mongod and begins inserting records into a test collection, pausing periodically to run both a foreground (fg) and a background (bg) index build and report the timings. (Note that this uses an unbounded amount of storage in /tmp; you can redirect it by setting TMPDIR=... in the environment.)

      Sample output on 2.4.4: https://nelhage.com/paste/2014-03-20EdWTQGGH
      Sample output on 2.6.0-rc1: https://nelhage.com/paste/2014-03-20JXX2L3HT
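
For readers who do not want to clone the repository, the timing step can be sketched roughly as follows. This is not the actual "index.py"; it is a minimal illustration that assumes a PyMongo-style collection object exposing `create_index` and `drop_index`, and a server version that still honors the `background` option (as the versions in this report do). The helper names `time_index_build` and `compare_builds` are hypothetical.

```python
import time


def time_index_build(coll, field, background):
    """Build a single-field ascending index, return elapsed seconds, then drop it."""
    start = time.perf_counter()
    # create_index returns the name of the index it built (PyMongo behavior)
    name = coll.create_index([(field, 1)], background=background)
    elapsed = time.perf_counter() - start
    # Drop the index so the next build starts from scratch
    coll.drop_index(name)
    return elapsed


def compare_builds(coll, field):
    """Time one foreground and one background build of the same index."""
    fg = time_index_build(coll, field, background=False)
    bg = time_index_build(coll, field, background=True)
    return {"fg": fg, "bg": bg}
```

With PyMongo installed and a mongod running locally, `coll` could be something like `MongoClient()["test"]["records"]` (database and collection names here are placeholders), and `compare_builds(coll, "x")` would return the two timings in seconds.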


      Description

      Ostensibly, a main reason for having separate foreground and background builds is that foreground index builds are faster, at the cost of blocking the server [1]. However, we have observed, both in production and in synthetic benchmarks, that foreground index builds are often much slower on large collections.

      In a synthetic benchmark, building a trivial index on a collection of about 6M records, each about 1 KB in size, I saw the following numbers:

      [2014-03-21 00:05:10,042 22310|INFO] Done. items=6250000 fg=145.1 bg=94.7
      [2014-03-21 00:09:01,545 22310|INFO] Done. items=6250000 fg=135.7 bg=95.4
      [2014-03-21 00:12:43,822 22310|INFO] Done. items=6250000 fg=125.1 bg=96.9
      [2014-03-21 00:16:25,450 22310|INFO] Done. items=6250000 fg=125.6 bg=95.8
      [2014-03-21 00:20:00,567 22310|INFO] Done. items=6250000 fg=122.5 bg=92.3

      The "fg=" number is seconds to build an index in the foreground, and "bg=" is for a background build. The requested indexes are identical, and are dropped each time.
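
To put a single number on the gap, the five log lines above can be parsed and averaged. The snippet below is only a convenience sketch; the regular expression assumes the "items=... fg=... bg=..." layout shown in this report, and the function names are made up for illustration.

```python
import re

# The five result lines quoted in this report
LOG_LINES = [
    "[2014-03-21 00:05:10,042 22310|INFO] Done. items=6250000 fg=145.1 bg=94.7",
    "[2014-03-21 00:09:01,545 22310|INFO] Done. items=6250000 fg=135.7 bg=95.4",
    "[2014-03-21 00:12:43,822 22310|INFO] Done. items=6250000 fg=125.1 bg=96.9",
    "[2014-03-21 00:16:25,450 22310|INFO] Done. items=6250000 fg=125.6 bg=95.8",
    "[2014-03-21 00:20:00,567 22310|INFO] Done. items=6250000 fg=122.5 bg=92.3",
]

PATTERN = re.compile(r"items=(\d+) fg=([\d.]+) bg=([\d.]+)")


def parse_runs(lines):
    """Return a list of (items, fg_seconds, bg_seconds) tuples."""
    runs = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            runs.append((int(m.group(1)), float(m.group(2)), float(m.group(3))))
    return runs


def summarize(runs):
    """Return (mean fg, mean bg, ratio of mean fg to mean bg)."""
    mean_fg = sum(fg for _, fg, _ in runs) / len(runs)
    mean_bg = sum(bg for _, _, bg in runs) / len(runs)
    return mean_fg, mean_bg, mean_fg / mean_bg


runs = parse_runs(LOG_LINES)
mean_fg, mean_bg, ratio = summarize(runs)
print(f"mean fg={mean_fg:.1f}s mean bg={mean_bg:.1f}s fg/bg={ratio:.2f}")
```

For these five runs the means come out to about 130.8 s (fg) and 95.0 s (bg), so the foreground build took roughly 1.4 times as long as the background build.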

      I don't have hard numbers right now, but experience in production suggests that the difference only gets worse as the collection gets even bigger.

      [1] http://docs.mongodb.org/manual/tutorial/build-indexes-in-the-background/
