[SERVER-965] Store the indexes of a collection on another partition/drive (for example a SSD) than the documents Created: 05/Apr/10  Updated: 27/Oct/15  Resolved: 02/Dec/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Performance, Storage
Affects Version/s: None
Fix Version/s: 2.8.0-rc2

Type: New Feature
Priority: Major - P3
Reporter: Loïc d'Anterroches
Assignee: Sam Kleinman (Inactive)
Resolution: Done
Votes: 31
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-4462 Document storage of indexes in a sepa... Closed
Duplicate
is duplicated by SERVER-8179 Allow indexes to live in a separate f... Closed
is duplicated by SERVER-5712 separate files for data and index Closed
is duplicated by SERVER-14478 Separate disks for indexes Closed
Related
related to SERVER-16567 extend data directory metadata to hol... Closed
is related to SERVER-2875 Multiple dbpath, and by object assign... Closed
Participants:

 Description   

With very large collections, getting a server with enough RAM is hard, but adding an SSD to the box is easy. A good way to improve performance would be to store the indexes on a partition of the SSD and the dataset on a conventional drive.



 Comments   
Comment by Ramon Fernandez Marina [ 14/Apr/15 ]

michaelbrenden, can you please open a new ticket about this last issue and provide the server logs that you're getting?

Thanks,
Ramón.

Comment by Michael Brenden [ 14/Apr/15 ]

If the /mongo/db/website1/index link points to a non-existent location, the entire mongo server core dumps.

Comment by Michael Brenden [ 14/Apr/15 ]

Restarting with a fresh, empty data dir worked. The docs should note this.

Comment by Alexander Gorrod [ 13/Apr/15 ]

Are you starting with a clean data directory each time you start mongod with the different settings?
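
The error reported on 10/Apr/15 ("Metadata contains unexpected value ... Expected true but got false") suggests that the directoryForIndexes setting is recorded when the data directory is first created, so changing it on an existing directory fails. A minimal sketch of a pre-start check, assuming a POSIX shell; the function name is hypothetical and not part of MongoDB:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) only if the given dbpath does
# not exist yet or contains no entries at all, dotfiles included.
# Usage: is_fresh_dbpath <dbpath>
is_fresh_dbpath() {
    dir=$1
    # A directory that does not exist yet counts as fresh.
    [ -d "$dir" ] || return 0
    # ls -A lists every entry except "." and ".."; empty output means empty dir.
    [ -z "$(ls -A "$dir")" ]
}
```

For example, `is_fresh_dbpath /data/wt || echo "dbpath already initialized"` before the first start with the new setting.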

Comment by Michael Brenden [ 13/Apr/15 ]

This does not work for me with 3.0.1, precompiled, on Debian 7.8 wheezy:

storage:
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      directoryForIndexes: true

Is it possibly conflicting with the 'directory per db' setting?

Comment by Ofer Cohen [ 10/Apr/15 ]

This is without YAML, with directoryperdb on:

storageEngine=wiredTiger
directoryperdb=1

Comment by Ramon Fernandez Marina [ 10/Apr/15 ]

This works for me with 3.0.2:

storage:
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      directoryForIndexes: true

Can you please give this a try and report back?

Thanks,
Ramón.

Comment by Michael Brenden [ 10/Apr/15 ]

No combination or arrangement of mongodb.conf settings or command-line parameters I've tried works. This error is returned most frequently:
"I STORAGE [initandlisten] exception in initAndListen: 72 Metadata contains unexpected value storage engine option for directoryForIndexesExpected true but got falseinstead, terminating"

Here is my mongodb.conf in YAML:
storage:
..engine: "wiredTiger"
..dbPath: "/data/wt"
..directoryPerDB: true
..journal:
....enabled: true
..wiredTiger:
....engineConfig:
......directoryForIndexes: true

Comment by Michael Brenden [ 10/Apr/15 ]

What is the syntax for use in /etc/mongodb.conf?

wiredTigerDirectoryForIndexes = true – fail

wiredTigerDirectoryForIndexes = 1 – fail

wiredTigerDirectoryForIndexes=true – fail

wiredTigerDirectoryForIndexes=1 – fail

wiredTigerDirectoryForIndexes="/data/blah/" – fail

wiredTigerDirectoryForIndexes = "/data/blah/" – fail

storage:
  engine: "wiredTiger"
  directoryForIndexes: "/data/blah/" – fail

starting with
/usr/local/mongo/bin/mongod -f /etc/mongodb.conf --wiredTigerDirectoryForIndexes — fail

This is one thing about mongo all along, the docu for config file is positively shitty, because it's non-existent or lagging at best. And now there's some goofy-assed YAML thing to screw stuff up even more.

...not to complain, but that's on top of the scons fiasco that never "just works", and the fact that 3.0.0 will not compile on stock debian wheezy 7.8, making mongo incompatible with widest, broadest, oldest, most stable linux distro – what is with this shoot-self-in-foot path?

Comment by Daniel Pasette (Inactive) [ 04/Dec/14 ]

michaelbrenden, to be clear, this feature is only available when using the WiredTiger storage engine.

To use, invoke mongod like so:
mongod --dbpath /tmp --storageEngine wiredTiger --wiredTigerDirectoryForIndexes

In your dbpath, you will see a directory called indexes where indexes will be stored.
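
The option only creates a subdirectory under the dbpath; putting the indexes on a different drive still means relocating that subdirectory and leaving a symbolic link in its place. A minimal sketch, assuming mongod is stopped while it runs. The function name and all paths are hypothetical, and the subdirectory name is passed in as an argument because this comment calls it indexes while a later comment in this thread refers to an index link:

```shell
#!/bin/sh
# Hypothetical helper: move an index subdirectory to another drive and leave
# a symlink behind. Run only while mongod is stopped.
# Usage: relocate_index_dir <dbpath> <subdir-name> <target-dir>
# <target-dir> should be an absolute path so the symlink resolves correctly.
relocate_index_dir() {
    dbpath=$1
    subdir=$2
    target=$3

    src="$dbpath/$subdir"
    # Refuse to run on a directory that is missing or already a symlink.
    if [ -L "$src" ] || [ ! -d "$src" ]; then
        echo "nothing to move: $src" >&2
        return 1
    fi
    # Refuse to overwrite an existing target.
    if [ -e "$target" ]; then
        echo "target already exists: $target" >&2
        return 1
    fi

    # Move the directory, then link the old location to the new one.
    mv "$src" "$target" && ln -s "$target" "$src"
}
```

For example, `relocate_index_dir /tmp indexes /mnt/ssd/indexes`, where /mnt/ssd is an assumed mount point for the SSD.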

Comment by Michael Brenden [ 02/Dec/14 ]

We looked at flashcache and bcache and decided to wait. Meanwhile, Mongo comes roaring in with a proper application-level fix! Very cool, because we have the servers already waiting, loaded up with SSDs ready to hold mongo indexes.

Comment by Eliot Horowitz (Inactive) [ 02/Dec/14 ]

https://github.com/mongodb/mongo/commit/7e4de6184d876b7963946708d6e83ee57335211f

Comment by Ofer Cohen [ 25/Mar/14 ]

"This is just whopping important. We're looking into flashcache and bcache on kernel 3.2+"

Can you please elaborate? Did you try either flashcache or bcache?

Comment by Michael Brenden [ 22/Mar/13 ]

This is just whopping important. We're looking into flashcache and bcache on kernel 3.2+

Comment by Glenn Maynard [ 29/Mar/12 ]

Loic: That's a lot of work to do correctly by hand, since your manual updates to the data and indexes won't be atomic, and if your server crashes the data will probably roll back to different points, so you'll have to deal with manually fixing it up as well.

I'm surprised this hasn't received any attention; it seems like a pretty huge optimization.

Comment by Loïc d'Anterroches [ 06/Apr/10 ]

Poor man's solution for the moment:

  • create two databases, one with the bulk of the data, one with a collection containing the strict minimum for your indexes.
  • use the first database as "dumb store".
  • load the data in both databases.
  • move the "index" database to the SSD drive and symlink it back.

That way you keep one mongod running, which I suppose is a bit better for memory management than two. You just need to remember to move and symlink the index files whenever a new "index" file is created.

This means that you will need to "merge" your data at the application level, but depending on your case, this can boost performance.
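
The move-and-symlink step, including the re-run needed when a new file appears, can be sketched as below. This assumes the database's files sit directly in the data directory and are named `<dbname>.*` (for example `<dbname>.ns`, `<dbname>.0`), and that mongod is stopped while it runs. The function name and paths are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: move every regular file named <dbname>.* from the
# data directory to an SSD directory and symlink it back. Safe to re-run:
# files that are already symlinks are skipped.
# Usage: symlink_db_files <dbpath> <dbname> <ssd-dir>   (absolute paths)
symlink_db_files() {
    dbpath=$1
    dbname=$2
    ssd_dir=$3

    mkdir -p "$ssd_dir" || return 1
    for f in "$dbpath/$dbname".*; do
        # Skip an unmatched glob and anything that is not a file.
        [ -f "$f" ] || continue
        # Skip files that were already relocated on a previous run.
        [ -L "$f" ] && continue
        name=$(basename "$f")
        mv "$f" "$ssd_dir/$name" && ln -s "$ssd_dir/$name" "$f" || return 1
    done
    return 0
}
```

Running it again after a new data file has been allocated moves only the new file, which covers the "remember to symlink" step.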

Generated at Thu Feb 08 02:55:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.