[SERVER-5712] separate files for data and index Created: 26/Apr/12  Updated: 23/Jul/14  Resolved: 26/Apr/12

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Minor - P4
Reporter: Neil Sanchala Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-965 Store the indexes of a collection on ... Closed
Participants:

 Description   

Would it be feasible for mongo to be slightly more granular about its data files, and separate data from index? For example, right now, everything for the "foursquare" database is stored in:

  • foursquare.N where N is a number

Instead, the data and index could be separated into:

  • foursquare.data.N
  • foursquare.index.N
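
To make the proposed layout concrete, here is a minimal sketch of how a script might sort files into data and index groups. The `<database>.data.N` / `<database>.index.N` pattern is only the naming proposed in this ticket, not a layout mongo actually produces, and `group_by_kind` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Hypothetical pattern for the layout proposed in this ticket:
#   <database>.data.<N>  or  <database>.index.<N>
_PROPOSED = re.compile(r"^(?P<db>.+)\.(?P<kind>data|index)\.(?P<n>\d+)$")


def group_by_kind(filenames):
    """Group file names by (database, kind); names that do not match are skipped."""
    groups = defaultdict(list)
    for name in filenames:
        match = _PROPOSED.match(name)
        if match:
            groups[(match.group("db"), match.group("kind"))].append(name)
    return dict(groups)
```

With such a split, the list under `("foursquare", "index")` is exactly the set of files one would hand to a cache-warming tool, while files in the current `foursquare.N` form are ignored.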

There are a couple of advantages to this separation:

1. We would be able to use vmtouch or other tools to pin the index in memory. Given the choice between page faults on index reads and page faults on data reads, I'd much rather have the data reads fault. The kernel is reasonably good at keeping the right pages in the page cache, but at cold startup, or when not all data fits in RAM, it would help to give it a hint about what should stay warm.

2. It would be easier to see how much of the index and how much of the data is cached. Tools exist that show how much of a file is in the page cache, but they aren't helpful in the current setup because they can't give an index/data breakdown.
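
As an illustration of the per-file measurement mentioned in point 2, the sketch below estimates what fraction of a file is resident in the page cache by calling `mincore(2)` through `ctypes`. It assumes a 64-bit Linux system; `cached_fraction` is a hypothetical helper name, and this is not how vmtouch or mongo is implemented.

```python
import ctypes
import mmap
import os

# Load the C library of the running process (Linux assumed).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.mmap.restype = ctypes.c_void_p
_libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                       ctypes.c_int, ctypes.c_int, ctypes.c_long]
_libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                          ctypes.POINTER(ctypes.c_ubyte)]

_MAP_FAILED = ctypes.c_void_p(-1).value


def cached_fraction(path):
    """Return the fraction (0.0 to 1.0) of a file's pages resident in the page cache."""
    size = os.path.getsize(path)
    if size == 0:
        return 0.0
    pages = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    fd = os.open(path, os.O_RDONLY)
    try:
        # Map the file read-only; mapping alone does not read its pages from disk.
        addr = _libc.mmap(None, size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
        if addr == _MAP_FAILED:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        try:
            # mincore fills one byte per page; the lowest bit marks residency.
            vec = (ctypes.c_ubyte * pages)()
            if _libc.mincore(addr, size, vec) != 0:
                err = ctypes.get_errno()
                raise OSError(err, os.strerror(err))
            resident = sum(b & 1 for b in vec)
        finally:
            _libc.munmap(addr, size)
    finally:
        os.close(fd)
    return resident / pages
```

Run against today's `foursquare.N` files, a number like this mixes index and data pages together, which is the limitation described above; with separate files, the same call would report each kind on its own.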

Generally, I'd feel much more confident operating a mongo that reads from disk if we had a bit more control over what is read from disk and what always stays in memory.

(Even better would be to have different data/index files at the per-collection level, but I could see how that would be difficult in degenerate cases with thousands of collections.)



 Comments   
Comment by Eliot Horowitz (Inactive) [ 26/Apr/12 ]

Not yet, but it's something we want to do: SERVER-965

Generated at Thu Feb 08 03:09:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.