[DOCS-14071] Remove all references to "Indexes should fit in RAM" and similar variants Created: 23/Dec/20 Updated: 22/Jan/24 Due: 25/Dec/20 |
|
| Status: | Backlog |
| Project: | Documentation |
| Component/s: | manual, Server |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Shakir Sadikali | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 7 |
| Labels: | backlog, proactive, query | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
| Participants: | |||||
| Days since reply: | 2 years, 51 weeks, 5 days ago | ||||
| Epic Link: | DOCSP-11702 | ||||
| Story Points: | 2 | ||||
| Description |
We have numerous references in our documentation where we advise customers that "indexes should fit in RAM" for optimal performance. For example: Notice we contradict the entire point of the page with the and https://docs.mongodb.com/manual/applications/indexes/

There are a few problems with making these statements.

While it's true that having indexes in RAM means you don't have to read them from disk, THIS IS TRUE FOR ALL DATA USED BY THE APPLICATION. There's nothing special about having "just" indexes in RAM. The more of your "entire database" you have in RAM, the better the performance will be.

The working set is composed of both data and index pages, and MongoDB offers no mechanism to pin specific data in cache. When a query executes, it must sequentially read the relevant index keys by pulling their pages into cache, and then pull the relevant document pages. This means that the working set needs to include both index pages and data pages for the most accessed data.

We don't store an entire index in RAM... that would be a massive waste of RAM if only a small portion of the index is actually used for data access.

This advice ignores how WiredTiger works and gives customers flawed guidance. When customers read these statements, they inevitably open support cases where we need to walk them away from what our documentation says, because the issue is ultimately more nuanced.

An alternative approach would emphasize the need to manage the working set, describing how index and data access occur in a running query, and how making queries efficient makes performance optimal. I'm happy to assist if we'd like to alter the approach we're taking now.

Scope of changes
Impact to Other Docs
MVP (Work and Date)
Resources (Scope or Design Docs, Invision, etc.) |
| Comments |
| Comment by Asya Kamsky [ 12/Feb/21 ] |
|
I'm more concerned that we say the full index should fit in RAM, when right-balanced indexes perform well with just the hot part of the index in RAM.
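
To make the working-set argument above concrete, here is a minimal sketch in Python. It is a toy model, not WiredTiger: the `PageCache` class, the `simulate` function, the plain LRU eviction policy, and all sizes (keys per page, cache capacity, query counts) are assumptions chosen only for illustration. Each simulated query touches one index page and then one data page, and the cache is deliberately smaller than the full index. Hot keys sit at the high end of the key range, as they might with a right-balanced index on a monotonically increasing key.

```python
import random
from collections import OrderedDict


class PageCache:
    """Tiny LRU page cache; a hypothetical stand-in, not WiredTiger's cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def touch(self, page_id):
        # A hit refreshes recency; a miss loads the page and may evict the
        # least recently used one.
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
            self.hits += 1
        else:
            self.misses += 1
            self.pages[page_id] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def simulate(hot_fraction, total_keys=1_000_000, cache_pages=2_000,
             queries=50_000, keys_per_index_page=100, docs_per_data_page=10,
             seed=0):
    """Return the cache hit ratio when queries hit only the top `hot_fraction` of keys.

    With the default (assumed) sizes the full index is 10,000 pages, so the
    2,000-page cache cannot hold it, let alone the 100,000 data pages.
    """
    rng = random.Random(seed)
    hot_keys = max(1, int(total_keys * hot_fraction))
    cache = PageCache(cache_pages)
    for _ in range(queries):
        key = rng.randrange(total_keys - hot_keys, total_keys)
        # A query reads the index page for the key, then the document's page.
        cache.touch(("index", key // keys_per_index_page))
        cache.touch(("data", key // docs_per_data_page))
    return cache.hit_ratio()


if __name__ == "__main__":
    print("hot 1% of keys:", simulate(hot_fraction=0.01))
    print("uniform access:", simulate(hot_fraction=1.0))
```

In this model the hot 1% of keys needs about 100 index pages and 1,000 data pages, so it fits in the cache and the hit ratio stays close to 1 even though the full index does not fit. With uniform access over all keys, index and data pages evict one another and the hit ratio collapses. What matters is whether the index and data pages that queries actually touch fit together, not whether the whole index does.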