[SERVER-676] use multiple cores for index sort-phase Created: 25/Feb/10 Updated: 07/Dec/23 Resolved: 20/Nov/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Dwight Merriman | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Won't Do | Votes: | 41 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Storage Execution |
| Sprint: | Execution EMEA Team 2023-10-02 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
It would be nice if the external sort for creating an index used multiple cores. |
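As a rough illustration of the idea (not MongoDB's implementation), the sort phase can be parallelized by splitting the keys into runs, sorting each run in a separate worker process, and then merging the sorted runs in a single pass. The sketch below is in-memory only; a real external sort would spill each run to disk. The name `parallel_sort` and its parameters are hypothetical.

```python
import heapq
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, List, Sequence


def parallel_sort(
    keys: Sequence,
    workers: int = 4,
    executor_factory: Callable[..., Executor] = ProcessPoolExecutor,
) -> List:
    """Sort `keys` by sorting up to `workers` runs concurrently, then merging.

    Hypothetical helper for illustration. Pass ThreadPoolExecutor as
    `executor_factory` for quick tests where spawning processes is not wanted.
    """
    if not keys:
        return []
    # Split the input into roughly equal runs, one per worker.
    chunk_size = -(-len(keys) // workers)  # ceiling division
    chunks = [list(keys[i:i + chunk_size]) for i in range(0, len(keys), chunk_size)]
    # Sort each run in its own worker; the built-in `sorted` can be pickled
    # and sent to worker processes.
    with executor_factory(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, chunks))
    # A k-way merge combines the already-sorted runs in one sequential pass.
    return list(heapq.merge(*sorted_runs))
```

With the default `ProcessPoolExecutor`, call the function from under an `if __name__ == "__main__":` guard on platforms that start workers by spawning. The final merge is still sequential, so the speedup applies only to the run-sorting step.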
| Comments |
| Comment by Adrien Jarthon [ 07/Dec/23 ] |
|
Thanks! |
| Comment by Steven Vannelli [ 06/Dec/23 ] |
|
Hi bigbourin@gmail.com, you can follow SERVER-83953. Once the work is done, the team will update that ticket. |
| Comment by Adrien Jarthon [ 20/Nov/23 ] |
|
steven.vannelli@mongodb.com thanks for the update. Is there a ticket where we can follow this index build parallelism feature? |
| Comment by Steven Vannelli [ 20/Nov/23 ] |
|
We found some small improvements for index builds but our future plans for index builds use parallelism for the entire process and not just for the sorting phase. |
| Comment by Jordi Olivares Provencio [ 13/Oct/23 ] |
|
Putting this back in the backlog as we identified a few ways of improving the throughput of index builds: Both of these tickets combined yielded significant improvements without resorting to complete refactors of the index build architecture. |
| Comment by Connie Chen [ 13/Sep/21 ] |
|
We can consider this during initial sync, repair, or any other operation that is known to be single-threaded. |
| Comment by Piyush Katariya [ 02/Jul/20 ] |
|
This issue needs reconsideration. In addition, it would be helpful if rebuilding the index did not acquire a lock on the collection. |
| Comment by oleg gritsak [ 30/Mar/19 ] |
|
So sad to see this feature request in the low-priority queue for almost a decade.
Speed is the key feature of Mongo for me, and now it is even more important to match competitors. PostgreSQL recently implemented MT-indexing. Oracle has been able to do it for years. I had a task to import 20 billion (20,000 million) short documents into Mongo and it failed miserably. Batch insert speed is impressive: more than a million inserts/sec. But creating an index on a 2 TB collection is going to last forever... |
| Comment by Roy Reznik [ 07/Nov/16 ] |
|
Thanks Eric, |
| Comment by Eric Milkie [ 07/Nov/16 ] |
|
Roy, I believe that other initial sync improvements will have a bigger impact. Some of these improvements are already implemented for the 3.4 release – we now sort all the index keys for a collection during the data copy phase, for example, which avoids multiple passes through the data. Eventually, I would like to see multiple collections cloning simultaneously, which would permit multiple index builds running on multiple cores. |
| Comment by Roy Reznik [ 06/Nov/16 ] |
|
Not sure why this issue is marked as "minor" - it has a huge impact on initial sync, which is very slow and bound to a single CPU in the index build phase...