[SERVER-81571] Reconsider stable sort in sorter.cpp Created: 29/Sep/23  Updated: 06/Feb/24

Status: Blocked
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Jordi Olivares Provencio Assignee: Brad Cater
Resolution: Unresolved Votes: 0
Labels: former-storex-namer, storex-ranked
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-85337 $bucketAuto $first does not obey inpu... In Code Review
depends on SERVER-85424 $group relies on stable sorting Closed
depends on SERVER-85510 Fix checkpoint jstests to not depend ... Closed
Related
related to SERVER-84523 Reconsider deque in sorter.cpp Closed
is related to SERVER-85337 $bucketAuto $first does not obey inpu... In Code Review
is related to DOCS-9994 Unique field must be included to ensu... Closed
is related to SERVER-676 use multiple cores for index sort-phase Closed
Assigned Teams:
Storage Execution
Sprint: Execution Team 2024-01-08, Execution Team 2024-01-22, Execution Team 2024-02-05, Repl 2024-02-19
Participants:

 Description   

During analysis of SERVER-676 we identified that stable_sort as used in the sorter can be replaced with a normal sort. This would relax requirements imposed on the sorting and use a potentially different algorithm.

Additionally, we found that the NoLimit sorter uses a deque instead of a normal vector with a reserved capacity that would offer better memory locality.

A rough patch that implemented the changes mentioned here yielded a very significant improvement in index builds.



 Comments   
Comment by Jordi Olivares Provencio [ 13/Oct/23 ]

Note that the historic reason for choosing deque goes back to pre-C++11 days. In that time there was no move optimization present and the BSONObj class used in some values would have to perform a lot of refcount inc/dec when doing vector growth due to the copies.

This shouldn't be an issue nowadays as all users of the NoLimit sorter can use the move optimisation since they have both destructor and move construct as noexcept.

Note also that since DOCS-9994 we've been communicating to users that they should use a unique field for predictable sort order. Thus we should be free from having to use stable_sort.

Generated at Thu Feb 08 06:46:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.