[SERVER-12579] blocking sort's memory accounting is wrong Created: 03/Feb/14  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: 2.5.5
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: hari.khalsa@10gen.com Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Execution
Participants:

 Description   

We track memory usage while executing a blocking sort, and the stage will (purposefully) kill itself if it uses too much. The problem is that the accounting isn't quite right. Most of the data going into the sort isn't an owned object; the BSONObj is almost always unowned and points into the memory-mapped collection. As such, the only overhead incurred is that of the various query-specific wrappers around it, plus the pointer to the on-disk mmap'd data.

Anyway, unless we have lots of document invalidations (each of which leaves the stage holding its own copy of the document), I don't think we actually use much memory when we sort; we just think we do because we charge for the size of the on-disk item. We could change the accounting and greatly raise the limit on how many documents we will sort in memory.
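
To make the difference concrete, here is a minimal sketch of the two accounting schemes. It is not the code in exec/sort.cpp: SortEntry, kPerEntryOverhead (and its value of 64 bytes), chargeByDocumentSize and chargeByHeapUsage are all hypothetical names and numbers chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one buffered sort entry; not MongoDB's actual types.
struct SortEntry {
    std::size_t docBytes;  // size of the BSON document
    bool owned;            // true if the stage holds its own copy of the bytes
};

// Assumed fixed cost of the wrappers plus the pointer; the value is illustrative.
constexpr std::size_t kPerEntryOverhead = 64;

// Accounting as described above: every entry is charged its document size,
// whether or not the stage owns the bytes.
std::size_t chargeByDocumentSize(const std::vector<SortEntry>& entries) {
    std::size_t total = 0;
    for (const SortEntry& e : entries) {
        total += e.docBytes;
    }
    return total;
}

// Alternative accounting: every entry pays the fixed overhead, and only
// owned entries also pay for their document bytes.
std::size_t chargeByHeapUsage(const std::vector<SortEntry>& entries) {
    std::size_t total = 0;
    for (const SortEntry& e : entries) {
        total += kPerEntryOverhead;
        if (e.owned) {
            total += e.docBytes;
        }
    }
    return total;
}
```

Under these assumptions, two unowned 1000-byte documents and one owned 500-byte document are charged 2500 bytes by the first scheme and 3 * 64 + 500 = 692 bytes by the second.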

If we do this, do we want the cut-off to be based on memory usage or on number of documents? With the former, we could probably hold a *lot* more documents in a blocking sort. The latter is a departure from previous behavior but could preserve the same effective behavior (a lower limit).
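
The two cut-off options could be expressed as a single check, sketched below. LimitKind, SortLimit and exceedsLimit are hypothetical names, not anything in the server today.

```cpp
#include <cstddef>

// Hypothetical policy switch for the blocking sort cut-off.
enum class LimitKind { Bytes, Documents };

struct SortLimit {
    LimitKind kind;
    std::size_t max;  // byte budget or document cap, depending on kind
};

// Returns true when the sort stage should give up under the chosen policy.
bool exceedsLimit(const SortLimit& limit, std::size_t chargedBytes, std::size_t numDocs) {
    switch (limit.kind) {
        case LimitKind::Bytes:
            return chargedBytes > limit.max;
        case LimitKind::Documents:
            return numDocs > limit.max;
    }
    return false;  // unreachable for valid enum values
}
```

With a byte-based limit, the number of documents admitted depends entirely on how chargedBytes is computed; with a document-based limit, the cap is independent of the accounting.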



 Comments   
Comment by Asya Kamsky [ 08/Feb/18 ]

Is this relevant to WT or only MMAPV1?

Comment by J Rassi [ 18/Feb/15 ]

Andy: the old version of the modified document is included in the query results, not the new version. Before the update is made public, the update subsystem will find the active query and call SortStage::invalidate() (by way of CursorManager::invalidateDocument()). SortStage::invalidate() fetches the document (which hasn't changed yet), and when query execution resumes the sort stage will return the pre-fetched document without the associated diskloc.
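
A simplified model of that sequence, for illustration only: BufferedDoc and this free function invalidate() are hypothetical and use std::string in place of BSONObj and diskloc; they are not the actual SortStage code.

```cpp
#include <string>
#include <vector>

// Hypothetical buffered sort entry: either points at the document in the
// collection (unowned) or carries its own copy after invalidation.
struct BufferedDoc {
    const std::string* inCollection;  // unowned; null once invalidated
    std::string ownedCopy;            // filled only when invalidated

    const std::string& data() const {
        return inCollection ? *inCollection : ownedCopy;
    }
};

// Called before the update is applied: copy the still-unchanged document
// and drop the reference to its location in the collection.
void invalidate(std::vector<BufferedDoc>& buffer, const std::string* changing) {
    for (BufferedDoc& doc : buffer) {
        if (doc.inCollection == changing) {
            doc.ownedCopy = *changing;
            doc.inCollection = nullptr;
        }
    }
}
```

After invalidate() runs, a later change to the collection's copy does not affect what the sort returns, and the entry now occupies heap memory equal to the document's size.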

Comment by Andy Schwerin [ 13/Feb/14 ]

This prompted me to read exec/sort.cpp. This in turn led me to wonder what
happens when a document changes in place in such a way that the value of
the sort key changes during execution of the blocking sort. I do not envy
you this task.

-Andy
