[DOCS-9994] Unique field must be included to ensure sort order stability across multiple queries Created: 10/Mar/17  Updated: 30/Oct/23  Due: 04/Dec/20  Resolved: 07/Dec/20

Status: Closed
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: Server_Docs_20231030

Type: Improvement Priority: Major - P3
Reporter: Kelsey Schubert Assignee: Andrew Feierabend (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-28195 $skip followed by $limit in aggregati... Closed
Gantt End to End
Related
related to SERVER-51498 Sorting on duplicate values causes re... Closed
related to SERVER-81571 Reconsider stable sort in sorter.cpp Blocked
Participants:
Days since reply: 3 years, 8 weeks, 2 days ago
Epic Link: DOCSP-11701
Story Points: 4

 Description   

There are number of reasons that the ordering of results with the same sort key might not be stable:

  1. The query execution stages for performing blocking sort operations in find/aggregation do not use a stable sort algorithm.
  2. Unpredictability of network latency and getMore ordering for queries on sharded clusters may result in different orders. This is similar to the behavior described in SERVER-27716, which was closed as Works as Designed.
  3. Changes to the underlying storage. For example, document moves on the MMAPv1 storage engine can result in repeated COLLSCANs returning the data in different orders. Therefore, even if the blocking sort query execution algorithm was stable, this still wouldn't be sufficient to guarantee consistent ordering of equal-keyed results.

Therefore, we should clarify that users who require a stable sort order across queries (eg pagination) add a unique field to their sort.



 Comments   
Comment by Githook User [ 14/Dec/20 ]

Author:

{'name': 'Andrew Feierabend', 'email': 'andrew.feierabend@mongodb.com', 'username': 'andf-mongodb'}

Message: DOCSP-13532 extend DOCS-9994 to cursor.limit() and cursor.skip()
Branch: master
https://github.com/mongodb/docs/commit/5e75d59541afe0248ebd5675f15a85e5b5f876c6

Comment by Githook User [ 14/Dec/20 ]

Author:

{'name': 'Andrew Feierabend', 'email': 'andrew.feierabend@mongodb.com', 'username': 'andf-mongodb'}

Message: DOCSP-13532 extend DOCS-9994 to cursor.limit() and cursor.skip()
Branch: v5.0
https://github.com/mongodb/docs/commit/1cfaf6cbbb9a7ab363796d6549fd3811790fa142

Comment by Githook User [ 07/Dec/20 ]

Author:

{'name': 'Andrew Feierabend', 'email': 'andrew.feierabend@mongodb.com', 'username': 'andf-mongodb'}

Message: DOCS-9994,DOCSP-13487 document sort stability change in v4.4
Branch: master
https://github.com/mongodb/docs/commit/c7567f88df1f6ba193d2f9ed7c9ffbe034073fdf

Comment by Githook User [ 07/Dec/20 ]

Author:

{'name': 'Andrew Feierabend', 'email': 'andrew.feierabend@mongodb.com', 'username': 'andf-mongodb'}

Message: DOCS-9994,DOCSP-13487 document sort stability change in v4.4
Branch: v5.0
https://github.com/mongodb/docs/commit/0a5899664db191a2da72a1a23a963f780ee115d9

Comment by David Storch [ 06/Nov/18 ]

ravind.kumar, it looks like your questions have been answered, but happy to help if you need any more clarification. The bottom line is that any query which requests a sort in MQL (find command sort, the $sort stage, sort for findAndModify) does not guarantee stability within the set of documents that have the same sort key. Assuming you have (document, sort key) pairs (d1, 2), (d2, 2), (d3, 1), (d4, 8), it is guaranteed that the order for an ascending sort is either d3, d1, d2, d4 or d3, d2, d1, d4. Since d1 and d2 are equal, they may come back in either order. Applications that require an exact ordering must ensure that no two documents have the same sort key.

Comment by Kelsey Schubert [ 02/Nov/18 ]

is this also true for WiredTiger?

No

given that sorts can take advantage of indexes, is the implication here something like a compound unique index on the sort field + one additional field would be the ideal support for a consistently ordered sort?

yes, there should be a compound index of the desired sort + one unique field at the end, and then the user should sort against this entire pattern. E.g. if they wanted to consistent sort on {a:1, b:1}, they could created an index and sort on {a:1, b:1, _id: 1}.

Comment by Chris Harris [ 02/Nov/18 ]

Regarding the first question, a non-blocking sort uses an index. Per "Use Indexes to Sort Query Results":

In a blocking SORT, all input must be consumed by the sort step before it can produce output. In a non-blocking, or indexed sort, the sort step scans the index to produce results in the requested order.

It was also the topic of this Google Group discussion a few years ago.

Comment by Ravind Kumar (Inactive) [ 02/Nov/18 ]

Taking this on from Andrew. A few questions:

  1. The query execution stages for performing blocking sort operations in find/aggregation do not use a stable sort algorithm.

What defines 'blocking sort operations' ? We document $sort for agg and cursor.sort() for queries - are those always blocking, or are there specific scenarios in which they block? cc david.storch

Changes to the underlying storage. For example, document moves on the MMAPv1 storage engine can result in repeated COLLSCANs returning the data in different orders. Therefore, even if the blocking sort query execution algorithm was stable, this still wouldn't be sufficient to guarantee consistent ordering of equal-keyed results.

is this also true for WiredTiger? cc alexander.gorrod@mongodb.com

Therefore, we should clarify that users who require a stable sort order across queries (eg pagination) add a unique field to their sort.

given that sorts can take advantage of indexes, is the implication here something like a compound unique index on the sort field + one additional field would be the ideal support for a consistently ordered sort?

Comment by Andrew Aldridge [ 24/Aug/18 ]

I will take a look today!

Generated at Thu Feb 08 07:59:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.