[SERVER-66427] Add test case for massive array info Created: 11/May/22  Updated: 19/Aug/22  Resolved: 19/Aug/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Ian Boros Assignee: Erin Zhu
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Sprint: QE 2022-07-11, QE 2022-07-25, QE 2022-08-08, QE 2022-08-22
Participants:

 Description   

The columnar format supports a massive array info, but this path is likely never exercised. We should add a test case for this situation for completeness.



 Comments   
Comment by Githook User [ 19/Aug/22 ]

Author:

{'name': 'Erin Zhu', 'email': 'erin.zhu@mongodb.com', 'username': 'erinzhu001'}

Message: SERVER-66427 Create column store index test for larger arrayInfo strings
Branch: master
https://github.com/mongodb/mongo/commit/bd3a4efde0c07c5760111607d268383a83a6bce4

Comment by Justin Seyster [ 07/Jul/22 ]

Some background for this ticket: the format for how we physically lay out an index entry's arrayInfo varies slightly depending on the size of the arrayInfo. The layout code illustrates how the layout changes and what the size thresholds are for each possible format.
https://github.com/mongodb/mongo/blob/a49498a6bcbcf2fdabd07612e4d8b3d8a87440b4/src/mongo/db/index/column_cell.cpp#L200-L230

Currently, almost none of our testing involves documents that will generate larger arrayInfo strings, which leaves us open to compatibility problems if a change inadvertently modifies the arrayInfo encoding or decoding logic. I suggest a new jstest that creates objects large enough for each of the arrayInfo storage formats, inserts them into a collection with a column store index, and then verifies that reads covered by the index produce correct results.

It's possible to create an arrayInfo of a desired size using a pattern like this:

{a: [{b: [1, 2]}, {b: [3, 4]}, {b: [5, 6]}, ...]}

If this example had exactly the three entries in a (i.e., without the "..."), the arrayInfo for a.b would have length 13:

[{[|1]{[|1]{[

Every extra sub-document in the a array adds an extra 5 bytes to the a.b arrayInfo.

In documents with this format, it will always be possible to cover queries that project a.b, so we can verify that the column scan is correctly reading and decoding the arrayInfo.
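
The size arithmetic above can be sketched as a small helper for picking document sizes. This is only an illustration, not code from the commit: the function names makeNestedArrayDoc, expectedArrayInfo, and subDocsForArrayInfoSize are hypothetical, and the arrayInfo string for counts other than three is extrapolated from the single example given here (13 bytes for three entries, 5 more per extra sub-document, so 5n - 2 bytes for n sub-documents). The real encoder may behave differently, and the actual size thresholds live in the linked column_cell.cpp, so they are deliberately not hard-coded.

```javascript
// Build a document of the form {a: [{b: [1, 2]}, {b: [3, 4]}, ...]}
// with `count` sub-documents in the `a` array.
function makeNestedArrayDoc(count) {
    const a = [];
    for (let i = 0; i < count; i++) {
        a.push({b: [2 * i + 1, 2 * i + 2]});
    }
    return {a: a};
}

// ASSUMPTION: the a.b arrayInfo for `count` sub-documents, extrapolated from
// the three-entry example "[{[|1]{[|1]{[" in this ticket. Not taken from the
// server's encoder.
function expectedArrayInfo(count) {
    if (count < 1) {
        throw new Error("count must be at least 1");
    }
    return "[" + "{[|1]".repeat(count - 1) + "{[";
}

// Smallest number of sub-documents whose a.b arrayInfo is at least
// `targetBytes` long, using length = 5 * count - 2 from the pattern above.
function subDocsForArrayInfoSize(targetBytes) {
    return Math.max(1, Math.ceil((targetBytes + 2) / 5));
}
```

For example, subDocsForArrayInfoSize(13) gives 3, which matches the three-entry document above. A test would call it once per size threshold from the layout code and insert makeNestedArrayDoc(n) for each resulting n.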

Generated at Thu Feb 08 06:05:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.