[SERVER-72236] Generate random integer data for CE
Created: 19/Dec/22  Updated: 29/Oct/23  Resolved: 12/Jan/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Task
Priority: Major - P3
Reporter: Timour Katchaounov
Assignee: Timour Katchaounov
Resolution: Fixed
Votes: 0
Labels: M7
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: QO 2022-12-26, QO 2023-01-09, QO 2023-01-23
Participants:

Comments
Comment by Githook User [ 12/Jan/23 ]

Author: Timour Katchaounov <timour.katchaounov@mongodb.com> (timourk)

Message: SERVER-72236 Generate random integer data for CE

Address final review comments.
Branch: master
https://github.com/mongodb/mongo/commit/550e070c93e80bdb4f256dd634128ad919395eb0

Comment by Githook User [ 10/Jan/23 ]

Author: Timour Katchaounov <timour.katchaounov@mongodb.com> (timourk)

Message: SERVER-72236 Generate random integer data for CE

Generate random data with integers. The approach is as follows:

  • There is one collection for each cardinality. All collections contain the same fields.
  • Each field holds data generated from a particular distribution. The values can be of a single type or of mixed types, and can follow a single mathematical distribution (e.g. normal) or a mixture of distributions.
  • The committed configuration file and the corresponding data file are reduced to only two small collections, so that Evergreen tests run fast and the git repository stays small. For actual experiments, add more data sizes and regenerate the data locally.
  • All data is saved in a single JavaScript file, jstests/query_golden/libs/data/ce_accuracy_test.data, with a corresponding schema file, jstests/query_golden/libs/data/ce_accuracy_test.schema.
  • The data file is a JavaScript file that can be loaded directly inside a JS test. Loading it creates a global variable dataSet. This approach is used because it is the only way to load an external JSON file without installing external tools in Evergreen.
Branch: master
https://github.com/mongodb/mongo/commit/3ade0b9fe31cb2abbe0781eb5eb99ca3a095dc32
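
The approach described in the commit message above might look roughly like the following sketch. It is not the code from the commit: the generator functions, the field names, the collName/collData layout, the seed, and the cardinalities are all assumptions made for illustration. Only the global variable name dataSet comes from the commit message. The sketch builds one collection per cardinality with the same fields, fills each field from its own distribution, serializes the result as JavaScript source, and then evaluates that source so that dataSet becomes a global.

```javascript
// Illustrative sketch only: every name below except `dataSet` is hypothetical.

// Small seeded PRNG (mulberry32-style) so regenerating yields the same data.
function makeRng(seed) {
    let state = seed >>> 0;
    return function() {
        state = (state + 0x6D2B79F5) >>> 0;
        let t = state;
        t = Math.imul(t ^ (t >>> 15), t | 1);
        t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
        return ((t ^ (t >>> 14)) >>> 0) / 4294967296;  // value in [0, 1)
    };
}

// Uniformly distributed integer in [min, max].
function uniformInt(rng, min, max) {
    return min + Math.floor(rng() * (max - min + 1));
}

// Normally distributed value rounded to an integer (Box-Muller transform).
function normalInt(rng, mean, stdDev) {
    const u1 = 1 - rng();  // in (0, 1], avoids log(0)
    const u2 = rng();
    const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
    return Math.round(mean + stdDev * z);
}

// One generator per field; every collection gets the same fields.
const fieldGenerators = {
    uniform_int: (rng) => uniformInt(rng, 0, 1000),
    normal_int: (rng) => normalInt(rng, 500, 100),
    // Mixed distribution: each value comes from one of the two above.
    mixed_int: (rng) => (rng() < 0.5 ? uniformInt(rng, 0, 1000) : normalInt(rng, 500, 100)),
};

// Build one collection description per requested cardinality.
function generateDataSet(cardinalities, generators, seed) {
    const rng = makeRng(seed);
    return cardinalities.map((n) => {
        const docs = [];
        for (let i = 0; i < n; i++) {
            const doc = {_id: i};
            for (const [field, gen] of Object.entries(generators)) {
                doc[field] = gen(rng);
            }
            docs.push(doc);
        }
        return {collName: `ce_data_${n}`, collData: docs};
    });
}

// Serialize as JavaScript source that assigns a global when evaluated.
function toDataFileSource(collections) {
    return "dataSet = " + JSON.stringify(collections) + ";\n";
}

const source = toDataFileSource(generateDataSet([10, 100], fieldGenerators, 42));

// In a jstest the file would be read with the shell's load(); an indirect
// eval stands in for it here by running the source in the global scope.
(0, eval)(source);

// The loaded data is now reachable through the global variable.
const sizes = globalThis.dataSet.map((c) => c.collData.length);  // [10, 100]
```

In an actual test, each entry of dataSet could then be written to its own collection, for example with db[collName].insertMany(collData), assuming the data file uses a layout similar to the one sketched here.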