[SERVER-71564] _writeTestPipeBsonFile() test shell extension for Named Pipes benchmarks Created: 22/Nov/22 Updated: 29/Oct/23 Resolved: 30/Nov/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 6.3.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Kevin Cherkauer | Assignee: | Kevin Cherkauer |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible | ||||
| Participants: | |||||
| Description |
|
These are subtasks from PERF-3313:
|
| Comments |
| Comment by Githook User [ 30/Nov/22 ] |
|
Author: {'name': 'Kevin Cherkauer', 'email': 'kevin.cherkauer@mongodb.com', 'username': 'kevin-cherkauer'}
Message: |
| Comment by Kevin Cherkauer [ 28/Nov/22 ] |
|
Implementation:
The extension reads the objects from the given BSON file into memory and round-robins them into the pipe until the number of objects written reaches the requested count. This approach lets us feed BSON files into named pipes very fast, since the pipe writes are done from in-memory images, and for the parent PERF-3313 we want the writer (producer) to be as fast as possible so it is not the bottleneck in the benchmarks. Example: _writeTestPipeBsonFile("pipeName", 100000, "objectsFile.bson"); |
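The round-robin behavior described above can be sketched as follows. This is a minimal, hypothetical Python illustration of the logic (the real `_writeTestPipeBsonFile()` is a shell built-in, not this code); `write_round_robin` and its parameters are names invented here, and the destination is any writable byte stream standing in for the named pipe:

```python
import io

def write_round_robin(pipe, docs, object_count):
    """Round-robin pre-loaded BSON memory images into 'pipe' until
    'object_count' objects have been written.

    pipe         -- a writable binary stream (in practice, a named pipe)
    docs         -- list of bytes objects, each one serialized BSON document
    object_count -- total number of documents to write
    """
    written = 0
    idx = 0
    while written < object_count:
        pipe.write(docs[idx])            # write one object's raw bytes
        idx = (idx + 1) % len(docs)      # cycle back to the first object
        written += 1
    return written

# Tiny demonstration with placeholder byte strings instead of real BSON:
buf = io.BytesIO()
write_round_robin(buf, [b"a", b"b", b"c"], 7)
print(buf.getvalue())
```

Because the documents are already in memory, each iteration is just a buffer write, which is what keeps the producer side fast enough not to throttle the benchmark.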
| Comment by Kevin Cherkauer [ 23/Nov/22 ] |
|
I have created a set of compressed benchmark data files of different orders of magnitude in both BSON and JSON formats plus uploaded the original Queries.zip and added a README.txt. These files (16 total) are in the Query Team Google Drive directory https://s3-us-west-2.amazonaws.com/dsi-donot-remove/Query/Benchmarks/ClickBench/ by Ryan Timmons of the Performance Tools team. The subsets contain the first N objects from the full ClickBench dataset, for N in {1, 10, 100, 1,000, 10,000, 100,000, 1,000,000}. The original full dataset has almost 100 million objects and is unwieldy at 22.1 GB compressed JSON or 216.7 GB uncompressed. We do not need such a huge dataset for our benchmarks. These seven subsets will let us easily pick the scale of data we want to run any given benchmark on. The largest subset is about 2.2 GB of BSON when uncompressed or 162 MB when gzipped. |