|
As you can see in the previous comment, csi_cold is around 2x faster than collscan_cold, which is an expected good performance improvement.
|
|
In addition, it was uncovered that the range of values for l_dicount is 0.0 to 0.1:
> db.order_lineitem.aggregate(
|
... [
|
... {
|
... $group:
|
... {
|
... _id: "1",
|
... minDisocunt: { $min: "$l_discount" },
|
... maxDisocunt: { $max: "$l_discount" },
|
... minQuantity: { $min: "$l_quantity" }, [^ssb-improved-corrected.yaml] [^ssb-orig-corrected.yaml]
|
... maxQuantity: { $max: "$l_quantity" }
|
... }
|
... }
|
... ]
|
... )
|
{ "_id" : "1", "minDisocunt" : 0, "maxDisocunt" : 0.1, "minQuantity" : 1, "maxQuantity" : 50 }
|
So, the right query (to check the range of 0.01 to 0.03 for l_discount) is:
{
|
"$match": {
|
"order.o_orderdate": {
|
"$gte": ISODate("1992-01-01T00:00:00.000Z"),
|
"$lte": ISODate("1996-01-01T00:00:00Z")
|
},
|
"l_discount": {"$gte": 0.1, "$lte": 0.3},
|
"l_quantity": {"$lte": 25}
|
}
|
},
|
{
|
"$group": {
|
"_id": 1,
|
"revenue": {"$sum": {"$multiply": ["$l_extendedprice", "$l_discount"]}}
|
}
|
}
|
]
|
Here are the corrected query definitions: ssb-orig-corrected.yaml and ssb-improved-corrected.yaml
| name |
csi_cold |
csi_hot |
collscan_cold |
collscan_hot |
| ssb.qWRITING-12145.orig.corrected |
5.996114253997803 |
4.855523586273193 |
13.617784976959229 |
4.068851470947266 |
| ssb.qWRITING-12145.improved.corrected |
6.080265760421753 |
4.894550323486328 |
13.490277767181396 |
3.9808552265167236 |
|
|
Next, I went ahead and changed the way the given query was written to see if it has any impact. The new query is in ssb-improved.yaml :
[
|
{
|
"$match": {
|
"order.o_orderdate": {
|
"$gte": ISODate("1992-01-01T00:00:00.000Z"),
|
"$lte": ISODate("1996-01-01T00:00:00Z")
|
},
|
"l_discount": {"$gte": 0.1, "$lte": 0.3},
|
"l_quantity": {"$lte": 25}
|
}
|
},
|
{
|
"$group": {
|
"_id": 1,
|
"revenue": {"$sum": {"$multiply": ["$l_extendedprice", "$l_discount"]}}
|
}
|
}
|
]
|
However, it didn't have any noticeable impact on the performance:
| name |
csi_cold |
csi_hot |
collscan_cold |
collscan_hot |
| ssb.qWRITING-12145.orig |
8.17105507850647 |
7.048151016235352 |
14.792832851409912 |
5.0408830642700195 |
| ssb.qWRITING-12145.improved |
8.201274156570435 |
6.985919237136841 |
14.691941261291504 |
5.06222128868103 |
|
|
Here is the actual measurement for the query mentioned in the body description (on master@8f2648c3):
| name |
csi_cold |
csi_hot |
collscan_cold |
collscan_hot |
| ssb.qWRITING-12145 |
8.17105507850647 |
7.048151016235352 |
14.792832851409912 |
5.0408830642700195 |
To re-run the experiment:
1- You can compile the code like this:
activate && ./buildscripts/scons.py --ssl -j50 --variables-files=etc/scons/mongodbtoolchain_v3_clang.vars --cache=nolinked --modules="" ICECC=icecc CCACHE=ccache --dbg=off --ninja generate-ninja --link-model=static && ninja -j500 install-devcore
|
2- Run the server:
numactl --interleave=all --physcpubind=4 mongod --wiredTigerCacheSizeGB 16 --oplogSize 100000 --syncdelay 3600 --setParameter ttlMonitorEnabled=false --setParameter diagnosticDataCollectionEnabled=false --setParameter maxIndexBuildMemoryUsageMegabytes=16000 --setParameter featureFlagSbeFull=true --setParameter featureFlagColumnstoreIndexes=true
|
3- Download dump.zip
4- Unzip contents to reveal dump folder
5- Load the data like this:
cd ~/ssb
|
mongorestore --gzip dump
|
cd ~/mongo
|
mongo ssb --eval 'db.order_lineitem.createIndex({"$**": "columnstore"})'
|
6- place ssb-orig.yaml in the test_scripts directory, which contains the query definition:
[
|
{
|
"$match": {
|
"$and": [
|
{"order.o_orderdate": {"$gte": ISODate("1992-01-01T00:00:00.000Z")}},
|
{"order.o_orderdate": {"$lte": ISODate("1996-01-01T00:00:00.000Z")}},
|
{"l_discount": {"$gte": 0.1}},
|
{"l_discount": {"$lte": 0.3}}
|
],
|
"l_quantity": {"$lte": 25}
|
}
|
},
|
{
|
"$group": {
|
"_id": 1,
|
"revenue": {"$sum": {"$multiply": ["$l_extendedprice", "$l_discount"]}}
|
}
|
}
|
]
|
7- place performance.py in the test_scripts directory
8- run the experiment:
(killall mongod || true)
|
cd ~/mongo
|
activate
|
cd test_scripts
|
python3 performance.py /home/ubuntu/mongo/test_scripts/ssb.yaml
|
|
Generated at Thu Feb 08 06:56:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.