Type: Improvement
Resolution: Unresolved
Priority: Minor - P4
Affects Version/s: None
Component/s: None
Query Optimization
The values in correlated columns are not merely correlated; they are completely identical when the columns are generated the same way.
To reproduce:
Use this spec:
@dataclasses.dataclass
class C:
    f1: Specification(source=correlation(lambda: global_faker().pyint(max_value=10), "a"))
    f2: Specification(source=correlation(lambda: global_faker().pyint(max_value=10), "a"))
and run:
python src/mongo/db/query/benchmark/data_generator/driver.py --uri "mongodb://localhost:20000/" --db test --size 100 --seed 1 --serial-inserts --drop --analyze specs.test_correlations C
and you will get:
Enterprise test> db.C.find();
[
{ _id: ObjectId('690db5c61189c0dc6de2a7ab'), f1: 5, f2: 5 },
{ _id: ObjectId('690db5c61189c0dc6de2a7ac'), f1: 1, f2: 1 },
{ _id: ObjectId('690db5c61189c0dc6de2a7ad'), f1: 3, f2: 3 },
{ _id: ObjectId('690db5c61189c0dc6de2a7ae'), f1: 0, f2: 0 },
{ _id: ObjectId('690db5c61189c0dc6de2a7af'), f1: 10, f2: 10 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b0'), f1: 7, f2: 7 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b1'), f1: 0, f2: 0 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b2'), f1: 5, f2: 5 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b3'), f1: 9, f2: 9 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b4'), f1: 7, f2: 7 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b5'), f1: 5, f2: 5 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b6'), f1: 9, f2: 9 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b7'), f1: 6, f2: 6 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b8'), f1: 7, f2: 7 },
{ _id: ObjectId('690db5c61189c0dc6de2a7b9'), f1: 0, f2: 0 },
{ _id: ObjectId('690db5c61189c0dc6de2a7ba'), f1: 9, f2: 9 },
{ _id: ObjectId('690db5c61189c0dc6de2a7bb'), f1: 3, f2: 3 },
{ _id: ObjectId('690db5c61189c0dc6de2a7bc'), f1: 1, f2: 1 },
{ _id: ObjectId('690db5c61189c0dc6de2a7bd'), f1: 3, f2: 3 },
{ _id: ObjectId('690db5c61189c0dc6de2a7be'), f1: 5, f2: 5 }
]
As you can see, f1 and f2 have identical values in every document.
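A quick way to tell "identical" apart from "merely correlated" is to count how many documents have f1 == f2. The sketch below is not part of the data generator: `identical_fraction` is a hypothetical helper, and `docs` is a hand-copied subset of the output above. A result of 1.0 means the two columns are duplicates of each other; a pair that is only correlated would presumably come out below 1.0.

```python
# The first five (f1, f2) pairs, copied by hand from the db.C.find() output above.
docs = [
    {"f1": 5, "f2": 5},
    {"f1": 1, "f2": 1},
    {"f1": 3, "f2": 3},
    {"f1": 0, "f2": 0},
    {"f1": 10, "f2": 10},
]


def identical_fraction(docs, a="f1", b="f2"):
    """Return the fraction of documents whose fields `a` and `b` are equal.

    Hypothetical helper for this ticket, not an API of the data generator.
    """
    if not docs:
        raise ValueError("no documents to compare")
    same = sum(1 for d in docs if d[a] == d[b])
    return same / len(docs)


print(f"identical: {identical_fraction(docs):.0%}")
```

The same check could be run against the full collection after a fix, to confirm that the fraction drops below 100% while the columns remain correlated.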
timour.katchaounov@mongodb.com, can you please opine on how important this is with respect to CBR? We should definitely strive to fix this before any join ordering testing.