[SERVER-59889] Add compatibility tests for $group pushdown behavior Created: 10/Sep/21 Updated: 18/Jan/22 Resolved: 18/Jan/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Eric Cox (Inactive) | Assignee: | Yoon Soo Kim |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | sbe | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Sprint: | QE 2021-09-20 | ||||||||
| Participants: | |||||||||
| Description |
|
This work adds a JS test suite that compares the behavior of $group executed in the classic engine against $group pushed down into SBE. We could consider a randomized testing approach for data generation, but not for queries: randomized queries make failures too hard to investigate. The main goal is to expose incompatibilities between $group in the classic engine and $group in SBE as early as possible, and to make them easy to investigate. |
| Comments |
| Comment by Yoon Soo Kim [ 18/Jan/22 ] |
|
I'd rather wait to see how the test working group discussion goes. I believe this ticket is closely related to that discussion, because what I was trying to do was define a combinatorial test framework and, on top of that, a test matrix for $group. I will resolve this as "Won't Fix", but please feel free to reopen this ticket if the need arises. |
| Comment by Yoon Soo Kim [ 22/Sep/21 ] |
|
Sure ethan.zhang, will add it. |
| Comment by Ethan Zhang (Inactive) [ 22/Sep/21 ] |
Should we add "undefined" to this as well? |
| Comment by Ethan Zhang (Inactive) [ 21/Sep/21 ] |
|
Just discussed this in standup. The combinations are deterministic, but the test data is not. Yoonsoo will make changes to make it deterministic. |
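One way to make generated test data deterministic is to drive it from a seeded pseudo-random generator, so a failing run can be replayed from its seed. The sketch below is only an illustration under assumptions: the mulberry32 generator, the `VALUE_POOL` contents, and the `generateDocs` helper with its field names `k` and `v` are hypothetical and do not come from this ticket.

```javascript
// Small seeded PRNG (mulberry32). Math.random() cannot be seeded,
// so a custom generator is needed for reproducible data.
function mulberry32(seed) {
    let a = seed >>> 0;
    return function() {
        a = (a + 0x6D2B79F5) | 0;
        let t = Math.imul(a ^ (a >>> 15), 1 | a);
        t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
        return ((t ^ (t >>> 14)) >>> 0) / 4294967296;  // value in [0, 1)
    };
}

// Hypothetical pool of values of mixed types to draw field values from.
const VALUE_POOL = [0, 1, -1, 2.5, "a", "b", null];

// Builds `count` documents whose contents depend only on `seed`.
function generateDocs(seed, count) {
    const rand = mulberry32(seed);
    const pick = () => VALUE_POOL[Math.floor(rand() * VALUE_POOL.length)];
    const docs = [];
    for (let i = 0; i < count; i++) {
        docs.push({_id: i, k: pick(), v: pick()});
    }
    return docs;
}
```

Calling `generateDocs(42, 20)` twice yields identical documents, so logging the seed alongside a failure is enough to regenerate its input data.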
| Comment by Yoon Soo Kim [ 21/Sep/21 ] |
|
Sure, steve.la. I think basically there are 3 groups of tests for $group.
Yes, if we run the existing $group test suites with SBE turned on, they do validate the expected results for the cases we explicitly test. The questions are whether they are enough and whether test failures would be easy to investigate. I think we have the same kind of test suites for find() but, AFAIK, we've had a long tail of correctness issues for SBE find(), many of which were found by fuzzer tests. Fuzzer test failures are notorious within the query team for being hard to investigate. I believe one reason for the long tail of SBE find() correctness issues is that fuzzer tests inherently randomize queries. It can therefore take a long time to reveal all possible correctness issues, and it is very time-consuming to reduce a generated failing fuzzer test file to a minimal repro, which is the starting point for fixing an issue.
This ticket is an attempt to expose such incompatibilities as early as possible by generating the full set of $group spec combinations from various test dimensions, running those $group specs on both the classic engine and SBE, and comparing the SBE result to the classic engine result. It is also an attempt to help engineers investigate test failures by attaching a repro script to each failure. Usually a JS test stops as soon as it hits the first failed test case, but this test suite will keep running until all $group spec combinations are completed and will report all failed test cases, with repro scripts, at the end of execution.
While working on this ticket, I found two incompatibilities even though it is still at an early stage of development. One concerns the error code returned when a numeric expression like $add fails on a non-numeric value at runtime; the other is that the $sum result in SBE differs from the classic engine's for certain input data.
I think the first one is OK to add to the known incompatibilities, and the second is worth investigating further. I'm not sure whether this explanation makes sense to you, or whether you think this test suite would be doable and useful. Please let me know. If we think the existing $group test suites and agg fuzzer tests are enough, I can move on to other items such as improving $group performance. |
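The approach described in this comment could be sketched roughly as follows: enumerate every combination of the test dimensions, build a $group stage per combination, run it on both engines, and collect every mismatch with a repro script instead of stopping at the first failure. Everything named here is an assumption for illustration: the dimension names (`groupKey`, `accumulator`, `operand`), the `runClassic` and `runSbe` callbacks (which would wrap however each engine is actually selected), and the repro script format. The runners are also assumed to return results in a canonical order, since $group output order is not guaranteed.

```javascript
// Cartesian product of named dimensions, e.g.
// {a: [1, 2], b: ["x"]} -> [{a: 1, b: "x"}, {a: 2, b: "x"}].
function combinations(dimensions) {
    return Object.entries(dimensions).reduce(
        (acc, [name, values]) =>
            acc.flatMap(partial => values.map(v => ({...partial, [name]: v}))),
        [{}]);
}

// Builds one $group stage from one combination (hypothetical dimensions).
function buildGroupStage({groupKey, accumulator, operand}) {
    return {$group: {_id: groupKey, out: {[accumulator]: operand}}};
}

// Captures a result or an error so that error behavior is compared too.
function capture(fn) {
    try {
        return {ok: true, value: fn()};
    } catch (e) {
        return {ok: false, error: String(e.message || e)};
    }
}

// Runs every combination on both engines and returns ALL mismatches.
// runClassic/runSbe are caller-supplied: (docs, pipeline) -> results.
function compareEngines(dimensions, docs, runClassic, runSbe) {
    const failures = [];
    for (const combo of combinations(dimensions)) {
        const pipeline = [buildGroupStage(combo)];
        const classic = capture(() => runClassic(docs, pipeline));
        const sbe = capture(() => runSbe(docs, pipeline));
        if (JSON.stringify(classic) !== JSON.stringify(sbe)) {
            // Illustrative repro text; a real script would also select the engine.
            const repro = `db.coll.drop(); ` +
                `db.coll.insertMany(${JSON.stringify(docs)}); ` +
                `db.coll.aggregate(${JSON.stringify(pipeline)});`;
            failures.push({pipeline, classic, sbe, repro});
        }
    }
    return failures;
}
```

Because `capture` records thrown errors as data, a difference in error messages or codes between engines surfaces as an ordinary mismatch, and the loop continues to the remaining combinations.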
| Comment by Yoon Soo Kim [ 16/Sep/21 ] |
|
Found a bug. |
| Comment by Yoon Soo Kim [ 15/Sep/21 ] |
|
Test dimensions to consider are:
For dimensions 1, 2 and 3, we could generate $group stages. This list should cover as many combinations as possible while avoiding redundant test cases. For dimensions 4, 5 and 6, we can leverage random data generation, as long as the full set of combinations is still covered. In the context of this test suite,
|