[SERVER-70260] Some agg expressions crash sbe_expression_bm Created: 05/Oct/22  Updated: 29/Oct/23  Resolved: 19/Oct/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Kevin Cherkauer Assignee: Ivan Fefer
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File LogicalOrTrue0.stack_dump.ASSERT_REMOVED.json    
Issue Links:
Depends
depends on SERVER-59123 Add benchmarks for frequently used ag... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: QE 2022-10-17, QE 2022-10-31
Participants:

 Description   

See the SBE column entries labeled crash in the following comment of SERVER-59123:

https://jira.mongodb.org/browse/SERVER-59123?focusedCommentId=4863444&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-4863444

All of these benchmarks work fine in Classic (expression_bm executable) but crash the SBE benchmark tool (sbe_expression_bm executable, created by SERVER-69798). I looked at several of them, and the proximate cause of each crash is that it triggers the assertion

tassert(6979800, "Unexpected: EvalStage.stage is not null", evalStage.stageIsNull()); 

This assertion is after the check for whether the expression is supported in SBE, so all of these are for expressions that SBE does support.

In one case I tried commenting out the above assertion, and that led to a crash in SBE itself. I have therefore opened this ticket to debug the problem. In the meantime, I have temporarily left the crashing expressions unregistered with the SBE benchmarking tool in SERVER-59123, so that it can be delivered without waiting for the fix to the SBE problem.



 Comments   
Comment by Githook User [ 19/Oct/22 ]

Author: Ivan Fefer (Fefer-Ivan) <ivan.fefer@mongodb.com>

Message: SERVER-70260 Support stage expression in SBE benchmark
Branch: master
https://github.com/mongodb/mongo/commit/69892e9a4b5c2741f6384722d7402c115a098f74

Comment by Kevin Cherkauer [ 06/Oct/22 ]

The server crashes happen only because the benchmark code does not construct a properly formed expression for the server to evaluate. Thus it is not a server bug, but incomplete expression support in the sbe_expression_bm executable, which is an external testing tool only. Currently that binary does not submit the malformed expressions to the server; instead it crashes inside sbe_expression_bm itself via the assertion

tassert(6979800, "Unexpected: EvalStage.stage is not null", evalStage.stageIsNull());  

Only after I removed this assertion did the malformed expression get submitted to the server itself, causing a crash in SBE in the server. But this is due to the lack of full expression support in sbe_expression_bm. The same statements pasted into the mongo shell work fine against a mongod with SBE enabled.

Sorry, my explanation of this distinction yesterday was probably not as clear as it could have been. This is really a bug against sbe_expression_bm, not against mongod or SBE.

Comment by Ivan Fefer [ 06/Oct/22 ]

The assertion that you see failing is in the benchmark code; these expressions are not crashing SBE.

I am reposting here a comment I left on your PR:

The problem is in the SBE benchmark code:

https://github.com/mongodb/mongo/blob/master/src/mongo/db/query/sbe_expression_bm.cpp#L96

When SBE expression is parsed there are two cases:

  1. Simple expressions that fit into a single EvalStage have a null evalStage and can be executed directly by running evalExpr.
  2. More complex ones (it looks like $cond and $ifNull are like this) generate an evalStage that should be used instead of evalExpr.

When I was writing these benchmarks, I only had simple expressions, so I didn't handle the second case and just guarded it with an assertion.

Handling this case may look something like this:

  1. If evalStage.stageIsNull(), run the current implementation.
  2. Otherwise, instead of compiling evalExpr, use evalStage:
    2.1 auto stage = evalStage.extractStage()
    2.2 stage.prepare
    2.3 stage.open
    2.4 For each document, write the document to the input slot and call stage.getNext()
    2.5 ???
    2.6 PROFIT
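
The two branches above can be sketched as a small self-contained C++ toy. This is only an illustration of the control flow, not the benchmark's actual code: ToyStage, ToyEvalStage, Slot, and runExpression are hypothetical stand-ins, the "documents" are plain ints, and the real SBE prepare/open/getNext calls take arguments and return state that this sketch omits.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are the real SBE classes.
using Value = int;

// The single input slot that the benchmark writes each document into.
struct Slot {
    Value value = 0;
};

// Toy stage: reads the input slot and yields one result per getNext().
class ToyStage {
public:
    ToyStage(const Slot* input, std::function<Value(Value)> fn)
        : _input(input), _fn(std::move(fn)) {}

    void prepare() { _prepared = true; }
    void open() { assert(_prepared); _opened = true; }
    Value getNext() { assert(_opened); return _fn(_input->value); }

private:
    const Slot* _input;
    std::function<Value(Value)> _fn;
    bool _prepared = false;
    bool _opened = false;
};

// Toy EvalStage: either holds a stage or is "null".
class ToyEvalStage {
public:
    ToyEvalStage() = default;
    explicit ToyEvalStage(std::unique_ptr<ToyStage> stage) : _stage(std::move(stage)) {}

    bool stageIsNull() const { return _stage == nullptr; }
    std::unique_ptr<ToyStage> extractStage() { return std::move(_stage); }

private:
    std::unique_ptr<ToyStage> _stage;
};

// Runs one "expression" over all documents, choosing the branch by stageIsNull().
std::vector<Value> runExpression(ToyEvalStage evalStage,
                                 const std::function<Value(Value)>& evalExpr,
                                 Slot& inputSlot,
                                 const std::vector<Value>& documents) {
    std::vector<Value> results;
    if (evalStage.stageIsNull()) {
        // Case 1: simple expression, evaluate evalExpr directly per document.
        for (Value doc : documents) {
            inputSlot.value = doc;
            results.push_back(evalExpr(inputSlot.value));
        }
        return results;
    }
    // Case 2: the expression produced a stage; drive the stage instead of evalExpr.
    auto stage = evalStage.extractStage();
    stage->prepare();
    stage->open();
    for (Value doc : documents) {
        inputSlot.value = doc;  // write the document to the input slot
        results.push_back(stage->getNext());
    }
    return results;
}
```

Calling runExpression with a default-constructed ToyEvalStage takes the first branch; passing one that owns a ToyStage takes the second, and evalExpr is then never invoked.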
Comment by Kyle Suarez [ 06/Oct/22 ]

Clearing the fix version and sending to the triage queue for discussion.

Comment by Kevin Cherkauer [ 05/Oct/22 ]

Moving this to QE Backlog for someone working on SBE to pick up.

Comment by Kevin Cherkauer [ 05/Oct/22 ]

The benchmarks that lead to the crashes will be introduced to the codebase by SERVER-59123.

Comment by Kevin Cherkauer [ 05/Oct/22 ]

To reproduce the crashes, move benchmarks from the BENCHMARK_EXPRESSIONS_CLASSIC_ONLY() macro to the BENCHMARK_EXPRESSIONS() macro below it in expression_bm_fixture.h so that they get registered in the sbe_expression_bm tool. Then rebuild the server, rebuild sbe_expression_bm, and run it for those benchmarks.

When I did this for the new benchmark "LogicalOrTrue0" (to be added by SERVER-59123) and also commented out the assertion mentioned above in sbe_expression_bm.cpp, it hit the following assertion in SBE itself:

expression.cpp

uasserted(4946301, str::stream() << "undefined slot accessor:" << slot);

Full stack traceback in attached file LogicalOrTrue0.stack_dump.ASSERT_REMOVED.json.
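
One plausible reading of this failure, which this ticket does not confirm, is that the expression refers to a slot that would only be defined by a stage the benchmark never sets up. The toy below illustrates that failure mode under that assumption; ToySlotTable, prepareStage, and compileAndRunExpr are hypothetical names, not SBE APIs, and only the error text mirrors the message quoted above.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; not the real SBE types.
using SlotId = int;
using Value = int;

// Toy slot table: a slot is only readable once some stage has defined it.
class ToySlotTable {
public:
    void define(SlotId slot, Value v) { _slots[slot] = v; }

    Value get(SlotId slot) const {
        auto it = _slots.find(slot);
        if (it == _slots.end()) {
            // Mirrors the wording of the uassert message quoted in this comment.
            throw std::runtime_error("undefined slot accessor:" + std::to_string(slot));
        }
        return it->second;
    }

private:
    std::map<SlotId, Value> _slots;
};

// Stands in for preparing the stage that produces the slot the expression needs.
void prepareStage(ToySlotTable& table, SlotId out, Value v) {
    table.define(out, v);
}

// Stands in for compiling and running an expression that reads `slot`.
Value compileAndRunExpr(const ToySlotTable& table, SlotId slot) {
    return table.get(slot) + 1;
}
```

In this toy, calling compileAndRunExpr before prepareStage throws, while calling it afterwards succeeds.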

Comment by Kevin Cherkauer [ 05/Oct/22 ]

arun.banala@mongodb.com ivan.fefer@mongodb.com mihai.andrei@mongodb.com david.storch@mongodb.com FYI. I will take a first look into this. It might need to be reassigned to someone who is working on SBE.

Generated at Thu Feb 08 06:15:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.