[SERVER-59960] Ensure CompileCtx::root is set before accessing it Created: 15/Sep/21  Updated: 29/Oct/23  Resolved: 13/Jan/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.3.0

Type: Improvement Priority: Minor - P4
Reporter: Anton Korshunov Assignee: Anna Wawrzyniak
Resolution: Fixed Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-62495 Ensure that CompileCtx::accumulator i... Closed
Backwards Compatibility: Fully Compatible
Sprint: QE 2021-12-27, QE 2022-01-10, QE 2022-01-24
Participants:

 Description   

Before compiling an sbe::EExpression the caller is responsible for setting the root PlanStage on the CompileCtx. However, this is not enforced in any way. If the caller forgets to set it, this may result in a segfault. At the very least we should have an invariant that the root is set before dereferencing it, for example here.



 Comments   
Comment by Githook User [ 12/Jan/22 ]

Author:

{'name': 'Anna Wawrzyniak', 'email': 'anna.wawrzyniak@mongodb.com', 'username': 'anna-wawrzyniak'}

Message: SERVER-59960 Make ContextCtx::root optional
Branch: master
https://github.com/mongodb/mongo/commit/5fd3e8b800b43c25d92530fc1b3e0b9bbec7bf6b

Comment by Anna Wawrzyniak [ 10/Jan/22 ]

all right, sounds like the code paths that that uses ctx.root and one that doesnt are separate.

Under this assumption the code path that does not use ctx.root, will never use a CompileCtx that had mutated ctx.root, so just modifying EVariable to handle the optional ctx.root seems sufficient.

Comment by Anna Wawrzyniak [ 10/Jan/22 ]

After talking to Anton, I think I have better idea what’s happening there.

Seems like EExpression::compile needs ability to resolve variables, and variables can be introduced by mulpiple scopes (env, context, or stage). So when compiling expression outside of a scope of a stage it would make sense to only have (env and context) as scopes and make the stage scope optional.

Additionally, code wise, stage scope seems to be introduced by the pattern like: https://github.com/10gen/mongo/blob/c8fdce8af95e50b18d1cc821c37ec46adb24851a/src/mongo/db/exec/sbe/stages/project.cpp#L60

 

ctx.root = this;
auto code = expr->compile(ctx);

where the intent seems to be that a caller wants to invoke compile method with “this” stage in scope as well.

 

Lets consider case like this:

FooStage::prepare() {
  ...  
ctx.root = this;
auto code = expr->compile(ctx);
// ctx.root remains == this
...
}
 
BarNotAStage::prepare() {
  // Unless ctx.root is set to null right here, the ctx.root may be undefined, as there can be any lingering value from previous compilation, like FooStage::prepare(). 
  auto code = expr->compile(ctx); 
}

So few options I can think of here:

1) We can require that non-stage compilation, should do ctx.root = null before calling expr->compile(ctx)
This is awkward, as it puts the burden on the caller to null things it doesnt care about.  In stage cases, setting ctx.root is ok, as they actually care about that. A lingering ctx.root set to stage after compilation, could also be a catalyst of a segfault (if FooStage gets destroyed, and ctx.root gets accidentally used).

2) Make caller responsible for clearing the ctx after its being used, like:

FooStage::prepare() {
 ...
ctx.root = this; 
auto code = expr->compile(ctx);
ctx.root = null; // reset the ctx.root back to null, to not store the ptr for any longer than really needed
}
 
BarNotAStage::prepare() {
 // ctx.root is expected here to be null. We could add assert(ctx.root ==null) if we are paranoid.
 auto code = expr->compile(ctx); 
}

 

3) Since setting/unsetting ctx.root is pretty much pushing and popping the stage from the scope we could add explicit methods to do that, hopefully making the intention more obvious to the future developer stumbling across is.

FooStage::prepare() { 
... 
ctx.pushStageScope(this); // can assert that no other stage is in scope
auto code = expr->compile(ctx);
ctx.popStageScope();// can assert that is a stage to pop.
}
 
BarNotAStage::prepare() {
 // ctx.root is expected to not be set here
expr->compile(ctx); it
}

4)  1,2,3 have weakness of requiring that a dev remembers t do a cleanup one way or the other. Alternative can be a RAII / immutable-stack approach:
 

FooStage::prepare() { 
...
CompileCtx myCtx = ctx.pushStageScope(this); // result in a "copy", doesnt mutate original ctx.
auto code = expr->compile(myCtx);
} // myCtx is out of scope, making leak impossible.
 
BarNotAStage::prepare() { 
// ctx.root is expected to not be set here
expr->compile(ctx);
}

 

I'm inlined to suggest approach 3) is probably the best we can have without overcomplicating or making invasive changes.

Thoughts? martin.neupauer anton.korshunov

 

 

 

 

Comment by Anna Wawrzyniak [ 16/Dec/21 ]

anton.korshunov  Does the new optimizer also use PlanStages for some legitimate reasons,

or is it that PlanStages are not used in the new optimizer (and would eventually become a dead code) and the new optimizer only passes the empty PlanStage to the CompileCtx because CompileCtx is tied to it and expects it?

If the latter, then maybe CompileCtx should be refactored in a way that it has no dependency on PlanStage type at all?

In current master it seems that all production code actually sets and expects the root PlanStage, so it seems appropriate that fix in master should be adding an tassert to guard current contract. While making it optional or not having it at all would be a separate change in the optimizer branch. Thoughts?

Comment by Kyle Suarez [ 21/Sep/21 ]

Or tassert, instead of invariant.

Generated at Thu Feb 08 05:48:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.