[SERVER-68575] Investigate how to store storage state with aggregation pipelines that can be shared across ClientCursors (e.g. $exchange) Created: 04/Aug/22  Updated: 04/Aug/22  Resolved: 04/Aug/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Dianna Hohensee (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-62804 Support retention of valid cursor acr... Backlog
Sprint: Execution Team 2022-08-08
Participants:

 Comments   
Comment by Dianna Hohensee (Inactive) [ 04/Aug/22 ]

When an aggregation sets up multiple ClientCursors, those ClientCursors are never accessed in the thread that created them. The aggregate command sets up the ClientCursors' PlanExecutors/Stages but never fetches any data through them before depositing them in the CursorManager and exiting.

Therefore, I do not believe we need to set up any RecoveryUnit state for these ClientCursors. Rather, we can just flag the ClientCursors/PlanExecutors so that they stash the RecoveryUnit after the first getMore command that accesses data.

Note: this solution also avoids an issue that I realized would exist with lock-free reads (LFR) if we were to use multiple RecoveryUnits. LFR specially sets up the first RecoveryUnit via AutoGetCollection*, so any subsequent RecoveryUnits would be uninitialized. We could have tried to duplicate the initialization, but fortunately I don't think we need to, because we can skip stashing any RecoveryUnits with the ClientCursors at setup time.
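
The flagging idea above might look roughly like the following. This is only a sketch under assumptions: `ClientCursorSketch`, `OperationContextSketch`, `RecoveryUnitSketch`, and the `checkOut`/`checkIn` methods are hypothetical stand-ins invented for illustration, not the server's actual classes or API.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration only; not the server's real classes.
struct RecoveryUnitSketch {};

class OperationContextSketch {
public:
    // Models the per-operation setup that AutoGetCollection* would perform.
    void initializeRecoveryUnit() { _ru = std::make_unique<RecoveryUnitSketch>(); }
    std::unique_ptr<RecoveryUnitSketch> releaseRecoveryUnit() { return std::move(_ru); }
    void setRecoveryUnit(std::unique_ptr<RecoveryUnitSketch> ru) { _ru = std::move(ru); }
    bool hasRecoveryUnit() const { return _ru != nullptr; }

private:
    std::unique_ptr<RecoveryUnitSketch> _ru;
};

class ClientCursorSketch {
public:
    // The aggregate command creates the cursor without any RecoveryUnit and
    // only flags that one should be stashed once a getMore has run.
    explicit ClientCursorSketch(bool stashAfterFirstGetMore)
        : _stashAfterFirstGetMore(stashAfterFirstGetMore) {}

    bool hasStashedRecoveryUnit() const { return _stashed != nullptr; }

    // Start of a getMore: reuse the stashed RecoveryUnit if there is one,
    // otherwise let the getMore operation initialize its own.
    void checkOut(OperationContextSketch& opCtx) {
        if (_stashed) {
            opCtx.setRecoveryUnit(std::move(_stashed));
            return;
        }
        assert(_stashAfterFirstGetMore);
        opCtx.initializeRecoveryUnit();
    }

    // End of a getMore: stash the RecoveryUnit with the cursor.
    void checkIn(OperationContextSketch& opCtx) {
        assert(opCtx.hasRecoveryUnit());
        _stashed = opCtx.releaseRecoveryUnit();
        _stashAfterFirstGetMore = false;
    }

private:
    bool _stashAfterFirstGetMore;
    std::unique_ptr<RecoveryUnitSketch> _stashed;
};
```

In this model the first getMore is the only operation that initializes a RecoveryUnit, so nothing uninitialized is ever stashed at cursor creation.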

Comment by Dianna Hohensee (Inactive) [ 04/Aug/22 ]

Notes: Examples of possible PlanExecutor stage trees/dependencies

  ClientCursor 1 (RU 1)    ClientCursor 2 (RU 2)
         |                        |
          \                      /
               SBEExchange
                    |
               SBEScan (RU 3)
 
 
void restore() {
    // The storage cursor must belong to the same session as the restoring operation.
    invariant(_cursor->session() == _opCtx->session());
}
 
 
 
 
 (thread 1)   (thread 2)
      |       |
   Group     Group
   /          |
 Pscan A     Pscan B
 
 
    
 
 
Lookup 1      Lookup 2
|                 |
DocumentSourceExchange
      |
 DocumentSourceMatch
       |
DocumentSourceCursor (does batching)
 
 SBEExchangeConsumer     SBEExchangeConsumer
            \                /
               SBEExchange
                    |
                  Scan
 
 
 
ClientCursor 1           ClientCursor 2
   |
 Lookup
   |                        |
   \                        |
ExchangeConsumer          ExchangeConsumer
          |                |
          |                |
           \            /
              Exchange (mutex)
          <yielding only happens below here>
                |
            DocumentSourceCursor (does batching)
                |
               Filter
                |
               Scan (cursors under Session 1)
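
A rough model of the "Exchange (mutex)" node in the tree above, where two consumers share one subtree. This is a sketch under assumptions: `ExchangeSketch`, its `getNext(consumerId)` method, the integer "documents", and the modulo routing are all hypothetical and chosen only to keep the example small; the real exchange policy and classes may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical model of a shared exchange node; not the server's class.
class ExchangeSketch {
public:
    ExchangeSketch(std::vector<int> source, std::size_t consumers)
        : _source(std::move(source)), _buffers(consumers) {}

    // Called by each consumer (one per ClientCursor), possibly from different
    // threads. The mutex serializes all access to the shared subtree below.
    std::optional<int> getNext(std::size_t consumerId) {
        std::lock_guard<std::mutex> lk(_mutex);
        auto& mine = _buffers[consumerId];
        // Pull from the shared source until this consumer has a document
        // or the source is exhausted, routing each document by value.
        while (mine.empty() && _pos < _source.size()) {
            int doc = _source[_pos++];
            _buffers[static_cast<std::size_t>(doc) % _buffers.size()].push_back(doc);
        }
        if (mine.empty()) {
            return std::nullopt;
        }
        int doc = mine.front();
        mine.pop_front();
        return doc;
    }

private:
    std::mutex _mutex;
    std::vector<int> _source;   // stands in for the Scan/Filter subtree
    std::size_t _pos = 0;
    std::vector<std::deque<int>> _buffers;  // one buffer per consumer
};
```

Because only the code under the mutex touches the shared source, any storage state for that subtree is owned by whichever consumer thread currently holds the lock.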
 
 
 
 
 
PlanExecutor
 |
FETCH
  |
IXSCAN
