[SERVER-33541] Add snapshot read support for aggregation
Created: 28/Feb/18
Updated: 29/Oct/23
Resolved: 09/Mar/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.7.3

Type: Task
Priority: Major - P3
Reporter: Eric Milkie
Assignee: David Storch
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-11245 Docs for SERVER-32517: Parse readConc... Closed
Duplicate
is duplicated by SERVER-34547 Forbid starting a changeStream within... Closed
Related
related to SERVER-84852 Document work to make agg function wi... Closed
related to SERVER-33683 Allow aggregation $mergeCursors stage... Closed
Backwards Compatibility: Fully Compatible
Sprint: Query 2018-03-12, Query 2018-03-26
Participants:

 Description   
  1. aggregate
    1. Including support for $lookup
    2. The $out stage will be prohibited for use within pipelines executed with readConcern level snapshot due to use of metadata operations.
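As a rough illustration of the rule above, the check can be pictured as a validation pass over the pipeline. This is a minimal sketch, not the server's implementation: the function name `check_snapshot_pipeline`, the constant `SNAPSHOT_BANNED_STAGES`, and the error text are hypothetical, and nested sub-pipelines are not inspected.

```python
# Hypothetical validator; it does not mirror the server's actual code.
SNAPSHOT_BANNED_STAGES = {"$out"}  # per the description: $out uses metadata operations


def check_snapshot_pipeline(pipeline, read_concern_level):
    """Raise ValueError if a banned stage appears under readConcern level snapshot."""
    if read_concern_level != "snapshot":
        return  # the restriction applies only to snapshot reads
    for stage in pipeline:
        # Each stage is a single-key document such as {"$match": {...}}.
        (name,) = stage.keys()
        if name in SNAPSHOT_BANNED_STAGES:
            raise ValueError(
                f"{name} cannot be used with readConcern level 'snapshot'"
            )


# $lookup is supported, so this pipeline passes the check.
lookup_pipeline = [
    {"$match": {"x": 1}},
    {"$lookup": {"from": "other", "localField": "a",
                 "foreignField": "b", "as": "joined"}},
]
check_snapshot_pipeline(lookup_pipeline, "snapshot")
```

Passing a pipeline that ends in `{"$out": "target"}` with level `"snapshot"` raises `ValueError` in this sketch, while the same pipeline under another level is left alone.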


 Comments   
Comment by Githook User [ 09/Mar/18 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-33541 Add readConcern level 'snapshot' support for aggregation.
Branch: master
https://github.com/mongodb/mongo/commit/ed1e2b4d2a4987e3744484f9482fdc7a0e119e94

Comment by David Storch [ 05/Mar/18 ]

In particular, I notice that the locking code will keep a list of resources to unlock when two-phase locking is enabled. Since $lookup normally acquires and drops locks repeatedly, this list could grow quite long. Is this a problem?

This is something which we will probably fix. Dave will confirm that this issue exists and file a ticket as part of the local snapshot reads project.

Filed SERVER-33610 describing this issue.

Comment by David Storch [ 01/Mar/18 ]

Recording answers to some of my open questions after discussing with james.wahlin and milkie:

Should we make an effort to ban snapshot reads for aggregations that are actually reading metadata sources (as opposed to reading actual user data with DocumentSourceCursor or DocumentSourceSampleFromRandomCursor)?

Yes, we should do so, in addition to prohibiting the $out stage. We also need to prohibit change streams, and probably $geoNear.

I think we could end up taking a lot of locks, or taking the same lock recursively many times. In particular $lookup could pose a problem. Any issues we foresee here? I assume we should at least write a test case to make sure this doesn't completely blow up?

No specific problems other than the one below, though we agree that we should have a targeted test case for $lookup.

In particular, I notice that the locking code will keep a list of resources to unlock when two-phase locking is enabled. Since $lookup normally acquires and drops locks repeatedly, this list could grow quite long. Is this a problem?

This is something which we will probably fix. Dave will confirm that this issue exists and file a ticket as part of the local snapshot reads project.
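The concern about the unlock list can be pictured with a toy model. This is only a sketch under an assumption: that each unlock request made while two-phase locking is enabled appends one entry to a pending list, which is the behavior the question describes but which still had to be confirmed. The class `ToyLocker` and all of its members are hypothetical and do not correspond to the server's locker.

```python
class ToyLocker:
    """Toy model only; names are hypothetical, not the server's locking code."""

    def __init__(self, two_phase):
        self.two_phase = two_phase
        self.held = set()
        self.pending_unlocks = []  # resources to release when the transaction ends

    def lock(self, resource):
        self.held.add(resource)

    def unlock(self, resource):
        if self.two_phase:
            # Two-phase locking: keep the lock held and queue the release.
            self.pending_unlocks.append(resource)
        else:
            self.held.discard(resource)

    def end_transaction(self):
        # Release everything that was deferred, then reset the list.
        for resource in self.pending_unlocks:
            self.held.discard(resource)
        self.pending_unlocks.clear()


locker = ToyLocker(two_phase=True)
for _ in range(1000):  # one lock/unlock cycle per input document, as $lookup might do
    locker.lock("collection:foreign")
    locker.unlock("collection:foreign")

print(len(locker.pending_unlocks))  # 1000 entries for a single resource
```

In this model the list grows with the number of lock/unlock cycles rather than with the number of distinct resources, which is the growth the question is worried about.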

Anything somehow related to transactions (but not snapshot reads) that I might be missing?

Not that we could think of. Eric points out that the lock acquisitions in the agg path need to be updated to have a timeout of zero, but that this work can happen in a later sweep.
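For readers unfamiliar with the idea, a zero-timeout acquisition behaves like a try-lock: it either succeeds immediately or fails immediately, and never waits. The sketch below is an analogy using Python's `threading.Lock`, not the server's lock manager; `try_lock` and `collection_lock` are hypothetical names.

```python
import threading


def try_lock(lock):
    """Attempt the lock without waiting; the analogue of a zero timeout."""
    return lock.acquire(blocking=False)


collection_lock = threading.Lock()  # stand-in for a server lock (hypothetical)

first = try_lock(collection_lock)   # the lock was free, so this succeeds
second = try_lock(collection_lock)  # already held, so this fails at once instead of waiting
collection_lock.release()
```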
