[SERVER-37750] Optimized $sample stage does not yield Created: 25/Oct/18  Updated: 29/Oct/23  Resolved: 15/Nov/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.4.0, 3.6.0, 4.0.0
Fix Version/s: 3.4.19, 3.6.10, 4.0.5, 4.1.6

Type: Bug Priority: Major - P3
Reporter: Charlie Swanson Assignee: Bernard Gorman
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-36871 $sample can loop infinitely on orphan... Closed
related to SERVER-36872 Comment out $sample tests in testshar... Closed
related to SERVER-24664 Get rid of calls to ShardingState::ge... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6, v3.4
Sprint: Query 2018-11-19
Participants:
Case:
Linked BF Score: 50

 Description   

In the optimized $sample path where we use a pseudo-random cursor (docs), we need to insert a ShardFilterStage to ensure the $sample stage only returns documents which belong to this shard (i.e. it needs to filter out orphans and in-progress migrations). To construct this ShardFilterStage we need to consult the CollectionShardingState to get the metadata for the collection, including the shard key. Accessing this state needs to be done under a lock, so we take a lock to do so:

createRandomCursorExecutor

    auto stage = stdx::make_unique<MultiIteratorStage>(opCtx, ws.get(), collection);
    stage->addIterator(std::move(rsRandCursor));
 
    {
        AutoGetCollectionForRead autoColl(opCtx, collection->ns());
 
        // If we're in a sharded environment, we need to filter out documents we don't own.
        if (ShardingState::get(opCtx)->needCollectionMetadata(opCtx, collection->ns().ns())) {
            auto shardFilterStage = stdx::make_unique<ShardFilterStage>(
                opCtx,
                CollectionShardingState::get(opCtx, collection->ns())->getMetadata(opCtx),
                ws.get(),
                stage.release());
            return PlanExecutor::make(opCtx,
                                      std::move(ws),
                                      std::move(shardFilterStage),
                                      collection,
                                      PlanExecutor::YIELD_AUTO);
        }
    }
 
    return PlanExecutor::make(
        opCtx, std::move(ws), std::move(stage), collection, PlanExecutor::YIELD_AUTO);

This lock is actually unnecessary, because the caller of 'createRandomCursorExecutor' already holds a lock on the collection - we must hold a lock during the entirety of PipelineD::createCursorSource, since we are interacting with a raw Collection*.

Taking this extra lock may seem benign, but it has an unintended side effect: acquiring the lock recursively prevents yielding:

PlanYieldPolicy

PlanYieldPolicy::PlanYieldPolicy(PlanExecutor* exec, PlanExecutor::YieldPolicy policy)
    : _policy(exec->getOpCtx()->lockState()->isGlobalLockedRecursively() ? PlanExecutor::NO_YIELD
                                                                         : policy),

Because of this, the PlanExecutor powering the $sample stage will not yield, despite passing the explicit YIELD_AUTO policy (shown above).

This is obviously surprising. I'm not entirely certain of the motivation for this behavior, but I believe it was engineered to prevent surprise yielding of locks by operations that need to execute some sort of sub-query. For example, a replication command or internal thread may need to find the most recent oplog entry, and would not want that sub-operation to yield the locks on the oplog, despite that being the default behavior.

It looks like this change can be traced back to this commit, which added the AutoGetCollection lock as part of an unrelated refactor (defensively, I assume, just to be safe). That extra lock has been there ever since, meaning this bug affects versions going back to 3.4.



 Comments   
Comment by Githook User [ 22/Nov/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37750 Do not run server37750.js on MMAPv1 variants
Branch: v3.6
https://github.com/mongodb/mongo/commit/5b35edfd75f860ea48e5fadaf8e8e48f2d304e90

Comment by Githook User [ 21/Nov/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37750 Do not run server37750.js on MMAPv1 variants
Branch: v4.0
https://github.com/mongodb/mongo/commit/62091e1667845e6a1ebc9293f04a9d20928872e2

Comment by Githook User [ 21/Nov/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37750 Optimized $sample stage does not yield

(cherry picked from commit 4177d6d22ab3329a8607bf80a62aa03d4fb2c528)
Branch: v3.4
https://github.com/mongodb/mongo/commit/dc03cd006d7bf6bed595d73848d6d8208fa18d26

Comment by Githook User [ 20/Nov/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37750 Optimized $sample stage does not yield

(cherry picked from commit 4177d6d22ab3329a8607bf80a62aa03d4fb2c528)
Branch: v3.6
https://github.com/mongodb/mongo/commit/072f6a59fe9ae19b77507a550c50b973ef3124d2

Comment by Githook User [ 20/Nov/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37750 Optimized $sample stage does not yield

(cherry picked from commit 4177d6d22ab3329a8607bf80a62aa03d4fb2c528)
Branch: v4.0
https://github.com/mongodb/mongo/commit/bba874c3fd9f43e9f8a6820fc600f3d86e1d4099

Comment by Githook User [ 15/Nov/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37750 Optimized $sample stage does not yield
Branch: master
https://github.com/mongodb/mongo/commit/4177d6d22ab3329a8607bf80a62aa03d4fb2c528
