[SERVER-33029] Support snapshot in cluster aggregate command Created: 30/Jan/18  Updated: 29/Oct/23  Resolved: 27/Mar/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.7.4

Type: Task Priority: Major - P3
Reporter: Misha Tyulenev Assignee: Misha Tyulenev
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-33016 API to get/set lastCommittedOpTime on... Closed
depends on SERVER-33027 compute atClusterTime Closed
depends on SERVER-33062 Amend command with readConcern atClus... Closed
depends on SERVER-33702 Move sessionId and txnNumber addition... Closed
Related
related to SERVER-33683 Allow aggregation $mergeCursors stage... Closed
is related to SERVER-34014 Add unit tests to cluster aggregate Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2018-03-26, Sharding 2018-04-09
Participants:

 Description   

The task is to add support for readConcern level snapshot to the sharded aggregate command. The implementation assumes that the shards also support it by establishing snapshots at the passed atClusterTime argument.

Implementation:

1. Compute atClusterTime: ACT SERVER-33027

Algorithm: compute the greatest lastCommittedOpTime among the targeted shards.
The function should be added to cluster_commands_helpers.h, since other cluster commands will use it as well.

/**
 * Computes the greatest lastCommittedOpTime among the targeted shards.
 */
LogicalTime computeAtClusterTime(OperationContext* opCtx, std::set<ShardId> shardIds);

This function should be called once the targeted shards are determined:
https://github.com/mongodb/mongo/blob/r3.7.2/src/mongo/s/commands/cluster_aggregate.cpp#L393
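A minimal sketch of the "greatest lastCommittedOpTime" computation follows. It is not the server implementation: ShardId and LogicalTime are replaced by plain stand-in types, and the LastCommittedByShard lookup table is a hypothetical substitute for whatever SERVER-33016 exposes through opCtx.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Stand-ins for the server's ShardId and LogicalTime types (assumptions for this sketch).
using ShardId = std::string;
using LogicalTime = std::uint64_t;

// Hypothetical table: the last committed opTime that mongos has recorded per shard.
using LastCommittedByShard = std::map<ShardId, LogicalTime>;

// Returns the greatest lastCommittedOpTime among the targeted shards.
// Shards with no recorded value are skipped; the result is 0 if none has one.
LogicalTime computeAtClusterTime(const LastCommittedByShard& lastCommitted,
                                 const std::set<ShardId>& shardIds) {
    LogicalTime atClusterTime = 0;
    for (const auto& shardId : shardIds) {
        auto it = lastCommitted.find(shardId);
        if (it != lastCommitted.end()) {
            atClusterTime = std::max(atClusterTime, it->second);
        }
    }
    return atClusterTime;
}
```

Only the targeted shards contribute, so a shard outside the target set with a later committed time does not push the ACT forward.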

2. Verify targeting

Once the ACT is computed, verify that the targeted shards owned the chunks at the ACT. This will use the multi-version routing table. The function should be added to cluster_commands_helpers.h.

/**
 * Verifies that the shardIds are the same as they were at atClusterTime, using the versioned routing table.
 */
bool verifyTargetedShardsAtClusterTime(OperationContext* opCtx,
                                       std::set<ShardId> shardIds,
                                       LogicalTime atClusterTime);

If the function returns false, use the current cluster time on mongos.
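The check in step 2 could look roughly like the sketch below. The OwnershipEntry and ChunkHistories types are hypothetical stand-ins for the multi-version routing table, whose real API is not described in this ticket; the sketch only illustrates comparing the owners at atClusterTime with the targeted set.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Stand-ins for the server's ShardId and LogicalTime types (assumptions for this sketch).
using ShardId = std::string;
using LogicalTime = std::uint64_t;

// One ownership record: the chunk lived on `shard` starting at `validAfter`.
struct OwnershipEntry {
    LogicalTime validAfter;
    ShardId shard;
};

// Hypothetical versioned routing info for the chunks a query targets.
// Each inner vector is one chunk's history, sorted by ascending validAfter.
using ChunkHistories = std::vector<std::vector<OwnershipEntry>>;

// Returns true if the shards owning the targeted chunks at atClusterTime
// are exactly the shards in shardIds.
bool verifyTargetedShardsAtClusterTime(const ChunkHistories& histories,
                                       const std::set<ShardId>& shardIds,
                                       LogicalTime atClusterTime) {
    std::set<ShardId> ownersAtTime;
    for (const auto& history : histories) {
        const OwnershipEntry* owner = nullptr;
        for (const auto& entry : history) {
            if (entry.validAfter <= atClusterTime) {
                owner = &entry;  // latest entry that is not newer than atClusterTime
            }
        }
        if (owner == nullptr) {
            return false;  // the history does not reach back to atClusterTime
        }
        ownersAtTime.insert(owner->shard);
    }
    return ownersAtTime == shardIds;
}
```

In this model a chunk migration after the ACT makes the check fail, which corresponds to the case where the caller falls back to the current cluster time on mongos.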

3. Amend the command objects sent to individual shards per API:

 lsid, txnNumber, autocommit:true, atClusterTime: ACT 

https://github.com/mongodb/mongo/blob/r3.7.2/src/mongo/s/commands/cluster_aggregate.cpp#L426 is the call site; createCommandForTargetedShards needs to add the missing fields.
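As an illustration of step 3, the sketch below appends the four listed fields to a command. The CommandObj map is a stand-in for a BSON object, and SnapshotReadInfo and appendSnapshotFields are hypothetical names. The fields are added flat, exactly as listed above; where atClusterTime nests in the real command is defined by SERVER-33062 and is not modelled here.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in for a BSON command object: field name -> value rendered as text (assumption).
using CommandObj = std::map<std::string, std::string>;

// Hypothetical bundle of the values mongos has to forward to each shard.
struct SnapshotReadInfo {
    std::string lsid;             // logical session id
    std::int64_t txnNumber;       // transaction number within the session
    std::uint64_t atClusterTime;  // the computed ACT
};

// Returns a copy of cmd with lsid, txnNumber, autocommit:true and atClusterTime added.
CommandObj appendSnapshotFields(CommandObj cmd, const SnapshotReadInfo& info) {
    cmd["lsid"] = info.lsid;
    cmd["txnNumber"] = std::to_string(info.txnNumber);
    cmd["autocommit"] = "true";
    cmd["atClusterTime"] = std::to_string(info.atClusterTime);
    return cmd;
}
```

Taking the command by value keeps the original object untouched, so the same base command can be amended separately for each shard.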

4. Add error handling

Snapshot reads may fail with an error in the SnapshotError error class. Such an error needs to restart the read attempt, up to the configured number of retries.
Catch the error here: https://github.com/mongodb/mongo/blob/r3.7.2/src/mongo/s/commands/cluster_aggregate.cpp#L316
Make sure that aggPassthrough, which calls https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/s/client/shard.cpp#L154, retries by changing https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/s/client/shard_remote.cpp#L103
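The retry behaviour in step 4 can be sketched as a bounded loop. This is only an illustration: the SnapshotError exception type and the runWithSnapshotRetries helper are hypothetical, and the server reports errors through its own status and error-class machinery rather than this exception.

```cpp
#include <stdexcept>

// Hypothetical exception standing in for the SnapshotError error class.
struct SnapshotError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Runs attempt() and restarts it when it throws SnapshotError, allowing at most
// maxRetries restarts. Other exceptions, and the final SnapshotError, propagate.
template <typename Attempt>
auto runWithSnapshotRetries(Attempt attempt, int maxRetries) -> decltype(attempt()) {
    for (int retriesLeft = maxRetries;; --retriesLeft) {
        try {
            return attempt();
        } catch (const SnapshotError&) {
            if (retriesLeft <= 0) {
                throw;  // retries exhausted, surface the error to the caller
            }
            // A real implementation would presumably pick a new atClusterTime
            // before the next attempt.
        }
    }
}
```

With maxRetries set to N the read is attempted at most N + 1 times, and errors outside the SnapshotError class are never retried by this helper.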

Testing

Add integration tests that validate that the aggregate command returns data from the snapshot, i.e.:

send the command with batch size 1 to establish cursors
insert a few documents
send getMore - this getMore should not return the inserted data, as it is in a different snapshot.
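The expectation behind this test can be modelled with a toy in-memory collection, shown below. It is not an integration test and does not talk to a cluster; ToyCollection and SnapshotCursor are hypothetical types that only illustrate why a getMore issued after the inserts should not see them.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy document that records the (simulated) cluster time of its insert.
struct Doc {
    std::uint64_t insertedAt;
    std::string value;
};

// Toy collection with a logical clock that advances on every insert.
struct ToyCollection {
    std::vector<Doc> docs;
    std::uint64_t clock = 0;
    void insert(const std::string& value) { docs.push_back(Doc{++clock, value}); }
};

// Toy cursor pinned to the snapshot at atClusterTime.
class SnapshotCursor {
public:
    SnapshotCursor(const ToyCollection& coll, std::uint64_t atClusterTime)
        : _coll(coll), _atClusterTime(atClusterTime) {}

    // Returns up to batchSize documents visible in the snapshot,
    // resuming where the previous batch ended.
    std::vector<std::string> getMore(std::size_t batchSize) {
        std::vector<std::string> batch;
        while (_pos < _coll.docs.size() && batch.size() < batchSize) {
            const Doc& doc = _coll.docs[_pos++];
            if (doc.insertedAt <= _atClusterTime) {
                batch.push_back(doc.value);  // skip anything newer than the snapshot
            }
        }
        return batch;
    }

private:
    const ToyCollection& _coll;
    std::uint64_t _atClusterTime;
    std::size_t _pos = 0;
};
```

A test following the steps above would read one document, insert more, and then check that subsequent batches contain only documents that existed when the cursor was established.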



 Comments   
Comment by Githook User [ 27/Mar/18 ]

Author:

{'email': 'misha@mongodb.com', 'name': 'Misha Tyulenev', 'username': 'mikety'}

Message: SERVER-33029 support global snapshot aggregate command
Branch: master
https://github.com/mongodb/mongo/commit/fb35ca2b60583936e7c20dd5c47ee34d62b8c5d2

Comment by Charlie Swanson [ 20/Mar/18 ]

This approach sounds good to me misha.tyulenev, please link the follow-up ticket to this one and SERVER-33683 when you file it.

Comment by David Storch [ 07/Mar/18 ]

The proposed changes for SERVER-33541 include the following:

  • Adding aggregate to the whitelist of commands that support readConcern level snapshot.
  • Making the agg system use the "interrupt only" yield policy, so that locks are not yielded during execution. This ensures that we are using two-phase locking.
  • Call into logic to select the lock mode appropriately when acquiring locks in agg. We need to use a MODE_IX rather than MODE_IS lock when the agg is part of an autocommit:false transaction.
  • Implement logic to ban using readConcern level snapshot with various agg stages. This includes things like $collStats, $currentOp, and $out.
Comment by Misha Tyulenev [ 05/Mar/18 ]

david.storch this is correct. The ticket concerned mongos implementation of "sharded aggregate snapshot reads" and only assumes that the "local aggregate" command can handle the atClusterTime which it already does.
Please clarify what features SERVER-33541 will implement.

Comment by David Storch [ 02/Mar/18 ]

misha.tyulenev, FYI I am currently working on the equivalent work item for local snapshot reads: SERVER-33541. I assume that this ticket depends on the changes from SERVER-33541?
