[SERVER-34633] Allow $currentOp to retrieve operations from all members of each shard in a cluster
Created: 24/Apr/18  Updated: 29/Oct/23  Resolved: 13/Jun/23

Status: Closed
Project: Core Server
Component/s: Aggregation Framework, Diagnostics
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Improvement
Priority: Major - P3
Reporter: Bernard Gorman
Assignee: Adi Agrawal
Resolution: Fixed
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-73557 Add ability for mongos to broadcast t... Closed
Documented
is documented by DOCS-16200 [SERVER] Investigate changes in SERVE... Closed
Related
related to SERVER-45032 Allow $planCacheStats to target every... Closed
related to SERVER-73557 Add ability for mongos to broadcast t... Closed
is related to SERVER-51117 report index information on a per-sha... Closed
is related to SERVER-8136 allow db.currentOp() from mongos to s... Closed
is related to SERVER-76353 Add the ability to force a query to b... Open
is related to SERVER-44823 Sharding support for $planCacheStats Closed
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Sprint: QE 2023-05-29, QE 2023-06-12, QE 2023-06-26
Participants:
Case:

 Description   

With the introduction of the $currentOp aggregation stage, users can obtain a list of operations running on Secondaries in a sharded cluster by setting the appropriate readPreference. However, this only returns the operations from a single eligible Secondary in each shard. The standard approach to more fine-grained targeting, using replica set tags, is onerous and does not satisfactorily address this shortcoming.

Add a new flag to $currentOp which, if set, stipulates that it should target every data-bearing member in each shard and return an exhaustive list of all operations running anywhere in the cluster.
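
As a rough illustration of the proposed flag, the sketch below builds a $currentOp pipeline that asks for every node. The option name targetAllNodes is the one documented for $currentOp from MongoDB 7.1 onward; this ticket does not name the flag, so treat it as an assumption here. buildCurrentOpPipeline is a hypothetical helper, not part of any shell API.

```javascript
// Hypothetical helper: builds an aggregation pipeline around $currentOp.
// Assumption: the flag is called `targetAllNodes` (as documented for
// MongoDB 7.1+); the ticket text itself does not name it.
function buildCurrentOpPipeline(filter, allNodes) {
  var stage = { allUsers: true, idleSessions: true };
  if (allNodes) {
    // Ask for operations from every data-bearing member of each shard.
    stage.targetAllNodes = true;
  }
  var pipeline = [{ $currentOp: stage }];
  // Only add a $match stage when a non-empty filter was supplied.
  if (filter && Object.keys(filter).length > 0) {
    pipeline.push({ $match: filter });
  }
  return pipeline;
}

var pipeline = buildCurrentOpPipeline({ active: true }, true);
// In a shell connected to a mongos, this could then be run as:
//   db.getSiblingDB("admin").aggregate(pipeline)
```

Keeping the pipeline construction separate from the call makes it easy to inspect the stage before sending it to a cluster.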



 Comments   
Comment by Githook User [ 13/Jun/23 ]

Author: Adityavardhan Agrawal <adi.agrawal@mongodb.com> (username: Adityav369)

Message: SERVER-34633 Allow $currentOp to retrieve operations from all members of each shard in a cluster
Branch: master
https://github.com/mongodb/mongo/commit/087671ac9132131271c195cabaa0377cb3687efc

Comment by Ana Meza [ 14/Feb/23 ]

Note: If the prerequisite SERVER-73557 ticket does not get in, we shouldn’t leave this ticket in Quick Wins

Comment by Davis Haupt (Inactive) [ 10/Feb/23 ]

With SERVER-73557 completed, this will likely be a good quick win nomination.

Comment by Alex Bevilacqua [ 15/Oct/20 ]

We have a helper function that can be used to filter $currentOp results across replica set members within a sharded cluster.

// Print $currentOp results from every replica set member of every shard.
// Run from a shell connected to a mongos. Authentication is not handled.
function printCurrentShardOperations(filter, project) {
  db.getSiblingDB("config").shards.find().forEach(function (s) {
    // s.host is the shard's replica set connection string.
    var shardRS = new Mongo(s.host);
    print(s._id);
    shardRS.adminCommand({ replSetGetStatus: 1 }).members.forEach(function (d) {
      print("\t" + d.name + " (" + d.stateStr + ")");
      // Connect directly to the member and allow reads on Secondaries.
      var shardMember = new Mongo(d.name);
      shardMember.setSlaveOk(true);
      shardMember.getDB("admin").aggregate([
        { $currentOp: { allUsers: true, idleSessions: true } },
        { $match: filter },
        { $project: project }
      ]).forEach(function (currentOp) {
        printjson(currentOp);
      });
    });
  });
}

// Example: show the progress of index builds on every shard member.
printCurrentShardOperations(
  { msg: /^Index Build/ },
  { msg: 1, progress: 1, secs_running: 1 }
);

This will produce results similar to:

shard01
	localhost:27018 (PRIMARY)
	localhost:27019 (SECONDARY)
	localhost:27020 (SECONDARY)
shard02
	localhost:27021 (PRIMARY)
	localhost:27022 (SECONDARY)
{
	"secs_running" : NumberLong(0),
	"msg" : "Index Build (background) Index Build (background): 62366/477064 13%",
	"progress" : {
		"done" : 62366,
		"total" : 477064
	}
}
	localhost:27023 (SECONDARY)
{
	"secs_running" : NumberLong(0),
	"msg" : "Index Build (background) Index Build (background): 56853/477064 11%",
	"progress" : {
		"done" : 56854,
		"total" : 477064
	}
}

Note that this is a simple example that does not account for authentication. I'm providing it here as-is to showcase one interim approach to this problem.

Comment by Andy Schwerin [ 02/May/18 ]

This is a pretty cool idea. I think a correct solution would need to return results as they arrived, and handle down nodes, which would make this interesting. That said, it would be a very expensive operation on large sharded clusters (hundreds of nodes), and still would omit operations that run exclusively on routers, so it would still be a somewhat incomplete view.
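
The two requirements above (emit results per node as they become available, and tolerate down nodes) can be sketched in plain JavaScript with a generator. This is a sequential simplification, not how the server implements it: streamOps and fakeFetchOps are hypothetical names, and fetchOps is an assumed callback that returns a node's operations or throws when the node is unreachable.

```javascript
// Hypothetical sketch: visit each node in turn and yield one result per node,
// so a caller can consume output incrementally instead of waiting for all nodes.
// `fetchOps` is an assumed callback (host -> array of operations) that throws
// when the node is down.
function* streamOps(hosts, fetchOps) {
  for (const host of hosts) {
    let result;
    try {
      result = { host: host, ok: true, ops: fetchOps(host) };
    } catch (err) {
      // A down node is reported rather than aborting the whole scan.
      result = { host: host, ok: false, error: err.message };
    }
    yield result;
  }
}

// Stub standing in for a real per-node $currentOp call.
function fakeFetchOps(host) {
  if (host === "localhost:27019") {
    throw new Error("connection refused");
  }
  return [{ host: host, op: "command" }];
}

const results = Array.from(
  streamOps(["localhost:27018", "localhost:27019"], fakeFetchOps)
);
```

A caller that wants incremental output would iterate the generator with for...of rather than collecting it into an array as done here.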

Generated at Thu Feb 08 04:37:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.