[SERVER-35289] $currentOp takes database locks when causal consistency is requested Created: 30/May/18  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Eric Milkie Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Execution
Operating System: ALL
Participants:

 Description   

The $currentOp aggregation stage needs to take no database locks, as the currentOp command does today.



 Comments   
Comment by Tess Avitabile (Inactive) [ 01/Jun/18 ]

No, this is not blocking other work. It fell out of unrelated testing. However, it seems more important now that we can have a $currentOp blocked on a dropDatabase that is blocked on an open transaction.

Comment by Bernard Gorman [ 01/Jun/18 ]

tess.avitabile milkie behackett: A couple of followup questions to better determine how to prioritise and fix this:

  1. Is this currently blocking any other work (e.g. transactions / transactions diagnostics)? Or did this issue just fall out of unrelated testing?
  2. What should the behaviour of the drivers be in the event that $currentOp is run under causal consistency? If we were to implement the fix discussed above, would this result in the failure being bubbled back up to the client? Should we instead re-run the aggregation without the metadata?

Edit: it looks as though we currently return a generic InvalidOptions error for commands where allowsAfterClusterTime() == false. Would it make sense to return a new code for cases like $currentOp - something like "RetryWithoutAfterClusterTime" - so that drivers could tell that it is safe to resubmit a particular command without the causal metadata?

Comment by Bernard Gorman [ 31/May/18 ]

Is that because the $currentOp aggregation stage doesn't end up consulting the allowsAfterClusterTime() function in current_op_common.h? The changes from SERVER-31613 made it so the server would reject specifying "afterClusterTime" to the "currentOp" command. I think the server ought to be doing the same for the $currentOp aggregation stage then.

max.hirschhorn, this looks like the right solution to me. A $currentOp aggregation won't consult the currentOp command's allowsAfterClusterTime() since that code is entirely separate from the aggregation's PipelineCommand, but we can add an override of the method in PipelineCommand (which currently defers to the default implementation in BasicCommand) to return false in the case that the aggregation starts with $currentOp.

Comment by Max Hirschhorn [ 31/May/18 ]

This appears to happen when causal consistency is enabled.

Is that because the $currentOp aggregation stage doesn't end up consulting the allowsAfterClusterTime() function in current_op_common.h? The changes from SERVER-31613 made it so the server would reject specifying "afterClusterTime" to the "currentOp" command. I think the server ought to be doing the same for the $currentOp aggregation stage then.

We might consider having the shell detect when the user is attempting to run an aggregation with the $currentOp stage and then not send the causal metadata for that command.

We certainly could have the mongo shell and drivers not inject "afterClusterTime" for aggregations involving the $currentOp stage; however, this may run contra to the Driver team's goal of not special casing the behavior for the "aggregate" command by inspecting the pipeline.

Aggregate with $out is a read operation

We intend to migrate away from designs that require drivers to inspect the contents of the aggregation pipeline and override user read preferences for aggregate with $out. In general, our specifications should stop defining different behaviors based on the contents of commands.

https://docs.google.com/document/d/14Ouqk1t7JG2hCqEQ-f7e2qgWHH5SeVpXnjybKKLtpF4/edit#heading=h.a5djszqygppi

Comment by Eric Milkie [ 31/May/18 ]

We might consider having the shell detect when the user is attempting to run an aggregation with the $currentOp stage and then not send the causal metadata for that command. Or it could return an error.
I'm not sure it would be possible to fix this at the server level, since reading from the oplog is required for causal to work.
I believe the command isn't taking locks because the shell knows that command helper doesn't need the causal metadata added to it.

Comment by Tess Avitabile (Inactive) [ 31/May/18 ]

This appears to happen when causal consistency is enabled. For example, I ran the following function under suite causally_consistent_jscore_txns_passthrough:

(function() {
    printjson(db.getSiblingDB("admin").aggregate([{$currentOp: {}}]).toArray());
    printjson(db.currentOp().inprog);
}());

I saw that the aggregation did take locks:

{
[js_test:repro] 2018-05-31T10:32:51.067-0400 		"host" : "tesseract:20000",
[js_test:repro] 2018-05-31T10:32:51.067-0400 		"desc" : "conn14",
[js_test:repro] 2018-05-31T10:32:51.067-0400 		"connectionId" : 14,
[js_test:repro] 2018-05-31T10:32:51.067-0400 		"client" : "127.0.0.1:57674",
[js_test:repro] 2018-05-31T10:32:51.067-0400 		"appName" : "MongoDB Shell",
[js_test:repro] 2018-05-31T10:32:51.067-0400 		"clientMetadata" : {
[js_test:repro] 2018-05-31T10:32:51.068-0400 			"application" : {
[js_test:repro] 2018-05-31T10:32:51.068-0400 				"name" : "MongoDB Shell"
[js_test:repro] 2018-05-31T10:32:51.068-0400 			},
[js_test:repro] 2018-05-31T10:32:51.068-0400 			"driver" : {
[js_test:repro] 2018-05-31T10:32:51.068-0400 				"name" : "MongoDB Internal Client",
[js_test:repro] 2018-05-31T10:32:51.068-0400 				"version" : "4.1.0-104-ge7fd01e"
[js_test:repro] 2018-05-31T10:32:51.068-0400 			},
[js_test:repro] 2018-05-31T10:32:51.069-0400 			"os" : {
[js_test:repro] 2018-05-31T10:32:51.069-0400 				"type" : "Linux",
[js_test:repro] 2018-05-31T10:32:51.069-0400 				"name" : "Ubuntu",
[js_test:repro] 2018-05-31T10:32:51.069-0400 				"architecture" : "x86_64",
[js_test:repro] 2018-05-31T10:32:51.069-0400 				"version" : "15.10"
[js_test:repro] 2018-05-31T10:32:51.069-0400 			}
[js_test:repro] 2018-05-31T10:32:51.069-0400 		},
[js_test:repro] 2018-05-31T10:32:51.070-0400 		"active" : true,
[js_test:repro] 2018-05-31T10:32:51.070-0400 		"currentOpTime" : "2018-05-31T10:32:51.061-0400",
[js_test:repro] 2018-05-31T10:32:51.070-0400 		"opid" : 111,
[js_test:repro] 2018-05-31T10:32:51.070-0400 		"secs_running" : NumberLong(0),
[js_test:repro] 2018-05-31T10:32:51.070-0400 		"microsecs_running" : NumberLong(130),
[js_test:repro] 2018-05-31T10:32:51.070-0400 		"op" : "command",
[js_test:repro] 2018-05-31T10:32:51.071-0400 		"ns" : "admin.$cmd.aggregate",
[js_test:repro] 2018-05-31T10:32:51.071-0400 		"command" : {
[js_test:repro] 2018-05-31T10:32:51.071-0400 			"aggregate" : 1,
[js_test:repro] 2018-05-31T10:32:51.071-0400 			"pipeline" : [
[js_test:repro] 2018-05-31T10:32:51.071-0400 				{
[js_test:repro] 2018-05-31T10:32:51.071-0400 					"$currentOp" : {
[js_test:repro] 2018-05-31T10:32:51.071-0400 
[js_test:repro] 2018-05-31T10:32:51.072-0400 					}
[js_test:repro] 2018-05-31T10:32:51.072-0400 				}
[js_test:repro] 2018-05-31T10:32:51.072-0400 			],
[js_test:repro] 2018-05-31T10:32:51.072-0400 			"cursor" : {
[js_test:repro] 2018-05-31T10:32:51.072-0400 
[js_test:repro] 2018-05-31T10:32:51.072-0400 			},
[js_test:repro] 2018-05-31T10:32:51.072-0400 			"$clusterTime" : {
[js_test:repro] 2018-05-31T10:32:51.073-0400 				"clusterTime" : Timestamp(1527777170, 7),
[js_test:repro] 2018-05-31T10:32:51.073-0400 				"signature" : {
[js_test:repro] 2018-05-31T10:32:51.073-0400 					"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:repro] 2018-05-31T10:32:51.073-0400 					"keyId" : NumberLong(0)
[ReplicaSetFixture:job0:primary] 2018-05-31T10:32:51.073-0400 I NETWORK  [conn14] end connection 127.0.0.1:57674 (0 connections now open)
[js_test:repro] 2018-05-31T10:32:51.073-0400 				}
[js_test:repro] 2018-05-31T10:32:51.074-0400 			},
[js_test:repro] 2018-05-31T10:32:51.074-0400 			"readConcern" : {
[js_test:repro] 2018-05-31T10:32:51.074-0400 				"afterClusterTime" : Timestamp(1527777170, 7)
[js_test:repro] 2018-05-31T10:32:51.074-0400 			},
[js_test:repro] 2018-05-31T10:32:51.074-0400 			"$db" : "admin"
[js_test:repro] 2018-05-31T10:32:51.074-0400 		},
[js_test:repro] 2018-05-31T10:32:51.075-0400 		"numYields" : 0,
[js_test:repro] 2018-05-31T10:32:51.075-0400 		"locks" : {
[js_test:repro] 2018-05-31T10:32:51.075-0400 
[js_test:repro] 2018-05-31T10:32:51.075-0400 		},
[js_test:repro] 2018-05-31T10:32:51.075-0400 		"waitingForLock" : false,
[js_test:repro] 2018-05-31T10:32:51.075-0400 		"lockStats" : {
[js_test:repro] 2018-05-31T10:32:51.075-0400 			"Global" : {
[js_test:repro] 2018-05-31T10:32:51.076-0400 				"acquireCount" : {
[js_test:repro] 2018-05-31T10:32:51.076-0400 					"r" : NumberLong(2)
[js_test:repro] 2018-05-31T10:32:51.076-0400 				}
[js_test:repro] 2018-05-31T10:32:51.076-0400 			},
[js_test:repro] 2018-05-31T10:32:51.076-0400 			"Database" : {
[js_test:repro] 2018-05-31T10:32:51.076-0400 				"acquireCount" : {
[js_test:repro] 2018-05-31T10:32:51.076-0400 					"r" : NumberLong(1)
[js_test:repro] 2018-05-31T10:32:51.077-0400 				}
[js_test:repro] 2018-05-31T10:32:51.077-0400 			},
[js_test:repro] 2018-05-31T10:32:51.077-0400 			"oplog" : {
[js_test:repro] 2018-05-31T10:32:51.077-0400 				"acquireCount" : {
[js_test:repro] 2018-05-31T10:32:51.077-0400 					"r" : NumberLong(1)
[js_test:repro] 2018-05-31T10:32:51.077-0400 				}
[js_test:repro] 2018-05-31T10:32:51.078-0400 			}
[js_test:repro] 2018-05-31T10:32:51.078-0400 		}
[js_test:repro] 2018-05-31T10:32:51.078-0400 	}

The currentOp command did not take locks:

[js_test:repro] 2018-05-31T10:32:51.099-0400 	{
[js_test:repro] 2018-05-31T10:32:51.099-0400 		"host" : "tesseract:20000",
[js_test:repro] 2018-05-31T10:32:51.099-0400 		"desc" : "conn14",
[js_test:repro] 2018-05-31T10:32:51.099-0400 		"connectionId" : 14,
[js_test:repro] 2018-05-31T10:32:51.099-0400 		"client" : "127.0.0.1:57674",
[js_test:repro] 2018-05-31T10:32:51.100-0400 		"appName" : "MongoDB Shell",
[js_test:repro] 2018-05-31T10:32:51.100-0400 		"clientMetadata" : {
[js_test:repro] 2018-05-31T10:32:51.100-0400 			"application" : {
[js_test:repro] 2018-05-31T10:32:51.100-0400 				"name" : "MongoDB Shell"
[js_test:repro] 2018-05-31T10:32:51.100-0400 			},
[js_test:repro] 2018-05-31T10:32:51.100-0400 			"driver" : {
[js_test:repro] 2018-05-31T10:32:51.100-0400 				"name" : "MongoDB Internal Client",
[js_test:repro] 2018-05-31T10:32:51.101-0400 				"version" : "4.1.0-104-ge7fd01e"
[js_test:repro] 2018-05-31T10:32:51.101-0400 			},
[js_test:repro] 2018-05-31T10:32:51.101-0400 			"os" : {
[js_test:repro] 2018-05-31T10:32:51.101-0400 				"type" : "Linux",
[js_test:repro] 2018-05-31T10:32:51.101-0400 				"name" : "Ubuntu",
[js_test:repro] 2018-05-31T10:32:51.101-0400 				"architecture" : "x86_64",
[js_test:repro] 2018-05-31T10:32:51.101-0400 				"version" : "15.10"
[js_test:repro] 2018-05-31T10:32:51.101-0400 			}
[js_test:repro] 2018-05-31T10:32:51.102-0400 		},
[js_test:repro] 2018-05-31T10:32:51.102-0400 		"active" : true,
[js_test:repro] 2018-05-31T10:32:51.102-0400 		"currentOpTime" : "2018-05-31T10:32:51.067-0400",
[js_test:repro] 2018-05-31T10:32:51.102-0400 		"opid" : 112,
[js_test:repro] 2018-05-31T10:32:51.102-0400 		"secs_running" : NumberLong(0),
[js_test:repro] 2018-05-31T10:32:51.102-0400 		"microsecs_running" : NumberLong(110),
[js_test:repro] 2018-05-31T10:32:51.102-0400 		"op" : "command",
[js_test:repro] 2018-05-31T10:32:51.103-0400 		"ns" : "admin.$cmd.aggregate",
[js_test:repro] 2018-05-31T10:32:51.103-0400 		"command" : {
[js_test:repro] 2018-05-31T10:32:51.103-0400 			"currentOp" : 1,
[js_test:repro] 2018-05-31T10:32:51.103-0400 			"$clusterTime" : {
[js_test:repro] 2018-05-31T10:32:51.103-0400 				"clusterTime" : Timestamp(1527777170, 7),
[js_test:repro] 2018-05-31T10:32:51.103-0400 				"signature" : {
[js_test:repro] 2018-05-31T10:32:51.104-0400 					"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:repro] 2018-05-31T10:32:51.104-0400 					"keyId" : NumberLong(0)
[js_test:repro] 2018-05-31T10:32:51.104-0400 				}
[js_test:repro] 2018-05-31T10:32:51.104-0400 			},
[js_test:repro] 2018-05-31T10:32:51.104-0400 			"$db" : "admin"
[js_test:repro] 2018-05-31T10:32:51.104-0400 		},
[js_test:repro] 2018-05-31T10:32:51.104-0400 		"numYields" : 0,
[js_test:repro] 2018-05-31T10:32:51.105-0400 		"locks" : {
[js_test:repro] 2018-05-31T10:32:51.105-0400 
[js_test:repro] 2018-05-31T10:32:51.105-0400 		},
[js_test:repro] 2018-05-31T10:32:51.105-0400 		"waitingForLock" : false,
[js_test:repro] 2018-05-31T10:32:51.105-0400 		"lockStats" : {
[js_test:repro] 2018-05-31T10:32:51.105-0400 
[js_test:repro] 2018-05-31T10:32:51.105-0400 		}
[js_test:repro] 2018-05-31T10:32:51.105-0400 	}

The aggregation did not take locks when causal consistency was not enabled.

Comment by Eric Milkie [ 31/May/18 ]

I defer to tess.avitabile; she was experiencing the locking behavior and can help diagnose.

Comment by Bernard Gorman [ 31/May/18 ]

milkie - does it? Part of the work during the initial $currentOp implementation was to ensure that it avoided taking any locks (see here and the accompanying code review). There's a test here which is designed to confirm that $currentOp is able to retrieve results while the server is locked.

Also: if the $currentOp agg stage takes locks, then I'd expect that the currentOp command also does? The latter has been internally implemented as a $currentOp aggregation ever since the agg stage was added.

Where are you seeing these locks being taken? I don't think it's anywhere in the aggregation code itself - is it somewhere in the command preamble?

Generated at Thu Feb 08 04:39:24 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.