[SERVER-26789] Logging for $lookup less complete in 3.4 than 3.2 Created: 26/Oct/16  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework, Logging
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Chris Harris Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 5
Labels: query-44-grooming, storch
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-25005 Execute queries in $lookup and $graph... Closed
is related to SERVER-22622 Improve $lookup explain to indicate q... Needs Scheduling
Assigned Teams:
Query Execution
Operating System: ALL
Steps To Reproduce:

Run `db.setProfilingLevel(0, -1)` on a 3.4 mongod, issue an aggregation containing a `$lookup` stage (one that returns results), and then review the log file.
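For reference, a minimal repro along these lines (the pipeline matches the one in the log lines below; the seed data and index are illustrative):

```javascript
// In the mongo shell against a 3.4 mongod. Seed data is illustrative.
db.setProfilingLevel(0, -1)   // log every operation
db.foo.insert({ x: 1 })
db.bar.insert({ y: 1 })
db.bar.createIndex({ y: 1 })
db.foo.aggregate([
  { $lookup: { from: "bar", localField: "x", foreignField: "y", as: "output" } },
  { $limit: 3 }
])
// 3.2 logs a query/getmore pair per inner find; 3.4 does not.
```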


Description

In 3.2, we log all of the find() queries generated by a $lookup. For example (with profiling level (0, -1)):

2016-10-26T08:33:30.400-0500 I COMMAND  [conn1] query test.bar query: { y: { $eq: 1.0 } } planSummary: IXSCAN { y: 1.0 } cursorid:15498909484 ntoreturn:0 ntoskip:0 keysExamined:101 docsExamined:101 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:101 reslen:3359 locks:{ Global: { acquireCount: { r: 10 } }, Database: { acquireCount: { r: 5 } }, Collection: { acquireCount: { r: 5 } } } 0ms
2016-10-26T08:33:30.402-0500 I COMMAND  [conn1] getmore test.bar query: { y: { $eq: 1.0 } } cursorid:15498909484 ntoreturn:0 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:3 nreturned:449 reslen:14837 locks:{ Global: { acquireCount: { r: 18 } }, Database: { acquireCount: { r: 9 } }, Collection: { acquireCount: { r: 9 } } } 2ms
2016-10-26T08:33:30.403-0500 I COMMAND  [conn1] query test.bar query: { y: { $eq: 2.0 } } planSummary: IXSCAN { y: 1.0 } cursorid:16804514490 ntoreturn:0 ntoskip:0 keysExamined:101 docsExamined:101 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:101 reslen:3353 locks:{ Global: { acquireCount: { r: 22 } }, Database: { acquireCount: { r: 11 } }, Collection: { acquireCount: { r: 11 } } } 0ms
2016-10-26T08:33:30.404-0500 I COMMAND  [conn1] getmore test.bar query: { y: { $eq: 2.0 } } cursorid:16804514490 ntoreturn:0 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:3 nreturned:449 reslen:14837 locks:{ Global: { acquireCount: { r: 30 } }, Database: { acquireCount: { r: 15 } }, Collection: { acquireCount: { r: 15 } } } 0ms
2016-10-26T08:33:30.405-0500 I COMMAND  [conn1] query test.bar query: { y: { $eq: 3.0 } } planSummary: IXSCAN { y: 1.0 } cursorid:14652587107 ntoreturn:0 ntoskip:0 keysExamined:101 docsExamined:101 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:101 reslen:3353 locks:{ Global: { acquireCount: { r: 34 } }, Database: { acquireCount: { r: 17 } }, Collection: { acquireCount: { r: 17 } } } 0ms
2016-10-26T08:33:30.405-0500 I COMMAND  [conn1] getmore test.bar query: { y: { $eq: 3.0 } } cursorid:14652587107 ntoreturn:0 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:3 nreturned:449 reslen:14837 locks:{ Global: { acquireCount: { r: 42 } }, Database: { acquireCount: { r: 21 } }, Collection: { acquireCount: { r: 21 } } } 0ms
2016-10-26T08:33:30.406-0500 I COMMAND  [conn1] command test.foo command: aggregate { aggregate: "foo", pipeline: [ { $lookup: { from: "bar", localField: "x", foreignField: "y", as: "output" } }, { $limit: 3.0 } ], cursor: {} } keyUpdates:0 writeConflicts:0 numYields:1 reslen:62622 locks:{ Global: { acquireCount: { r: 44 } }, Database: { acquireCount: { r: 22 } }, Collection: { acquireCount: { r: 22 } } } protocol:op_command 7ms

I do not see the corresponding entries in 3.4:

2016-10-26T07:45:10.733-0500 I COMMAND  [conn1] command test.$cmd appName: "MongoDB Shell" command: isMaster { isMaster: 1.0, forShell: 1.0 } numYields:0 reslen:174 locks:{} protocol:op_command 0ms
2016-10-26T07:45:16.039-0500 I COMMAND  [conn1] command test.bar appName: "MongoDB Shell" command: aggregate { aggregate: "foo", pipeline: [ { $lookup: { from: "bar", localField: "x", foreignField: "y", as: "output" } }, { $match: { output.val: true } }, { $limit: 1.0 } ], cursor: {} } planSummary: COLLSCAN keysExamined:0 docsExamined:199 cursorExhausted:1 numYields:6 nreturned:1 reslen:20926 locks:{ Global: { acquireCount: { r: 26 } }, Database: { acquireCount: { r: 13 } }, Collection: { acquireCount: { r: 12 } } } protocol:op_command 5ms

The same data set was used in both environments.



Comments
Comment by David Storch [ 14/Nov/16 ]

This is a consequence of the changes made in SERVER-25005 to no longer execute queries against the foreign collection using the DBDirectClient.

I would say that this is not merely a reporting problem in the server, but rather a design flaw in the implementation of $lookup. Currently, for each incoming document, $lookup runs a regular query against the foreign collection. This means that the query planner is repeatedly invoked and could, in theory, select a different plan each time. This is why the queries against the foreign collection were logged in older versions, but it also means that $lookup's explain output isn't very useful: see SERVER-22622.
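Conceptually, this per-document behavior amounts to a nested-loop join. A minimal sketch (not server code; collection and field names mirror the pipeline in the logs above):

```javascript
// Sketch of the current $lookup behavior: one query against the foreign
// collection per incoming document. In the server, each inner "filter" here
// is a fresh find(), so the planner runs again every time.
function nestedLoopLookup(local, foreign, localField, foreignField, as) {
  return local.map((doc) => {
    // One lookup per input document.
    const matches = foreign.filter((f) => f[foreignField] === doc[localField]);
    return { ...doc, [as]: matches };
  });
}

const foo = [{ _id: 1, x: 1 }, { _id: 2, x: 2 }];
const bar = [{ _id: 10, y: 1 }, { _id: 11, y: 2 }, { _id: 12, y: 1 }];
const out = nestedLoopLookup(foo, bar, "x", "y", "output");
```

With N local documents, this issues N inner queries, which is exactly why 3.2 produced one logged find/getmore pair per input document.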

Instead, the planner should select a single plan for the entire aggregation, rather than being invoked separately for each involved collection. This would mean that the $lookup would have a stable plan which we can report on in log lines and in explain. It also would mean that the planner could be improved to choose different join strategies, such as a hash join or sort-merge join. Furthermore, I would expect to see substantial performance benefits associated with this change. However, re-implementing $lookup in this fashion would require a large time investment from the query team.
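To illustrate one of the alternative join strategies mentioned above, here is a hash-join sketch (again conceptual, not server code): the foreign collection is scanned once to build a hash table, after which each local document needs only a probe, so the planner would run once for the whole stage rather than once per input document.

```javascript
// Hash-join alternative to the nested-loop approach: build once, probe many.
function hashJoinLookup(local, foreign, localField, foreignField, as) {
  // Build phase: one pass over the foreign collection.
  const table = new Map();
  for (const f of foreign) {
    const key = f[foreignField];
    if (!table.has(key)) table.set(key, []);
    table.get(key).push(f);
  }
  // Probe phase: one constant-time hash lookup per local document.
  return local.map((doc) => ({
    ...doc,
    [as]: table.get(doc[localField]) ?? [],
  }));
}

const foo = [{ _id: 1, x: 1 }, { _id: 2, x: 2 }];
const bar = [{ _id: 10, y: 1 }, { _id: 11, y: 2 }, { _id: 12, y: 1 }];
const joined = hashJoinLookup(foo, bar, "x", "y", "output");
```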

Re-implementing $lookup to not execute as internally-issued find() operations on the foreign collection would mean that there would be no internally-issued queries to log. I believe that such log lines should not be part of the long-term story for reporting on the execution of $lookup queries.

Generated at Thu Feb 08 04:13:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.