[SERVER-50286] Indicate if operation is reading from a view in the log file Created: 13/Aug/20 Updated: 29/Oct/23 Resolved: 12/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Chris Harris | Assignee: | Irina Yatsenko (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | neweng, read-only-views | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Sprint: | Query Execution 2021-04-19, Query Execution 2021-05-03, Query Execution 2021-05-17 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
Currently there is no indication in the log file for slow query entries that the target of the operation is a view. This can be very confusing since, for example, the operation could report using an index for a field that is seemingly not used or have a sort when none is requested directly. We should indicate when the target entity is a view. Such an indication could range from a simple boolean (eg isView:1) to something more complex like reporting the final generated pipeline (as is done when explain is used on such queries). |
| Comments |
| Comment by Githook User [ 12/May/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Irina Yatsenko', 'email': 'irina.yatsenko@mongodb.com', 'username': 'IrinaYatsenko'}Message: | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Irina Yatsenko (Inactive) [ 10/May/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
After the last round of CR feedback, the logging has been changed to include the resolved pipeline and would look similar to this:
| ||||||||||||||||||||||||||||||||||||||||||
| Comment by Irina Yatsenko (Inactive) [ 07/May/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
After the CR, settled on the following format for the new property:
Note: in sharded environment, the situation is more complicated. The view resolution is done on the shard that contains the collection, and it might get logged if the operation that encompasses the resolution is slower than slowms. It will have errName set to "CommandOnShardedViewNotSupportedOnMongod". On mongos side, "resolvedViews" won't be set in the slow query log entry but it will contain a separate record in the log for "CommandOnShardedViewNotSupportedOnMongod" with the views base collection and the resolved pipeline. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Chris Harris [ 29/Apr/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
The proposed syntax LGTM. Regarding duplicating entries, it is not clear to me if that listing them multiple times would provide any particular value. Perhaps there is an argument to be had that it would further enhance the capabilities for related tooling that attempted to consume these log entries, but I am having a difficult time constructing an example of what that additional functionality would be. So I am in favor of de-duplicating the entries, especially if there is the possibility that not doing so could result in large log lines in some situations along with truncation of other parts of the entry. From a support perspective, the knowledge that the operation is on a view will probably result in us gathering explain output as the next step in almost all situations. That additional output will provide us with a more complete view (no pun intended) of the overall aggregation that was executed as well as the collection(s) involved. There is limited analysis and recommendations that we can make based only on the additional knowledge of what collections/views are involved from the log entry. The information that that potentially could be captured in the log file which would be more actionable is the actual aggregation pipeline that is executed. However that is (far) outside the scope of this ticket and not a change that we are requesting at this time. We could potentially create another ticket to consider that in the future, but that's not specifically necessary either. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Irina Yatsenko (Inactive) [ 29/Apr/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
Realized, that we can add to the log all resolved views. The new property in the logs will look similar to the one below, which was generated for a query against invpercustomer view with a lookup on types view:
For the cases when the same view is used multiple times (directly or inside a dependency chain), is it desirable to log it multiple times or to remove the duplication? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Kyle Suarez [ 28/Apr/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
irina.yatsenko, this sounds fine to me, with a slight preference to maybe change the formatting to look something like
Ultimately though, I'll defer to christopher.harris as the final judge on what would be most valuable to the support team. | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Irina Yatsenko (Inactive) [ 28/Apr/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
christopher.harris, kyle.suarez, what do you think about adding the highlighted field to the attributes of a "slow query" log entry?
The above would be logged for db.toysbytypens.find({}) query against toysbytypens view that is based on types view that is based on catalog collection. Limitations:
Would this be useful/sufficient to improve the experience? | ||||||||||||||||||||||||||||||||||||||||||
| Comment by Kyle Suarez [ 31/Mar/21 ] | ||||||||||||||||||||||||||||||||||||||||||
|
In the case of a view on a view on a view, etc., should we log the entire chain of views? Or only name the "final" resolved collection namespace? If we go with the entire chain, it sounds like the most natural place to put this is right after the view resolution code that traverses the views DAG. |