[SERVER-19982] Log diagnostic message when creating "no timeout" cursors Created: 17/Aug/15  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Kevin Pulo Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-3090 Add the ability to list open cursors Closed
Assigned Teams:
Query Execution
Participants:

 Description   

As a follow-on from SERVER-19981, it would be very useful when trying to track down an unknown source of noTimeout cursors if queries that produced them were logged at the default level.

That is, in addition to the millis > slowms logic at logLevel 0, queries that have QueryOption_NoCursorTimeout set and produce a cursor (ie. not exhausted by the first batch and ntoreturn >= 0), should also be logged by default.



 Comments   
Comment by J Rassi [ 19/Jan/16 ]

Won't setting logLevel 1 for the query component also cause all queries to be logged (not just those over slowms)? Busy systems can't do this for extended periods. If there is a way to log query-related events without also logging every query, then this would be acceptable.

Related: note that SERVER-21776 changes the component of the per-operation diagnostic log message from "QUERY" to "COMMAND". Log level 1 for the "QUERY" component is now much less noisy than it used to be.

Comment by Kevin Pulo [ 25/Aug/15 ]

Won't setting logLevel 1 for the query component also cause all queries to be logged (not just those over slowms)? Busy systems can't do this for extended periods. If there is a way to log query-related events without also logging every query, then this would be acceptable.

Another option might be to add another logging component "cursor", which (at higher levels) logs cursor creation (including options), exhaustion, kill, and timeout, and anything else of relevance to cursors or the cursor manager.

Either way, the associated source information (at least connection, namespace and query) has to be 100% clear from the cursor-related log message (even in situations where the associated query wasn't logged).

Comment by J Rassi [ 24/Aug/15 ]

One is where the problematic cursors only appear occasionally/infrequently, and it's not feasible to leave log level 1 on for extended periods. The other is where noTimeout cursors are being used often, but are being correctly exhausted/killed. In this case, these messages are needless and will only clutter the level 1 logs.

The introduction of the "log component" concept was intended to make it feasible for an individual component to be set to a higher level than normal for extended periods. I can imagine adding a diagnostic message in the cursor manager (query component, log level 1 or 2?) that logs creation and destruction of all cursors; the associated diagnosis workflow would be "run db.setLogLevel(1, 'query'), leave it running for a day, extract from the log a list of all no-timeout cursors created, verify which ones were never destroyed."

Since the actual problem is noTimeout cursors that aren't exhausted/killed, might it instead be possible to log when a noTimeout cursor would have timed out?

This may be feasible, though I'd prefer the above solution.

Comment by Kevin Pulo [ 24/Aug/15 ]

Log level 1 would be okay for egregious cases, where problematic cursors are created frequently. But there are two counter cases. One is where the problematic cursors only appear occasionally/infrequently, and it's not feasible to leave log level 1 on for extended periods. The other is where noTimeout cursors are being used often, but are being correctly exhausted/killed. In this case, these messages are needless and will only clutter the level 1 logs.

Since the actual problem is noTimeout cursors that aren't exhausted/killed, might it instead be possible to log when a noTimeout cursor would have timed out? I guess this depends on whether the cursor manager somehow "skips over" noTimeout cursors when timing them out, or whether noTimeout cursors simply aren't even considered for expiry/timeout.

Comment by J Rassi [ 24/Aug/15 ]

I agree with Kevin that there should be a log message for this event, but I don't agree that log level 0 is appropriate.

Comment by Andy Schwerin [ 24/Aug/15 ]

Sounds like a lot of log messages, to me. Either you'll get zero or a ton, and a ton seems bad.

Comment by J Rassi [ 21/Aug/15 ]

I suggest having the cursor manager log a diagnostic message when a "no timeout" cursor is created.

Generated at Thu Feb 08 03:52:46 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.