[SERVER-56339] Traceability Between mongos and mongod Cursors Created: 26/Apr/21  Updated: 14/Feb/23

Status: Backlog
Project: Core Server
Component/s: Logging
Affects Version/s: 4.2.13, 4.4.5, 4.0.24
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Diego Rodriguez (Inactive) Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 1
Labels: query, sharding
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Query Execution
Participants:

 Description   

Hi Team,

Currently, when issuing queries against a Sharded Cluster (v4.0.x, v4.2.x, and v4.4.x), several cursor ids participate in the operation:

  1. The cursor id opened against the mongos.
  2. The cursor ids that the mongos opens against each of the corresponding mongods that must participate in the query.

When looking at the MongoDB logs for a specific query, the operation that's started in the mongos can be tied to the mongods using the lsid. However, one lsid can be associated with more than one cursor id during its lifespan, so the lsid alone does not show which cursors belong to the same operation. For troubleshooting purposes, it would be useful to also be able to tie cursor ids (1) and (2) together for a given operation.
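To illustrate the ambiguity, here is a minimal Python sketch. It assumes slow-query log entries have already been parsed into dictionaries; the record layout, the field names (`lsid`, `cursorid`, `shard`, `ns`), and all values are hypothetical simplifications, not the actual log schema.

```python
from collections import defaultdict

# Hypothetical, simplified records; real log lines carry many more fields.
mongos_entries = [
    {"lsid": "session-A", "cursorid": 111, "ns": "test.orders"},
    {"lsid": "session-A", "cursorid": 222, "ns": "test.users"},
]
mongod_entries = [
    {"lsid": "session-A", "cursorid": 9001, "shard": "shard0"},
    {"lsid": "session-A", "cursorid": 9002, "shard": "shard1"},
]

def group_by_lsid(entries):
    """Collect the cursor ids seen for each logical session id."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["lsid"]].append(entry["cursorid"])
    return dict(groups)

mongos_by_lsid = group_by_lsid(mongos_entries)
mongod_by_lsid = group_by_lsid(mongod_entries)

# Both sides share the lsid, but nothing says whether mongod cursor 9001
# was opened on behalf of mongos cursor 111 or 222.
print(mongos_by_lsid)
print(mongod_by_lsid)
```

In this sketch the lsid narrows the search to one session, but the mapping between the two mongos cursors and the two mongod cursors remains undetermined.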

Since logging the downstream cursor ids that are opened against each Shard in the mongos logs can potentially result in large log entries for big Sharded Clusters, I believe the mongods participating in the query could log the cursor id from the mongos as part of the $client details, for example:

 $client: {
     application: { name: "MongoDB Shell" },
     driver: { name: "MongoDB Internal Client", version: "4.2.13" },
     os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "20.04" },
     mongos: { host: "hostname:27017", client: "127.0.0.1:43120", version: "4.2.13", cursorid: 4622046380930068847 }
 }
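If the mongod entries carried such a field, correlating them with a mongos cursor would reduce to a lookup. The Python sketch below is only an illustration of that idea: the `cursorid` field under `mongos` is the proposal from this ticket, not an existing log field, and the dictionary layout, the `shard` key, and the helper name `index_by_mongos_cursor` are hypothetical.

```python
from collections import defaultdict

def index_by_mongos_cursor(mongod_entries):
    """Group parsed mongod log entries by the proposed $client.mongos.cursorid."""
    index = defaultdict(list)
    for entry in mongod_entries:
        mongos_info = entry.get("client", {}).get("mongos", {})
        cursor_id = mongos_info.get("cursorid")  # proposed field, may be absent
        if cursor_id is not None:
            index[cursor_id].append(entry)
    return dict(index)

# Hypothetical parsed mongod entries; the values mirror the example above.
mongod_entries = [
    {"shard": "shard0", "cursorid": 9001,
     "client": {"mongos": {"host": "hostname:27017", "cursorid": 4622046380930068847}}},
    {"shard": "shard1", "cursorid": 9002,
     "client": {"mongos": {"host": "hostname:27017", "cursorid": 4622046380930068847}}},
    # An entry without the proposed field is skipped by the index.
    {"shard": "shard0", "cursorid": 9003,
     "client": {"mongos": {"host": "hostname:27017"}}},
]

by_mongos_cursor = index_by_mongos_cursor(mongod_entries)
shards = [e["shard"] for e in by_mongos_cursor[4622046380930068847]]
print(shards)
```

Keeping the correlation key on the mongod side means each shard logs one extra integer, rather than the mongos logging one id per participating shard.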

This is based on the assumption that a "slow" operation logged in a mongod that's tied to a mongos request will also result in a "slow" operation logged in the corresponding mongos, so both components will have the required troubleshooting information. Please correct me if there is an edge case where this won't happen.

Would it be possible to incorporate this information in the "slow" query logs?

Regards
Diego



 Comments   
Comment by Kyle Suarez [ 29/Apr/21 ]

Adding this to the Query Execution backlog for consideration, and potentially opening an Epic to track this.

Generated at Thu Feb 08 05:39:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.