[SERVER-39573] Add unique id to log label "TransactionCoordinator" Created: 13/Feb/19  Updated: 29/Oct/23  Resolved: 22/Aug/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.1.8
Fix Version/s: 4.2.1, 4.3.1

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Jack Mulrow
Resolution: Fixed Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2
Sprint: Sharding 2019-08-12, Sharding 2019-08-26
Participants:

 Description   

Currently, the label here is just a static string and if there are multiple threads running it is hard to read the logs because they would all have the same label in the logs.



 Comments   
Comment by Githook User [ 12/Sep/19 ]

Author:

{'username': 'jsmulrow', 'email': 'jack.mulrow@mongodb.com', 'name': 'Jack Mulrow'}

Message: SERVER-41376 Track time transactions on mongos are active and inactive and include in slow txn logging

(cherry picked from commit 4d59f45f85919ddaffa260fb76d1e7dbd8950edf)

SERVER-42907 Add timeActiveMicros and timeInactiveMicros for transactions in mongos currentOp output

(cherry picked from commit 387b570d0ddf0dc87c888d030651357152484191)

SERVER-39573 Prefix TransactionCoordinator logs with transaction id

(cherry picked from commit f92b912452b540fdcbb1b3b959391fb31e64d408)

SERVER-42963 For active sessions only set transaction stats in TransactionRouter::Observer::_reportState()

(cherry picked from commit 4b526b31c6560ec4c632c8062b057d4e346fb1d0)

SERVER-42963 Add uses_atclustertime tag to router_transaction_current_op.js

(cherry picked from commit 936595473bd8423b79e477a442d3093aa11d80c8)

SERVER-41374 Track current open, active, and inactive transactions on mongos in serverStatus

(cherry picked from commit 953e1692d51c6742f1bb1c61dcfece75338afeae)
Branch: v4.2
https://github.com/mongodb/mongo/commit/523d4a46f4cdc9cc928f85eed2c63bc898c8ca63

Comment by Jack Mulrow [ 22/Aug/19 ]

I filed SERVER-42971 to track fixing the ThreadClient behavior, so in my commit for this ticket I only added a transaction id to each transaction coordinator log and did not stop setting "TransactionCoordinator" in the AWS.

Comment by Githook User [ 22/Aug/19 ]

Author:

{'username': 'jsmulrow', 'email': 'jack.mulrow@mongodb.com', 'name': 'Jack Mulrow'}

Message: SERVER-39573 Prefix TransactionCoordinator logs with transaction id
Branch: master
https://github.com/mongodb/mongo/commit/f92b912452b540fdcbb1b3b959391fb31e64d408

Comment by Jack Mulrow [ 13/Aug/19 ]

renctan, yeah I can do that instead. I'll add a helper that formats a txnId and add it to each log message.

Also, esha.maharishi and I looked into why I was seeing the threads labeled by one transaction logging for different ones, and it turns out the ThreadClient RAII type used to the set the client name to "TransactionCoordinator" also sets the thread's name without restoring the original name (if there was one) in its destructor. The AWS uses a thread pool, so it was persistently changing the names of the those threads, and the coordinator switches threads after using the waitForMajority service, so it was grabbing a thread that had already been set to the name of a different transaction.

I'm going to file a separate ticket for the ThreadClient behavior, but since we're currently polluting the names of the threads in the fixed executor pool, in addition to updating the logging I'm going to stop setting the "TransactionCoordinator" name entirely.

Comment by Randolph Tan [ 13/Aug/19 ]

jack.mulrow, what if instead of renaming the thread name, we just prefix our log messages in coordinator/coordinator service with txn + lsid? Kind of similar to txnParticipant.

Comment by Jack Mulrow [ 13/Aug/19 ]

renctan, is it fine to just append a random number to the client name here (maybe an OID like we do for migrations)? I originally thought it made sense to use a label based on the txnId of the coordinator that owns the AWS, but it looks like AWS threads can do work for other transactions, so logging the txnId in the client name wouldn't be the same as the transaction being logged for, which I think is more confusing than it's worth.

cc matthew.saltz, esha.maharishi

Comment by Kaloian Manassiev [ 15/Feb/19 ]

This might be good candidate for the diagnostics improvements.

Generated at Thu Feb 08 04:52:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.