[SERVER-56315] "[ftdc] serverStatus was very slow", causing the mongod daemon to stop abruptly and become stale Created: 23/Apr/21  Updated: 22/Jun/22  Resolved: 16/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.0.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bharath Kumar CM Assignee: Dmitry Agranat
Resolution: Incomplete Votes: 0
Labels: performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongodb-log.txt    
Operating System: ALL
Steps To Reproduce:

Since FTDC data is not human readable, how do we interpret this issue, and what is the solution? I have many servers where I see this issue, and it is causing replication problems between nodes.

Participants:

 Description   

Since FTDC data is not human readable, how do we interpret this issue, and what is the solution? I have many servers where I see this issue, and it is causing replication problems between nodes.

 

Error log:

2021-04-21T04:56:39.729+0000 I COMMAND [ftdc] serverStatus was very slow: { after basic: 2165, after asserts: 3087, after backgroundFlushing: 8785, after connections: 11612, after dur: 12922, after extra_info: 19620, after globalLock: 27972, after locks: 41890, after logicalSessionRecordCache: 51379, after network: 55947, after opLatencies: 62013, after opcounters: 67473, after opcountersRepl: 70586, after repl: 74168, after security: 78481, after storageEngine: 78629, after tcmalloc: 78629, after transactions: 78629, after transportSecurity: 78629, after wiredTiger: 78813, at end: 79004 }
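(For context: the timings in this message are cumulative milliseconds recorded after each serverStatus section, so "at end: 79004" means the whole command took roughly 79 seconds, and the largest jumps, e.g. between "after globalLock" and "after locks", point at the sections that stalled. A minimal sketch of reproducing the measurement by hand from the legacy mongo shell; host and port are placeholders to adjust to your deployment:)

    # Time a serverStatus call by hand and print the elapsed milliseconds
    mongo --host localhost --port 27017 --eval '
        var t0 = new Date();
        db.adminCommand({ serverStatus: 1 });
        print("serverStatus took " + (new Date() - t0) + " ms");
    '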



 Comments   
Comment by Dmitry Agranat [ 16/May/21 ]

Hi,

We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Dima

Comment by Dmitry Agranat [ 29/Apr/21 ]

bharath_achar@outlook.com, you can set the mongod logs aside for now, and we'll get to redacting them later if needed. As mentioned earlier, we can try investigating the issue based on the diagnostic.data (the contents are described here), without the mongod logs. Please let us know when these are uploaded from all members of the replica set.
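(For illustration, a minimal redaction sketch for a log copy before sharing; the hostname pattern is purely hypothetical and would need to match your company's naming scheme:)

    # Hypothetical example: mask company hostnames in a copy of the log before sharing
    sed -E 's/[a-z0-9.-]+\.example\.com/REDACTED_HOST/g' mongod.log > mongod.redacted.log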

Please mention the exact timestamp and timezone of the event you'd like us to focus on.

Dima

Comment by Bharath Kumar CM [ 26/Apr/21 ]

Can you please share the command for exporting an FTDC file using bsondump and jq?

Once I have the exported data, I will redact the company-specific information and share the log with you, so that we are both in a safe zone to work further on this issue.
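(For reference, a minimal sketch of dumping an FTDC file to JSON with bsondump and pretty-printing it with jq; the metrics filename below is a placeholder, and note that the metric samples inside are zlib-compressed binary chunks, so bsondump alone does not yield human-readable time series:)

    # Dump one FTDC metrics file to JSON, then pretty-print it with jq.
    # The filename is a placeholder -- use a real file from $dbpath/diagnostic.data.
    bsondump diagnostic.data/metrics.2021-04-21T04-56-39Z-00000 > metrics.json
    jq . metrics.json | less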

Comment by Dmitry Agranat [ 26/Apr/21 ]

Hi bharath_achar@outlook.com, we can try investigating based on the diagnostic.data (the contents are described here), w/o the mongod logs. Please let us know when these are uploaded from all members of the replica set.

Please mention the exact timestamp and timezone of the event when the mongod process becomes stale.

Comment by Bharath Kumar CM [ 26/Apr/21 ]

@Dmitry Agranat

I'm afraid I cannot share the logs, as per company policy. I appreciate that you take measures to secure the data and delete it once done, but from my end I cannot share it.

But if you could share the steps for isolating this issue, that would be of great help.

 

Regards,

Bharath Achar

Comment by Dmitry Agranat [ 26/Apr/21 ]

Hi bharath_achar@outlook.com,

For each member of the replica set, please archive (tar or zip) the mongod.log files covering the incident and the $dbpath/diagnostic.data directory (the contents are described here) and upload them to this support uploader location.
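(A minimal sketch of such an archive, assuming the default Linux log and dbpath locations; adjust both to your deployment:)

    # Archive the logs and FTDC data on one member; repeat on each node.
    tar -czf mongodb-diagnostics-$(hostname).tar.gz \
        /var/log/mongodb/mongod.log* \
        /var/lib/mongodb/diagnostic.data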

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Please mention the exact timestamp and timezone of the event you'd like us to focus on.

Dima

Comment by Bharath Kumar CM [ 24/Apr/21 ]

Hi @Edwin Zhou 

Are you aware of this issue? How can we troubleshoot it further? Does enabling verbosity level 5 logs give more information?

Is it truly a memory issue? What is the next step to avoid it?
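(For what it's worth, a minimal sketch of raising and restoring log verbosity from the mongo shell; level 5 is extremely noisy and should only be left on briefly while capturing the event:)

    # Raise the default log verbosity to 5, then restore it afterwards.
    mongo --eval 'db.setLogLevel(5)'
    # ... reproduce the issue ...
    mongo --eval 'db.setLogLevel(0)'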
