[SERVER-56594] RSM not processing response Created: 04/May/21  Updated: 20/May/21  Resolved: 19/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Konstantin Krasnov Assignee: Edwin Zhou
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File rs_status.json    
Participants:

 Description   

Hi,

After updating to 4.4.5, we began to receive the following message:

 

{"t":{"$date":"2021-05-04T10:49:50.243+03:00"},"s":"I", "c":"-", "id":4495400, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM not processing response","attr":{"error":{"code":0,"codeName":"OK"},"replicaSet":"xxxx"}}
 

Is this normal behavior?
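
For reference, the log line above is a single JSON document, so its fields can be inspected programmatically. The following is a minimal sketch, not part of the original report: the helper name `summarize` is hypothetical, and the only input is the line quoted above. It shows that the entry has severity `I` (informational) and that the embedded error is code 0 with codeName `OK`.

```python
import json

# Log line copied from the ticket description (replica set name redacted as "xxxx").
LOG_LINE = (
    '{"t":{"$date":"2021-05-04T10:49:50.243+03:00"},"s":"I", "c":"-", '
    '"id":4495400, "ctx":"ReplicaSetMonitor-TaskExecutor",'
    '"msg":"RSM not processing response",'
    '"attr":{"error":{"code":0,"codeName":"OK"},"replicaSet":"xxxx"}}'
)


def summarize(line: str) -> dict:
    """Pull the fields useful for triage out of one structured log line.

    Hypothetical helper for illustration only.
    """
    entry = json.loads(line)
    attr = entry.get("attr", {})
    error = attr.get("error", {})
    return {
        "timestamp": entry["t"]["$date"],
        "severity": entry["s"],
        "id": entry["id"],
        "context": entry["ctx"],
        "message": entry["msg"],
        "error_code": error.get("code"),
        "error_name": error.get("codeName"),
        "replica_set": attr.get("replicaSet"),
    }


summary = summarize(LOG_LINE)
print(summary)
```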

 



 Comments   
Comment by Konstantin Krasnov [ 20/May/21 ]

Thanks Edwin!

Comment by Edwin Zhou [ 19/May/21 ]

Thanks kkrasnov@gmail.com, I'll close this out as incomplete as we are unable to obtain diagnostics covering the beginning of this occurrence.

Comment by Konstantin Krasnov [ 19/May/21 ]

Hi Edwin,

We won't be able to look into it anymore. The logs for 05/01/2021 have already been removed.

If a similar situation occurs, I will save the logs.

 

Best regards,

Konstantin

Comment by Edwin Zhou [ 18/May/21 ]

Hi kkrasnov@gmail.com,

Glad to hear that restarting the affected nodes stops those error messages. I'm curious as to why the RSM was shut down in the first place. Do you still have logs from 05/01/2021?

Thanks,
Edwin

Comment by Konstantin Krasnov [ 18/May/21 ]

Hi Edwin,

I restarted one affected node. After restarting this node, the error disappeared on both affected nodes. We no longer see these messages on any cluster node.

I uploaded the log file.

 

Thank you!

 

Best regards,

Konstantin

Comment by Edwin Zhou [ 17/May/21 ]

Hi kkrasnov@gmail.com,

Were you able to restart the affected nodes? Can you also upload the log files covering the time when these log lines first started appearing? I suspect they begin shortly after the mongod processes were started following the upgrade.

Best,
Edwin

Comment by Konstantin Krasnov [ 13/May/21 ]

Hi Edwin,

This issue is affecting 2 nodes.

We will restart these nodes in a few days. I will inform you of the results.

 

Best regards,

Konstantin

Comment by Edwin Zhou [ 13/May/21 ]

Hi kkrasnov@gmail.com,

Thank you for uploading the files.

My investigation leads me to believe that at some point the RSM was shut down for an unknown reason.

  • Is this issue affecting multiple nodes, or only the one you provided diagnostics for?
  • Can you try restarting the affected node and let us know if this log message still shows up?
  • To help us further investigate this issue, can you provide the log files for the affected node covering the time when this issue first started occurring?

Best,
Edwin
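
One way to locate when the message first started occurring is to scan the structured log for the first entry with log id 4495400 (the id shown in the description). The following is a minimal sketch under stated assumptions: the function name `first_occurrence` is hypothetical, the log is assumed to be in the one-JSON-document-per-line format seen in the description, and the first sample entry is invented purely for illustration.

```python
import json
from typing import Iterable, Optional

RSM_LOG_ID = 4495400  # id of the "RSM not processing response" entry in this ticket


def first_occurrence(lines: Iterable[str], log_id: int = RSM_LOG_ID) -> Optional[str]:
    """Return the timestamp of the first entry with the given id, or None.

    Hypothetical helper; assumes one JSON document per log line.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a structured log entry
        if isinstance(entry, dict) and entry.get("id") == log_id:
            return entry.get("t", {}).get("$date")
    return None


# Two sample lines: the first is a made-up unrelated entry, the second
# mirrors the entry quoted in the description.
sample = [
    '{"t":{"$date":"2021-05-04T10:49:40.000+03:00"},"s":"I","c":"-","id":1,'
    '"ctx":"main","msg":"hypothetical unrelated entry"}',
    '{"t":{"$date":"2021-05-04T10:49:50.243+03:00"},"s":"I","c":"-","id":4495400,'
    '"ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM not processing response"}',
]
print(first_occurrence(sample))
```

Because the function accepts any iterable of lines, an open file handle for a mongod.log file can be passed in directly instead of the sample list.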

Comment by Konstantin Krasnov [ 13/May/21 ]

Hi Edwin,

I have uploaded files.

 

Best regards,

Konstantin

Comment by Edwin Zhou [ 11/May/21 ]

Hi kkrasnov@gmail.com,

I've created a secure upload portal for you. Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Best,
Edwin

Comment by Konstantin Krasnov [ 11/May/21 ]

Will this information be available to everyone?

Comment by Edwin Zhou [ 11/May/21 ]

Hi kkrasnov@gmail.com,

Thanks for providing that information.

Would you please archive (tar or zip) the $dbpath/diagnostic.data directory (the contents are described here) and the mongod.log files that cover these log lines, and attach them to this ticket?

Best,
Edwin
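
The archiving step requested above could be scripted roughly as follows. This is only a sketch: the function name `archive_diagnostics`, the output file name, and all paths are assumptions for illustration, and plain `tar` or `zip` on the command line works equally well.

```python
import tarfile
from pathlib import Path
from typing import Iterable


def archive_diagnostics(dbpath: str, log_files: Iterable[str], out_path: str) -> str:
    """Bundle <dbpath>/diagnostic.data and the given log files into one .tar.gz.

    Hypothetical helper; missing inputs are skipped rather than raising.
    """
    diag_dir = Path(dbpath) / "diagnostic.data"
    with tarfile.open(out_path, "w:gz") as tar:
        if diag_dir.is_dir():
            # Store the directory under a short, stable name inside the archive.
            tar.add(str(diag_dir), arcname="diagnostic.data")
        for log in log_files:
            log_path = Path(log)
            if log_path.is_file():
                tar.add(str(log_path), arcname=log_path.name)
    return out_path
```

A call might look like `archive_diagnostics("/path/to/dbpath", ["/path/to/mongod.log"], "diagnostics.tar.gz")`, where both paths are placeholders for the actual locations on the affected node.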

Comment by Konstantin Krasnov [ 11/May/21 ]

Hi Edwin,

  1. 4.4.2 -> 4.4.5
  2. 5 nodes
  3. rs_status.json

Comment by Edwin Zhou [ 10/May/21 ]

Hi kkrasnov@gmail.com,

Thanks for your ticket submission. To further this investigation, can you provide the following information:

  1. What version were you upgrading from?
  2. What is the topology of your cluster?
  3. The output of rs.status()

Best,
Edwin
