[CDRIVER-4238] C 1.19.0 report "Cache Reader No keys found for HMAC that is valid for time" Created: 09/Dec/21  Updated: 27/Dec/21  Resolved: 27/Dec/21

Status: Closed
Project: C Driver
Component/s: None
Affects Version/s: 1.19.0
Fix Version/s: None

Type: Bug Priority: Unknown
Reporter: Vinicius Grippa Assignee: Kevin Albertson
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mock_cluster_time.patch    
Issue Links:
Related
related to DRIVERS-1904 Handle invalid $clusterTime documents... Backlog

 Description   

Summary

C Driver 1.19.0 reports HMAC key issue

Environment

C 1.19.0 + MongoDB sharded cluster 4.0.24.

CentOS 7, MongoDB 4.0.24 (it also happens with the end-of-life 3.6)

Sharded cluster with multiple mongoS.

How to Reproduce

_I didn't find a concise way of reproducing it, but more info can be found here:
https://jira.mongodb.org/browse/GODRIVER-2127
https://jira.mongodb.org/browse/DRIVERS-1904
https://jira.mongodb.org/browse/PYTHON-1434_

Additional Background

This issue appears to affect all drivers, and each of them reacts to it differently; Python seems to be the most tolerant. Matt mentioned in DRIVERS-1904 that this can affect up to 100% of operations, so I would consider it a high priority.



 Comments   
Comment by Kevin Albertson [ 27/Dec/21 ]

Thank you for the additional information vgrippa@gmail.com.

I attempted to reproduce with similar conditions:

  • C driver 1.19.0.
  • MongoDB server version 4.0.24 enterprise.
  • Sharded cluster with two mongos routers.
  • Authentication enabled.
  • mongoc_client_pool_t created with two mongos routers in the connection string "mongodb://localhost:27017,localhost:27018".
  • Test ran "ping" command repeatedly in two threads.

Running the test for 10 minutes did not result in any error.

"One more thing, in the event of this issue happening, is there any workaround apart from restarting the connection pool (which basically means restarting the app)?"

Unfortunately, I do not see a good alternative workaround. I mocked the error by modifying the $clusterTime applied in _mongoc_topology_update_cluster_time with this patch: mock_cluster_time.patch. It mocks the $clusterTime on the first reply to a "ping" command. Errors continue until the driver receives a valid $clusterTime with a higher $clusterTime.clusterTime timestamp.
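
To illustrate why the errors persist, here is a minimal sketch of a "keep the highest cluster time" rule. It is not libmongoc's actual code: cluster_time_t, cluster_time_is_later, and cluster_time_advance are hypothetical names, and the sketch assumes the comparison uses only the $clusterTime.clusterTime timestamp. The signature is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the $clusterTime.clusterTime BSON timestamp. */
typedef struct {
   uint32_t timestamp; /* seconds since the Unix epoch */
   uint32_t increment; /* ordinal within that second */
} cluster_time_t;

/* Returns true if `a` is strictly later than `b`. */
bool
cluster_time_is_later (const cluster_time_t *a, const cluster_time_t *b)
{
   assert (a != NULL && b != NULL);
   if (a->timestamp != b->timestamp) {
      return a->timestamp > b->timestamp;
   }
   return a->increment > b->increment;
}

/* Assumed update rule: replace the stored value only when the value in the
 * reply is later. The signature is never inspected here, so a reply with an
 * unverifiable signature but a high timestamp is stored and then sent on
 * later commands until a later value arrives. Returns true if replaced. */
bool
cluster_time_advance (cluster_time_t *stored, const cluster_time_t *reply)
{
   if (!cluster_time_is_later (reply, stored)) {
      return false;
   }
   *stored = *reply;
   return true;
}
```

Under this rule, a stored value of (500, 0) taken from a bad reply is not displaced by a valid (101, 0); only a reply later than (500, 0) replaces it.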

I do not think this is a bug in driver behavior. DRIVERS-1904 proposes an improvement in driver handling of this error. I linked CDRIVER-4238. Please watch DRIVERS-1904 for updates.

I suggest filing a Bug ticket in the SERVER project to help investigate the root cause. Since this occurs on 4.0.24, that rules out SERVER-52955 and SERVER-47568.

Comment by Vinicius Grippa [ 13/Dec/21 ]

FYI, the issue only happens when multiple routers are specified in the connection string. When using a single router, there is no issue.

Comment by Vinicius Grippa [ 13/Dec/21 ]

One more thing, in the event of this issue happening, is there any workaround apart from restarting the connection pool (which basically means restarting the app)?

Comment by Vinicius Grippa [ 12/Dec/21 ]

Hi Kevin,

Thanks for checking. I saw the comment about transitionToAuth, but I'm not using it. There are no special settings in the cluster, only basic settings.

Comment by Kevin Albertson [ 12/Dec/21 ]

Thank you for the report vgrippa@gmail.com.

Is there anything unique about how mongod or mongos is started? There has been internal discussion in DRIVERS-1904 noting an instance of this occurring with the --transitionToAuth option set on mongod.

Generated at Wed Feb 07 21:20:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.