[SERVER-31916] Initial request to a shardsvr mongod can return a clustertime signed with the null key Created: 10/Nov/17 Updated: 30/Oct/23 Resolved: 15/Dec/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.6.0-rc4 |
| Fix Version/s: | 3.7.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Mira Carey | Assignee: | Misha Tyulenev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | todo_in_code | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Sprint: | Sharding 2018-01-01, Sharding 2017-12-18 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||
| Description |
|
When interacting with a mongod in a sharded cluster, the first time a client connects directly to a mongod (instead of via mongos), it can receive a clusterTime signed with the null key. Ordinarily, this happens only when the client has the special privilege that authorizes it to advance the clock, but it can also happen the first time an unprivileged client communicates with the mongod, if that occurs before the keys have been synced. When that client later attempts to gossip the time, it can receive a
style error. This occurs only when the cluster itself has auth enabled (otherwise no validation is necessary). For current tests, that involves blacklisting:
and forcing jstests/libs/override_methods/validate_collections_on_shutdown.js to abort if it sees KeyNotFound. We should come up with a strategy to handle this and remove the blacklist. |
| Comments |
| Comment by Githook User [ 15/Dec/17 ] | |||||||
|
Author: {'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com', 'username': 'mikety'} Message: | |||||||
| Comment by Misha Tyulenev [ 13/Dec/17 ] | |||||||
|
behackett, the specific check would introduce a dependency on the $clusterTime format, and it may affect our ability to change this field in future releases. So I suggest not assuming a specific $clusterTime structure if there is a forward compatibility requirement. | |||||||
| Comment by Bernie Hackett [ 07/Dec/17 ] | |||||||
|
Could we make "$clusterTime.signature.keyId === 0" a valid check that drivers can do, or provide some other way for a driver to know that it shouldn't gossip a particular $clusterTime value? | |||||||
| Comment by Misha Tyulenev [ 07/Dec/17 ] | |||||||
|
It's a valid state for a mongod to be available but return a dummy signature. While dummy signatures are easy to recognize (as $clusterTime.signature.keyId === 0), I don't advise drivers to make any assumptions about the $clusterTime format. There is a way to make it more reliable by adding a refresh to the time-signing code on mongod, but this may cause slight performance degradation, so let me know how important this is. Scenarios where mongod is unable to respond due to validation errors would still be possible, but less likely. | |||||||
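
For illustration only, the dummy-signature check mentioned in this comment could be sketched as below. This is a minimal sketch, not driver or server code: the helper name `hasDummySignature` and the sample documents are hypothetical, and `keyId` is assumed to be a plain JavaScript number (real responses carry it as a 64-bit BSON integer). As noted in the thread, relying on this check ties the caller to the current $clusterTime format.

```javascript
// Hypothetical helper: reports whether a $clusterTime document carries the
// dummy (null-key) signature, i.e. signature.keyId is 0.
// Assumption: keyId is a plain number here; a real driver would receive a
// 64-bit BSON integer and would need to convert it before comparing.
function hasDummySignature(clusterTime) {
    if (!clusterTime || !clusterTime.signature) {
        // Nothing to inspect, so it is not known to be a dummy signature.
        return false;
    }
    return clusterTime.signature.keyId === 0;
}

// Illustrative sample documents (values are made up for this sketch).
const dummySigned = {clusterTime: 1, signature: {hash: "AAAA", keyId: 0}};
const keySigned = {clusterTime: 2, signature: {hash: "q83v", keyId: 12345}};

const dummyResult = hasDummySignature(dummySigned);  // true
const signedResult = hasDummySignature(keySigned);   // false
```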
| Comment by Bernie Hackett [ 07/Dec/17 ] | |||||||
|
jeff.yemin has a theory that this will cause problems for drivers in general.
Granted, this assumes you are connecting directly to a shard on purpose. | |||||||
| Comment by Misha Tyulenev [ 07/Dec/17 ] | |||||||
|
This is a self-fixing issue because the error returns the correct signature. This change patches the shell to wait for the valid signature in the ping response. | |||||||
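
The ticket does not show the actual shell patch. As a rough sketch of the idea of waiting for a valid signature in the ping response, one might write something like the following. All names (`waitForSignedClusterTime`, `makeStubPing`) are hypothetical, the server is replaced by a stub, and `keyId` is assumed to be a plain number.

```javascript
// Hypothetical sketch: call `runPing` repeatedly until the reply's
// $clusterTime is signed with a real key (keyId !== 0), or give up.
// `runPing` is any function that returns a ping-style reply object.
function waitForSignedClusterTime(runPing, maxAttempts) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const reply = runPing();
        const ct = reply.$clusterTime;
        if (ct && ct.signature && ct.signature.keyId !== 0) {
            return {attempts: attempt, clusterTime: ct};
        }
    }
    throw new Error("no validly signed $clusterTime after " + maxAttempts + " attempts");
}

// Stub standing in for a mongod that has not synced its keys yet: the first
// `dummyReplies` pings return keyId 0, later pings return a non-zero keyId.
function makeStubPing(dummyReplies) {
    let calls = 0;
    return function () {
        calls++;
        const keyId = calls <= dummyReplies ? 0 : 12345;
        return {ok: 1, $clusterTime: {clusterTime: calls, signature: {hash: "stub", keyId: keyId}}};
    };
}

// Two dummy-signed replies, then a validly signed one on the third ping.
const waited = waitForSignedClusterTime(makeStubPing(2), 5);
```

A real implementation would pause between attempts and use the shell's own command helpers; the loop here is synchronous only to keep the sketch self-contained.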
| Comment by Andy Schwerin [ 07/Dec/17 ] | |||||||
|
I believe the plan is to work around the behavior in the shell. Avoiding the race condition that causes it in the server is difficult, and the behavior should only affect newly started servers and new clusters. misha.tyulenev, can you confirm? | |||||||
| Comment by Misha Tyulenev [ 13/Nov/17 ] | |||||||
|
It does not seem to be a blocker because: | |||||||
| Comment by Ian Whalen (Inactive) [ 13/Nov/17 ] | |||||||
|
acm, since this is 3.6 Required, I'm assigning to Platforms to make sure it doesn't get lost. |