[SERVER-28435] Implement KeysCollectionCacheReader Created: 22/Mar/17  Updated: 06/Dec/17  Resolved: 19/Apr/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.5.7

Type: Task Priority: Major - P3
Reporter: Misha Tyulenev Assignee: Randolph Tan
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-28436 Implement KeysCollectionManager Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2017-04-17, Sharding 2017-05-08
Participants:

 Description   

The KeysCollectionReader is responsible for reading the keys in the admin.system.keys collection.
Its owned by the KeysCollectionManager and started and stopped upon transition to / from config primary to be replaced by the KeysCollectionUpdater.

1. implement KeysCollectionCache abstract class

class KeysCollectionCache {
public:
    /**
     * Refreshes cache and returns the latest known key
     */
    virtual StatusWith<KeysCollectionDocument> refresh(OperationContext* opCtx) = 0;
 
    // gets the keyDoc where min(set{keyDoc.expiresAt > forThisTime})
    virtual StatusWith<KeysCollectionDocument> getKey(const LogicalTime& forThisTime) = 0;
};

2. The KeysCollectionCacheReader must implement KeysCollectionCache API:

class KeysCollectionCacheReader : public KeysCollectionCache {
public:
    /**
     * Check if there are new documents expiresAt > latestKeyDoc.expiresAt.
     */
    StatusWith<KeysCollectionDocument> refresh(OperationContext* opCtx) override;
    StatusWith<KeysCollectionDocument> getKey(const LogicalTime& forThisTime) override;
 
private:
    std::map<LogicalTime, KeysCollectionDocument> _cache;  // expiresAt -> KeysDocument
};



 Comments   
Comment by Githook User [ 19/Apr/17 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-28435 Implement KeysCollectionCacheReader
Branch: master
https://github.com/mongodb/mongo/commit/29b24a4f3e2484fc44f9c6cda864505cce4a7782

Comment by Githook User [ 17/Apr/17 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-28435 Implement getNewKey for catalog client
Branch: master
https://github.com/mongodb/mongo/commit/637a1b6b6c0dc8f6f07e6e6aa50585b500c1350f

Comment by Githook User [ 11/Apr/17 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-28435 Skeleton code for KeysCollectionCacheReader
Branch: master
https://github.com/mongodb/mongo/commit/72e4f9167099b54732c3c537f0143e1044d7da4a

Comment by Randolph Tan [ 04/Apr/17 ]

misha.tyulenev This will come from the caller when it needs to sign the time "forThisTime"

Comment by Misha Tyulenev [ 30/Mar/17 ]

renctan Please clarify where the forThisTime argument comes from

Comment by Randolph Tan [ 30/Mar/17 ]

1. I think we need to refresh when _expiresAt < <currentTime> otherwise I can imagine a DOS attack that forces refreshes for non existent keys. So in this case it will be only refreshing when a key is expired

I think it's confusing that this method is even refrencing it's own clusterTime. For example, in the case when a node gets a new clusterTime with a new generation, this method will fail this condition. I propose instead that we have something like:

// for reading
StatusWith<Key> getKey(long long keyId, LogicalTime forThisTime) {
    if (keyId == _key.keyId) return _key;
 
    if (key < _key.keyId) return ErrorCodes::OldKey; // caller should decide what to do
    // needs to check with config
    auto status = _refresh(forThisTime);
    if (!status.isOK()) return status; // timeout, etc.
    
    if (key > _key.keyId) return ErrorCodes::KeyNotFound;
    invariant(key == _key.keyId);
    return _key;
}
 
// for signing
StatusWith<Key> getKey(LogicalTime forThisTime) {
    if (_key.expiresAt < forThisTime) {
        refreshUntilThisTime(forThisTime);
    }
    return _key;
}

#2 sounds good to me.

Comment by Misha Tyulenev [ 30/Mar/17 ]

Thanks for the feedback:
1. I think we need to refresh when _expiresAt < <currentTime> otherwise I can imagine a DOS attack that forces refreshes for non existent keys. So in this case it will be only refreshing when a key is expired. This way the "future" keys are available only when the "current" key is expired. But all the past keys are potentially ok, but the cache may later be added to the KeyCollectionManager.

2. Good point, I think that its ok using afterClusterTime with the inaccurate value (say Max uint64) so it can never be reached if there is a timeout built into the wait
In this case

  • if the request goes to the primary it will wait block the timeout expires (while checking the signature of a user command which will also block and return an error eventually)
  • if its secondary it will fire the forceNoopWrite command to the primary and wait until the time is reached - again if this is the time that is too far ahead it will just timeout.

So I think that the correct behavior is to query to the config server with afterClusterTime argument (which can directly be sent by users anyways so we need to have enough protection there from DOS attacks) but it must timeout.

Comment by Randolph Tan [ 30/Mar/17 ]

else if (keyId > _keyId && _expiresAt < <current logical time>) {

Why do we have "_expiresAt < <current logical time>" here?

I think this should not throw an error:

          else if (keyId > _keyId) {
             return InvalidSignature
          }

It is possible to hit this even with a legitimate signature if the mongos talked to a secondary config server that is behind. In other cases, mongos could use the clusterTime of the request to pass to the readAfterClusterTime when it talks to the config server, but in this case it's still checking the validity of the clusterTime so it can't just blindly use it. So this either has to wait until you get to the requested keyId (with timeout) or use temporarily use (as opposed to setting the global config opTime) the unvalidated clusterTime as readAfter argument to the query for reading the keys.

Comment by Misha Tyulenev [ 30/Mar/17 ]

renctan please review the proposal

Generated at Thu Feb 08 04:18:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.