  1. Core Server
  2. SERVER-60466

Support drivers gossiping signed $clusterTimes to replica set --shardsvrs before addShard is run

    • Sharding NYC
    • Fully Compatible
    • v7.0, v6.0, v5.0
    • Sharding 2021-11-15, Sharding 2021-11-29, Sharding 2021-12-13, Sharding 2021-12-27, Sharding 2022-01-10, Sharding 2022-01-24, Sharding NYC 2023-05-29, Sharding NYC 2023-06-12

      The "Convert a Replica Set to a Sharded Cluster" procedure has users take ordinary replica set members out of rotation and start them up again with --shardsvr. The user's application remains directly connected to the replica set during this step. A driver will have previously received signed $clusterTimes from the ordinary replica set members and will therefore attempt to gossip them back to the members after they've been started up again with --shardsvr.
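
The gossip behavior described above can be illustrated with a minimal sketch. It is not any driver's actual implementation: `ClusterTimeTracker`, `observe`, and `decorate` are hypothetical names, timestamps are modeled as `(seconds, increment)` tuples, and the signature hash is a placeholder.

```python
from typing import Optional


class ClusterTimeTracker:
    """Hypothetical stand-in for the per-client state a driver keeps."""

    def __init__(self) -> None:
        self.latest: Optional[dict] = None

    def observe(self, reply: dict) -> None:
        # Remember the highest signed $clusterTime seen in any server reply.
        seen = reply.get("$clusterTime")
        if seen is None:
            return
        if self.latest is None or seen["clusterTime"] > self.latest["clusterTime"]:
            self.latest = seen

    def decorate(self, command: dict) -> dict:
        # Gossip the remembered $clusterTime back on the next command.
        outgoing = dict(command)
        if self.latest is not None:
            outgoing["$clusterTime"] = self.latest
        return outgoing


tracker = ClusterTimeTracker()

# Reply received while the member was still an ordinary replica set member.
tracker.observe({
    "ok": 1,
    "$clusterTime": {
        "clusterTime": (1633407217, 1),
        "signature": {"hash": b"\x00" * 20, "keyId": 7015346542735261700},  # placeholder hash
    },
})

# After the member restarts with --shardsvr, the same client still attaches it.
request = tracker.decorate({"ping": 1})
```

Because the tracker lives in the client process, the remembered $clusterTime survives a server restart, which is why the restarted member receives it.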

      However, the behavior since MongoDB 3.6 has been to only initialize the LogicalClockValidator after the addShard command is run for the replica set shard and the shardIdentity document is inserted into the shard. In particular, the LogicalClockValidator isn't initialized on startup for --shardsvrs which have yet to be added to the sharded cluster.

      The LogicalClockValidator being uninitialized on startup leads the client to receive a CannotVerifyAndSignLogicalTime error response for any command request which included a signed $clusterTime. (Restarting ALL of the application servers would clear the signed $clusterTimes known to the MongoClient but would be disruptive to the user's environment.) We should instead have the replica set --shardsvr use its existing admin.system.keys collection to validate and sign new $clusterTimes.

      {
        "ok" : 0,
        "errmsg" : "Cannot accept logicalTime: { ts: Timestamp(1633407217, 1) }. May not be a part of a sharded cluster",
        "code" : 210,
        "codeName" : "CannotVerifyAndSignLogicalTime"
      }
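
A rough model of the difference between the current and proposed startup behavior is sketched below. This is not server code: `ShardsvrNode`, `sign`, and the `validate_before_add_shard` switch are hypothetical, the key bytes are made up, and the HMAC-SHA1 message layout is a simplification rather than the server's real encoding. Only the error code and codeName are taken from the response above.

```python
import hashlib
import hmac
from typing import Optional


def sign(key: bytes, cluster_time: tuple) -> bytes:
    # Simplified signature: HMAC-SHA1 over "seconds:increment" (assumed layout).
    message = f"{cluster_time[0]}:{cluster_time[1]}".encode()
    return hmac.new(key, message, hashlib.sha1).digest()


class ShardsvrNode:
    """Hypothetical model of a replica set member started with --shardsvr."""

    def __init__(self, local_keys: dict, validate_before_add_shard: bool) -> None:
        # Current behavior: no validator until addShard inserts the shardIdentity.
        # Proposed behavior: use the keys already in admin.system.keys at startup.
        self.keys: Optional[dict] = dict(local_keys) if validate_before_add_shard else None

    def handle(self, command: dict) -> dict:
        gossiped = command.get("$clusterTime")
        if gossiped is None:
            return {"ok": 1}
        if self.keys is None:
            return {"ok": 0, "code": 210, "codeName": "CannotVerifyAndSignLogicalTime"}
        key = self.keys.get(gossiped["signature"]["keyId"])
        expected = sign(key, gossiped["clusterTime"]) if key is not None else None
        if expected is None or not hmac.compare_digest(expected, gossiped["signature"]["hash"]):
            return {"ok": 0, "errmsg": "signature rejected (simplified)"}
        return {"ok": 1}


LOCAL_KEYS = {7015346542735261700: b"replica-set-key"}  # made-up key material
signed_time = {
    "clusterTime": (1633407217, 1),
    "signature": {
        "hash": sign(LOCAL_KEYS[7015346542735261700], (1633407217, 1)),
        "keyId": 7015346542735261700,
    },
}
command = {"ping": 1, "$clusterTime": signed_time}

before = ShardsvrNode(LOCAL_KEYS, validate_before_add_shard=False).handle(command)
after = ShardsvrNode(LOCAL_KEYS, validate_before_add_shard=True).handle(command)
```

In this model the same gossiped command fails with code 210 when the validator is left uninitialized and succeeds once the node validates against its existing local keys.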

      Additionally, we should have the existing keys in the admin.system.keys collection remain available for validating $clusterTimes to avoid generating errors from the replica set shard immediately switching over to the keys in the admin.system.keys collection on the config server.

      {
        "operationTime" : Timestamp(1633407272, 1),
        "ok" : 0,
        "errmsg" : "Cache Reader No keys found for HMAC that is valid for time: { ts: Timestamp(1633407217, 1) } with id: 7015346542735261700",
        "code" : 211,
        "codeName" : "KeyNotFound",
        "$gleStats" : {
          "lastOpTime" : Timestamp(0, 0),
          "electionId" : ObjectId("000000000000000000000000")
        },
        "lastCommittedOpTime" : Timestamp(1633407272, 1),
        "$configServerState" : {
          "opTime" : {
            "ts" : Timestamp(1633407273, 5),
            "t" : NumberLong(2)
          }
        },
        "$clusterTime" : {
          "clusterTime" : Timestamp(1633407273, 5),
          "signature" : {
            "hash" : BinData(0,"U6uLmfGZRh/Cs29KzMOM3WMV6J8="),
            "keyId" : NumberLong("7015430586655309838")
          }
        }
      }
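
The KeyNotFound case above comes down to which key collections are consulted when a gossiped $clusterTime names an older keyId. The sketch below models only that lookup, with signature verification omitted. `find_key` and `check_key` are hypothetical helpers and the key bytes are made up; the two key IDs and the error code are the ones shown in the response above.

```python
from typing import Optional

OLD_REPLICA_SET_KEY_ID = 7015346542735261700  # "with id:" in the errmsg above
CONFIG_SERVER_KEY_ID = 7015430586655309838    # keyId in the reply's $clusterTime


def find_key(key_id: int, *key_sources: dict) -> Optional[bytes]:
    # Search each available key collection in order for the requested keyId.
    for source in key_sources:
        if key_id in source:
            return source[key_id]
    return None


def check_key(key_id: int, *key_sources: dict) -> dict:
    # Signature verification is omitted; only key availability is modeled.
    if find_key(key_id, *key_sources) is None:
        return {"ok": 0, "code": 211, "codeName": "KeyNotFound"}
    return {"ok": 1}


replica_set_keys = {OLD_REPLICA_SET_KEY_ID: b"old-key"}    # made-up key material
config_server_keys = {CONFIG_SERVER_KEY_ID: b"new-key"}    # made-up key material

# Shard switches over immediately and consults only the config server's keys.
switched_over = check_key(OLD_REPLICA_SET_KEY_ID, config_server_keys)

# Shard keeps its existing keys available alongside the config server's keys.
kept_both = check_key(OLD_REPLICA_SET_KEY_ID, config_server_keys, replica_set_keys)
```

Under these assumptions, a $clusterTime signed before addShard only validates if the replica set's original keys stay in the lookup path.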

            cheahuychou.mao@mongodb.com Cheahuychou Mao
            max.hirschhorn@mongodb.com Max Hirschhorn