Investigate why the explain command for encrypted dbs fails to propagate the clusterTime to shard correctly


    • Server Security
    • Fully Compatible
    • ALL
    • CAR Team 2026-03-02, Server Security 2026-03-13
    • 🟩 Routing and Topology

       

      Thanks to SPM-4330 it has been observed that an explain command dispatched through an encrypted DB fails to propagate the latest known $clusterTime to the shard, in some cases gossiping a stale one.

      If the shard does not yet know about the latest clusterTime, an explain run with "afterClusterTime" fails with:

      assert: command failed: {
          "ok" : 0,
          "errmsg" : "Explain command on shard config failed :: caused by :: readConcern afterClusterTime value must not be greater than the current clusterTime. Requested clusterTime: { ts: Timestamp(1770646938, 94) }; current clusterTime: { ts: Timestamp(1770646938, 93) }",
          "code" : 72,
          "codeName" : "InvalidOptions",
          "$clusterTime" : {
              "clusterTime" : Timestamp(1770646938, 94),
              "signature" : {
                  "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                  "keyId" : NumberLong(0)
              }
          },
          "operationTime" : Timestamp(1770646938, 94)
      } with original command request: {
          "explain" : {
              "count" : "explain_count",
              "query" : {
                  "_id" : 0
              },
              "readConcern" : {
                  "afterClusterTime" : Timestamp(1770646938, 94)
              }
          },
          "verbosity" : "queryPlanner",
          "lsid" : {
              "id" : UUID("2b6d3aad-025e-4c9a-84f1-a884cc80ad77")
          },
          "$clusterTime" : {
              "clusterTime" : Timestamp(1770646938, 94),
              "signature" : {
                  "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                  "keyId" : NumberLong(0)
              }
          },
          "$traceCtx" : {
              "traceparent" : "00-8aa92f584a28b96965c0d10826019a2d-4c5b7c2d077a42ec-01"
          }
      }

      The error is confirmed to be thrown by a shard:

      [j0:s1:prim] {"t":{"$date":"2026-02-09T14:22:18.716+00:00"},"s":"I",  "c":"COMMAND",  "id":21962,   "svc":"S", "ctx":"conn64","msg":"Assertion while executing command","attr":{"command":"explain","db":"count_explain","commandArgs":{"explain":{"count":"explain_count","query":{"_id":{"$eq":0}},"encryptionInformation":{"type":1,"schema":{"count_explain.explain_count":{"escCollection":"enxcol_.explain_count.esc","ecocCollection":"enxcol_.explain_count.ecoc","fields":[{"keyId":{"$uuid":"60d60da8-536c-4388-9bcc-cd041635e650"},"path":"ssn","bsonType":"string","queries":{"queryType":"equality","contention":8}}]}},"crudProcessed":true}},"verbosity":"allPlansExecution","readConcern":{"afterClusterTime":{"$timestamp":{"t":1770646938,"i":94}}},"shardVersion":{"e":{"$oid":"6989ed9a7340b93f33392e49"},"t":{"$timestamp":{"t":1770646938,"i":45}},"v":{"$timestamp":{"t":1,"i":1}},"nss":"count_explain.explain_count"},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1770646938,"i":93}}},"$configTime":{"$timestamp":{"t":1770646938,"i":92}},"$topologyTime":{"$timestamp":{"t":1770646933,"i":52}},"$client":{"application":{"name":"MongoDB Shell"},"driver":{"name":"MongoDB Internal Client","version":"8.3.0-alpha3"},"os":{"type":"Linux","name":"Ubuntu","architecture":"aarch64","version":"22.04"},"mongos":{"host":"5b4b6a539f61:20005","client":"127.0.0.1:36088","version":"8.3.0-alpha3"}},"mayBypassWriteBlocking":false,"$db":"count_explain"},"error":"InvalidOptions: readConcern afterClusterTime value must not be greater than the current clusterTime. Requested clusterTime: { ts: Timestamp(1770646938, 94) }; current clusterTime: { ts: Timestamp(1770646938, 93) }"}} 

      Via an injected log it is possible to see that the command received by the shard carries a stale $clusterTime:

      [j0:s1:prim] {"t":{"$date":"2026-02-09T14:22:18.716+00:00"},"s":"I",  "c":"-",        "id":0,       "svc":"S", "ctx":"conn64","msg":"readRequestMetadata { readConcern: { afterClusterTime: Timestamp(1770646938, 94) }, $clusterTime: { clusterTime: Timestamp(1770646938, 93) }, $client: { application: { name: \"MongoDB Shell\" }, driver: { name: \"MongoDB Internal Client\", version: \"8.3.0-alpha3\" }, os: { type: \"Linux\", name: \"Ubuntu\", architecture: \"aarch64\", version: \"22.04\" }, mongos: { host: \"5b4b6a539f61:20005\", client: \"127.0.0.1:36088\", version: \"8.3.0-alpha3\" } }, shardVersion: { e: ObjectId('6989ed9a7340b93f33392e49'), t: Timestamp(1770646938, 45), v: Timestamp(1, 1), nss: \"count_explain.explain_count\" }, $configTime: Timestamp(1770646938, 92), $topologyTime: Timestamp(1770646933, 52), mayBypassWriteBlocking: false } cmdRequiresAuth true"} 

       

      1. The shell sends a command with $clusterTime Timestamp(1770646938, 94).

      2. mongos forwards the command with $clusterTime Timestamp(1770646938, 93) but afterClusterTime Timestamp(1770646938, 94).

      3. The shard, whose clusterTime is still Timestamp(1770646938, 93), cannot serve the operation.
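      The expected behaviour is that mongos folds the $clusterTime gossiped by the shell into its own logical clock before dispatching, so the command forwarded to the shard never carries a $clusterTime older than the afterClusterTime it also carries. A toy model of the three steps above (names and structure are illustrative; the server implements this through its vector clock, not these types):

      ```python
      # Toy model of clusterTime gossip between shell, mongos, and shard.
      # cluster_time is a (t, i) tuple; real nodes use richer timestamp types.

      class Node:
          def __init__(self, name, cluster_time):
              self.name = name
              self.cluster_time = cluster_time

          def advance(self, gossiped_time):
              # Gossip rule: a node's clusterTime only ever moves forward.
              self.cluster_time = max(self.cluster_time, gossiped_time)

      def dispatch(mongos, shard, client_cluster_time, after_cluster_time, advance=True):
          if advance:
              # Correct path: fold the client's $clusterTime into mongos' clock
              # before building the command sent to the shard.
              mongos.advance(client_cluster_time)
          # The $clusterTime mongos gossips to the shard is its own clock value.
          shard.advance(mongos.cluster_time)
          if after_cluster_time > shard.cluster_time:
              return "InvalidOptions"
          return "ok"

      # Buggy path (as observed): mongos skips the advance, gossips a stale time.
      mongos = Node("mongos", (1770646938, 93))
      shard = Node("shard", (1770646938, 93))
      print(dispatch(mongos, shard, (1770646938, 94), (1770646938, 94), advance=False))

      # Correct path: the advance lets the shard catch up and the explain succeeds.
      mongos = Node("mongos", (1770646938, 93))
      shard = Node("shard", (1770646938, 93))
      print(dispatch(mongos, shard, (1770646938, 94), (1770646938, 94), advance=True))
      ```

      In this model the buggy path prints "InvalidOptions" and the correct path prints "ok", matching the failure mode observed with the encrypted explain.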

      To reproduce, remove "pinToSingleMongos = true" from explain_count.js and run:

      python buildscripts/resmoke.py run --storageEngine=wiredTiger --storageEngineCacheSizeGB=0.5 --mongodSetParameters={logComponentVerbosity: {verbosity: 0}} --mongosSetParameters={logComponentVerbosity: {verbosity: 0}} --jobs=1 --installDir bazel-bin/install-dist-test/bin --suite=sharded_collections_jscore_passthrough_with_config_transitions_and_add_remove_shard src/mongo/db/modules/enterprise/jstests/fle2/query/explain_count.js --repeatTests 3 

      This reproduces the failure quite consistently.

       

            Assignee:
            Erwin Pe
            Reporter:
            Enrico Golfieri
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: