Fix race condition for replica set arbiter state

XMLWordPrintableJSON

    • Type: Task
    • Resolution: Unresolved
    • Priority: Unknown
    • None
    • Component/s: Evergreen Tools
    • None
    • Not Needed

      We had seen an issue with {{mongo-runner] where a replica set would fail intermittently due to an arbiter server not reporting its state as an arbiter.

      https://spruce.corp.mongodb.com/task/mongo_cxx_driver_integration_matrix_mongocryptd_integration_macos_14_debug_shared_cxx11_csfle_latest_replica_mongocryptd_patch_217b0506a4d4df5c9fff4719e2b12beac3a44588_6a19f3a79b26c80007828ed1_26_05_29_20_14_36/logs?execution=1

      I opened https://github.com/mongodb-js/devtools-shared/pull/770. Once that is merged we can bump _MR_VERSION in mongodb_runner.py.

      [2026/06/01 11:39:35.851] 2026-06-01T16:39:10.179Z mongodb-runner failed to get buildInfo, treating as closed server Error: Arbiter flag mismatch -- server should be arbiter but hello indicates it is not
      [2026/06/01 11:39:35.851]     at MongoServer._ensureMatchingMetadataColl (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:313:19)
      [2026/06/01 11:39:35.851]     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      [2026/06/01 11:39:35.851]     at async /Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:366:17
      [2026/06/01 11:39:35.851]     at async MongoServer.withClient (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:393:20)
      [2026/06/01 11:39:35.851]     at async MongoServer._populateBuildInfo (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:365:30)
      [2026/06/01 11:39:35.851]     at async MongoServer._start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:238:40)
      [2026/06/01 11:39:35.851]     at async MongoServer.start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:120:16)
      [2026/06/01 11:39:35.851]     at async /Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongocluster.js:221:21
      [2026/06/01 11:39:35.851]     at async Promise.allSettled (index 2)
      [2026/06/01 11:39:35.851]     at async safePromiseAll (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/util.js:32:21) Error
      [2026/06/01 11:39:35.851]     at MongoServer._populateBuildInfo (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:371:90)
      [2026/06/01 11:39:35.851]     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      [2026/06/01 11:39:35.851]     at async MongoServer._start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:238:40)
      [2026/06/01 11:39:35.851]     at async MongoServer.start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:120:16)
      [2026/06/01 11:39:35.851]     at async /Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongocluster.js:221:21
      [2026/06/01 11:39:35.851]     at async Promise.allSettled (index 2)
      [2026/06/01 11:39:35.851]     at async safePromiseAll (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/util.js:32:21)
      [2026/06/01 11:39:35.851]     at async MongoCluster.start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongocluster.js:219:27)
      [2026/06/01 11:39:35.851]     at async Object.start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/runner-helpers.js:57:21)
      [2026/06/01 11:39:35.851]     at async start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/cli.js:140:33)
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:10.180Z mongodb-runner failed to get buildInfo Error: Arbiter flag mismatch -- server should be arbiter but hello indicates it is not
      [2026/06/01 11:39:35.851]     at MongoServer._ensureMatchingMetadataColl (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:313:19)
      [2026/06/01 11:39:35.851]     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      [2026/06/01 11:39:35.851]     at async /Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:366:17
      [2026/06/01 11:39:35.851]     at async MongoServer.withClient (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:393:20)
      [2026/06/01 11:39:35.851]     at async MongoServer._populateBuildInfo (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:365:30)
      [2026/06/01 11:39:35.851]     at async MongoServer._start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:238:40)
      [2026/06/01 11:39:35.851]     at async MongoServer.start (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongoserver.js:120:16)
      [2026/06/01 11:39:35.851]     at async /Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/mongocluster.js:221:21
      [2026/06/01 11:39:35.851]     at async Promise.allSettled (index 2)
      [2026/06/01 11:39:35.851]     at async safePromiseAll (/Users/ec2-user/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/mongodb-runner-6.7.1/node_modules/mongodb-runner/dist/util.js:32:21)
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:11.110Z mongodb-runner server listening on port 27018
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:11.116Z mongodb-runner ensuring metadata collection entry {
      [2026/06/01 11:39:35.851]   _id: '4fb27ca4-a22f-4e52-b915-e034b49ceba4',
      [2026/06/01 11:39:35.851]   pid: 3456,
      [2026/06/01 11:39:35.851]   port: 27018,
      [2026/06/01 11:39:35.851]   dbPath: '/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/db-4fb27ca4-a22f-4e52-b915-e034b49ceba4',
      [2026/06/01 11:39:35.851]   startTime: '2026-06-01T16:38:56.151Z'
      [2026/06/01 11:39:35.851] } { isMongoS: false }
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:11.211Z mongodb-runner server listening on port 27017
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:11.217Z mongodb-runner ensuring metadata collection entry {
      [2026/06/01 11:39:35.851]   _id: '864df61d-33df-4fc7-927a-f03eae17407e',
      [2026/06/01 11:39:35.851]   pid: 3455,
      [2026/06/01 11:39:35.851]   port: 27017,
      [2026/06/01 11:39:35.851]   dbPath: '/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/db-864df61d-33df-4fc7-927a-f03eae17407e',
      [2026/06/01 11:39:35.851]   startTime: '2026-06-01T16:38:56.150Z'
      [2026/06/01 11:39:35.851] } { isMongoS: false }
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:12.290Z mongodb-runner inserted metadata collection entry {
      [2026/06/01 11:39:35.851]   _id: '864df61d-33df-4fc7-927a-f03eae17407e',
      [2026/06/01 11:39:35.851]   pid: 3455,
      [2026/06/01 11:39:35.851]   port: 27017,
      [2026/06/01 11:39:35.851]   dbPath: '/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/db-864df61d-33df-4fc7-927a-f03eae17407e',
      [2026/06/01 11:39:35.851]   startTime: '2026-06-01T16:38:56.150Z'
      [2026/06/01 11:39:35.851] }
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:12.292Z mongodb-runner got server build info through client 9.0.0-alpha0-98aef7f8 enterprise
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:12.388Z mongodb-runner inserted metadata collection entry {
      [2026/06/01 11:39:35.851]   _id: '4fb27ca4-a22f-4e52-b915-e034b49ceba4',
      [2026/06/01 11:39:35.851]   pid: 3456,
      [2026/06/01 11:39:35.851]   port: 27018,
      [2026/06/01 11:39:35.851]   dbPath: '/data/mci/c44f16bfc72d2c73438bf119a4fb29a8/tmp/drivers_orchestration/db-4fb27ca4-a22f-4e52-b915-e034b49ceba4',
      [2026/06/01 11:39:35.851]   startTime: '2026-06-01T16:38:56.151Z'
      [2026/06/01 11:39:35.851] }
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:12.390Z mongodb-runner got server build info through client 9.0.0-alpha0-98aef7f8 enterprise
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:12.395Z mongodb-runner Running rs.initiate
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:23.518Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:24.520Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:25.521Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:26.524Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:27.526Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:28.529Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:29.532Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:30.534Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:31.537Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:32.538Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:33.541Z mongodb-runner rs.status did not include primary, waiting...
      [2026/06/01 11:39:35.851] 2026-06-01T16:39:34.542Z mongodb-runner rs.status indicated primary for replset repl0 [
      [2026/06/01 11:39:35.851]   {
      [2026/06/01 11:39:35.851]     _id: 0,
      [2026/06/01 11:39:35.851]     name: 'localhost:27017',
      [2026/06/01 11:39:35.851]     health: 1,
      [2026/06/01 11:39:35.851]     state: 1,
      [2026/06/01 11:39:35.851]     stateStr: 'PRIMARY',
      [2026/06/01 11:39:35.851]     uptime: 38,
      [2026/06/01 11:39:35.851]     optime: { ts: new Timestamp({ t: 1780331959, i: 1 }), t: -1 },
      [2026/06/01 11:39:35.851]     optimeDate: 2026-06-01T16:39:19.000Z,
      [2026/06/01 11:39:35.851]     optimeWritten: { ts: new Timestamp({ t: 1780331959, i: 1 }), t: -1 },
      [2026/06/01 11:39:35.851]     optimeWrittenDate: 2026-06-01T16:39:19.000Z,
      [2026/06/01 11:39:35.851]     lastAppliedWallTime: 2026-06-01T16:39:19.022Z,
      [2026/06/01 11:39:35.851]     lastDurableWallTime: 2026-06-01T16:39:19.022Z,
      [2026/06/01 11:39:35.851]     lastWrittenWallTime: 2026-06-01T16:39:19.022Z,
      [2026/06/01 11:39:35.851]     lastStableRecoveryTimestamp: new Timestamp({ t: 1780331959, i: 1 }),
      [2026/06/01 11:39:35.851]     syncSourceHost: '',
      [2026/06/01 11:39:35.851]     syncSourceId: -1,
      [2026/06/01 11:39:35.851]     infoMessage: 'Could not find member to sync from',
      [2026/06/01 11:39:35.852]     electionTime: new Timestamp({ t: 1780331974, i: 1 }),
      [2026/06/01 11:39:35.852]     electionDate: 2026-06-01T16:39:34.000Z,
      [2026/06/01 11:39:35.852]     configVersion: 1,
      [2026/06/01 11:39:35.852]     configTerm: 0,
      [2026/06/01 11:39:35.852]     self: true,
      [2026/06/01 11:39:35.852]     lastHeartbeatMessage: ''
      [2026/06/01 11:39:35.852]   },
      [2026/06/01 11:39:35.852]   {
      [2026/06/01 11:39:35.852]     _id: 1,
      [2026/06/01 11:39:35.852]     name: 'localhost:27018',
      [2026/06/01 11:39:35.852]     health: 1,
      [2026/06/01 11:39:35.852]     state: 2,
      [2026/06/01 11:39:35.852]     stateStr: 'SECONDARY',
      [2026/06/01 11:39:35.852]     uptime: 12,
      [2026/06/01 11:39:35.852]     optime: { ts: new Timestamp({ t: 1780331959, i: 1 }), t: -1 },
      [2026/06/01 11:39:35.852]     optimeDurable: { ts: new Timestamp({ t: 0, i: 0 }), t: -1 },
      [2026/06/01 11:39:35.852]     optimeWritten: { ts: new Timestamp({ t: 1780331959, i: 1 }), t: -1 },
      [2026/06/01 11:39:35.852]     optimeDate: 2026-06-01T16:39:19.000Z,
      [2026/06/01 11:39:35.852]     optimeDurableDate: 1970-01-01T00:00:00.000Z,
      [2026/06/01 11:39:35.852]     optimeWrittenDate: 2026-06-01T16:39:19.000Z,
      [2026/06/01 11:39:35.852]     lastAppliedWallTime: 2026-06-01T16:39:19.022Z,
      [2026/06/01 11:39:35.852]     lastDurableWallTime: 1970-01-01T00:00:00.000Z,
      [2026/06/01 11:39:35.852]     lastWrittenWallTime: 2026-06-01T16:39:19.022Z,
      [2026/06/01 11:39:35.852]     lastStableRecoveryTimestamp: new Timestamp({ t: 0, i: 0 }),
      [2026/06/01 11:39:35.852]     lastHeartbeat: 2026-06-01T16:39:34.458Z,
      [2026/06/01 11:39:35.852]     lastHeartbeatRecv: 2026-06-01T16:39:34.458Z,
      [2026/06/01 11:39:35.852]     pingMs: 0,
      [2026/06/01 11:39:35.852]     lastHeartbeatMessage: '',
      [2026/06/01 11:39:35.852]     syncSourceHost: '',
      [2026/06/01 11:39:35.852]     syncSourceId: -1,
      [2026/06/01 11:39:35.852]     infoMessage: '',
      [2026/06/01 11:39:35.852]     configVersion: 1,
      [2026/06/01 11:39:35.852]     configTerm: 0
      [2026/06/01 11:39:35.852]   },
      [2026/06/01 11:39:35.852]   {
      [2026/06/01 11:39:35.852]     _id: 2,
      [2026/06/01 11:39:35.852]     name: 'localhost:27019',
      [2026/06/01 11:39:35.852]     health: 1,
      [2026/06/01 11:39:35.852]     state: 7,
      [2026/06/01 11:39:35.852]     stateStr: 'ARBITER',
      [2026/06/01 11:39:35.852]     uptime: 12,
      [2026/06/01 11:39:35.852]     lastHeartbeat: 2026-06-01T16:39:34.459Z,
      [2026/06/01 11:39:35.852]     lastHeartbeatRecv: 2026-06-01T16:39:34.459Z,
      [2026/06/01 11:39:35.852]     pingMs: 0,
      [2026/06/01 11:39:35.852]     lastHeartbeatMessage: '',
      [2026/06/01 11:39:35.852]     syncSourceHost: '',
      [2026/06/01 11:39:35.852]     syncSourceId: -1,
      [2026/06/01 11:39:35.852]     infoMessage: '',
      [2026/06/01 11:39:35.852]     configVersion: 1,
      [2026/06/01 11:39:35.852]     configTerm: 0
      [2026/06/01 11:39:35.852]   }
      [2026/06/01 11:39:35.852] ]
      

            Assignee:
            Steve Silvester
            Reporter:
            Steve Silvester
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: