  1. Core Server
  2. SERVER-34537

change_streams_shards_start_in_sync.js relies on ARS ordering

Details

    • Type: Bug
    • Resolution: Gone away
    • Priority: Major - P3
    • None
    • None
    • Internal Code
    • None
    • Service Arch
    • ALL

    Description

      The change_streams_shards_start_in_sync.js test succeeds only if the AsyncRequestsSender (ARS) runs commands against the shards in a particular order.

      In particular, it:

      1. Uses mongobridge to disconnect one shard (Process A)
      2. Starts a changestream via a mongos (Process B)
      3. Waits for the changestream to start on the other shards (Process A)
      4. Reconnects the shard (Process A)
      5. Finishes, now that the shard is connected (Process B)

      Unfortunately, this relies on ordering in the AsyncRequestsSender, because ARS construction works like this: for each request, it uses the ReplicaSetMonitor (RSM) to target the request to a particular host, then calls scheduleRemoteCommand. Targeting via the RSM is a blocking operation.

      Because of the mongobridge disconnect, this means that if the ARS happens to target the disconnected shard before starting the changestream on the other shards, the test hangs for 20 seconds in targeting and then fails.
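
      The order dependence can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the actual ARS code: the function name, the shard names, and the choice to keep going after the timeout are all hypothetical, and a virtual clock stands in for the real 20-second wait.

```python
from typing import Dict, Iterable, List

TARGETING_TIMEOUT_SECS = 20.0  # the 20-second hang described in the ticket


def schedule_times_serial(
    order: List[str],
    unreachable: Iterable[str],
    timeout: float = TARGETING_TIMEOUT_SECS,
) -> Dict[str, float]:
    """Return the virtual time at which each reachable shard's command is scheduled.

    Hypothetical model: shards are targeted one at a time, and targeting an
    unreachable shard blocks the whole loop for `timeout` seconds.
    """
    down = set(unreachable)
    clock = 0.0  # virtual clock, so the sketch runs instantly
    scheduled: Dict[str, float] = {}
    for shard in order:
        if shard in down:
            clock += timeout  # blocking targeting: nothing else makes progress
            continue
        scheduled[shard] = clock  # stands in for scheduleRemoteCommand
    return scheduled


# Disconnected shard targeted last: the other shards start immediately.
print(schedule_times_serial(["shard1", "shard2", "shard0"], ["shard0"]))
# -> {'shard1': 0.0, 'shard2': 0.0}

# Disconnected shard targeted first: the other shards wait out the timeout.
print(schedule_times_serial(["shard0", "shard1", "shard2"], ["shard0"]))
# -> {'shard1': 20.0, 'shard2': 20.0}
```

      In the second ordering, Process A's wait in step 3 cannot be satisfied until targeting gives up, which matches the hang described above.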

      One option would be to rewrite the replica set monitor to be fully async, at which point the order of targeting wouldn't matter.
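
      A rough sketch of why async targeting would remove the order dependence, using Python's asyncio as a stand-in. Nothing here reflects the real ReplicaSetMonitor API: the function names, shard names, and the short timeout are assumptions made for illustration only.

```python
import asyncio
from typing import Iterable, List, Set, Tuple

Event = Tuple[str, str]


async def _target_and_schedule(
    shard: str, down: Set[str], timeout: float, events: List[Event]
) -> None:
    """Target one shard without blocking the others (hypothetical model)."""
    if shard in down:
        await asyncio.sleep(timeout)  # only this task waits for the timeout
        events.append(("timed_out", shard))
        return
    await asyncio.sleep(0)  # yield once; stands in for a fast, successful lookup
    events.append(("scheduled", shard))  # stands in for scheduleRemoteCommand


async def dispatch_async(
    order: List[str], unreachable: Iterable[str], timeout: float = 0.05
) -> List[Event]:
    """Start targeting for every shard concurrently and record what happens."""
    down = set(unreachable)
    events: List[Event] = []
    await asyncio.gather(
        *(_target_and_schedule(shard, down, timeout, events) for shard in order)
    )
    return events


down_first = asyncio.run(dispatch_async(["shard0", "shard1", "shard2"], ["shard0"]))
down_last = asyncio.run(dispatch_async(["shard1", "shard2", "shard0"], ["shard0"]))
print(down_first)
print(down_last)
```

      In both orderings the reachable shards are scheduled before the disconnected shard times out, so the position of the disconnected shard in the request list no longer matters.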

          People

            backlog-server-servicearch Backlog - Service Architecture
            mira.carey@mongodb.com Mira Carey