  1. Core Server
  2. SERVER-47972

maxTimeMS set on hedged requests does not give shards enough time to refresh

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • None
    • 4.4.0-rc7, 4.7.0
    • None
    • None
    • Fully Compatible
    • ALL
    • v4.4
    • Service arch 2020-05-18
    • 28

    Description

      Goal: Investigate whether the scenario outlined below is possible and, if so, determine a fix.

       

      Scenario:

      Suppose shard0 has a primary, secondary0, and secondary1.

      • Each time mongos runs 'count' with hedged reads, secondary0 never gets the chance to set its in-memory database version: the operation is either killed or exceeds maxTimeMS before the version can be set after the refresh.
      • Over the NetworkInterfaceTL, mongos routes the next 'count' command to secondary0. Secondary0 has no known databaseVersion, so onDbVersionMismatchNoExcept is called. This prompts a refresh, which fails with a maxTimeMS expiration error.
      • Because the maxTimeMS error from the refresh is ignored, the original "no known dbVersion" error propagates back to the NetworkInterfaceTL. There, since the reported error is not a maxTimeMS error, the finish line is triggered. Secondary0 wins the race, but the 'count' fails with "don't know dbVersion."

      In this scenario, we believe secondary0 would be getting killed here.
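
The scenario above can be approximated with a small toy model. This is only a sketch under stated assumptions, not the server's actual code: the `Node` class, the error-code strings, `refresh_cost_ms`, and the helper functions are all hypothetical names introduced for illustration. It assumes the refresh must complete within the same maxTimeMS budget as the 'count' itself, and that any response other than a maxTimeMS error ends the hedging race.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical error codes used only in this toy model.
MAX_TIME_MS_EXPIRED = "MaxTimeMSExpired"
STALE_DB_VERSION = "StaleDbVersion"  # stands in for "don't know dbVersion"


@dataclass
class Node:
    """Toy model of a shard node; not one of MongoDB's actual classes."""
    name: str
    refresh_cost_ms: int              # assumed time a dbVersion refresh takes
    db_version: Optional[int] = None  # None means "no known databaseVersion"

    def refresh(self, budget_ms: int) -> bool:
        """Set the in-memory dbVersion only if the refresh fits in the budget."""
        if self.refresh_cost_ms > budget_ms:
            return False  # refresh hit maxTimeMS; the version stays unset
        self.db_version = 1
        return True

    def count(self, max_time_ms: int) -> str:
        """Return 'ok' or an error code for a 'count' with the given maxTimeMS."""
        if self.db_version is None:
            # Mirrors the described flow: the refresh outcome is ignored and
            # the original "no known dbVersion" error is what gets returned.
            self.refresh(max_time_ms)
            return STALE_DB_VERSION
        return "ok"


def hedged_response_is_final(code: str) -> bool:
    """Per the scenario, any response other than a maxTimeMS error ends the race."""
    return code != MAX_TIME_MS_EXPIRED


def run_attempts(node: Node, max_time_ms: int, attempts: int) -> List[str]:
    """Send the same 'count' to one node several times and collect the results."""
    return [node.count(max_time_ms) for _ in range(attempts)]
```

In this model, a node whose refresh costs more than the maxTimeMS budget returns the stale-version error on every attempt, and that error is treated as a final response. A node given a large enough budget returns the error once and then succeeds on later attempts.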

            People

              cheahuychou.mao@mongodb.com Cheahuychou Mao
              haley.connelly@mongodb.com Haley Connelly
