  1. Core Server
  2. SERVER-36988

awaitdata_getmore_cmd.js times out when run concurrently with the LogicalSessionCache refresh suite

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.1.2
    • Fix Version/s: 3.6.9, 4.0.4, 4.1.4
    • Component/s: None
    • Labels:
      None

      Description

      awaitdata_getmore_cmd.js consistently stalls when run concurrently with a LogicalSessionCache refresh interval of 100ms.

      Further investigation to follow. This is a placeholder ticket to track the fix, which may mean blacklisting the test from the LogicalSessionCache refresh suites.

      Investigation

      Several things are happening here:

      1. awaitdata_getmore_cmd.js tails the oplog and issues getMores against it. It keeps issuing getMores in a while loop until one returns an empty batch (a batch size of zero).
      2. CheckReplDBHashInBackground continually runs and creates sessions to check db hashes.
      3. The logical session cache refresh will flush these sessions to disk, creating new oplog entries.
      4. In the more aggressive logical session cache suites (those with a faster refresh; 100ms is the fastest), the getMore batch size is effectively never zero, because new oplog entries keep arriving between getMores.
      5. The while loop therefore keeps running until the rare case where the cursor pulls an empty batch before CheckReplDBHashInBackground adds a new session.
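
      The interaction above can be illustrated with a small simulation. This is only a sketch in plain JavaScript, not the test's code and not the mongo shell API: makeSimulatedOplog, drainOplog, and the writesPerIteration parameter are hypothetical names, and the "oplog" is just an in-memory array. It assumes that at least one session-related entry lands between consecutive getMores, which is what the 100ms refresh appears to cause.

```javascript
// Hypothetical in-memory stand-in for a tailable cursor on the oplog.
function makeSimulatedOplog() {
    const entries = [];
    let position = 0;
    return {
        append(entry) {
            entries.push(entry);
        },
        // Returns every entry appended since the previous call.
        getMore() {
            const batch = entries.slice(position);
            position = entries.length;
            return batch;
        },
    };
}

// Drains the cursor until an empty batch is seen or maxIterations is hit.
// writesPerIteration models oplog entries created by session refreshes
// between two consecutive getMores (an assumption for illustration).
function drainOplog(writesPerIteration, maxIterations) {
    const oplog = makeSimulatedOplog();
    oplog.append({op: "i", note: "initial entry"});
    let batch = oplog.getMore();
    let iterations = 0;
    while (batch.length > 0 && iterations < maxIterations) {
        for (let i = 0; i < writesPerIteration; i++) {
            oplog.append({op: "i", note: "session refresh"});
        }
        batch = oplog.getMore();
        iterations++;
    }
    return {drained: batch.length === 0, iterations: iterations};
}

const quietResult = drainOplog(0, 1000);  // no background writes
const busyResult = drainOplog(1, 1000);   // one write between every getMore
console.log(quietResult.drained, quietResult.iterations);
console.log(busyResult.drained, busyResult.iterations);
```

      With no background writes the loop sees an empty batch after a single getMore. With one write between every getMore it never does, and only the iteration cap (which the simulation adds for safety) stops it.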

      Proposed Fix

      The test expects an exact number of documents in the batch. Relaxing that constraint would risk correctness on other suites. Unless it makes sense to run the while loop only when the logical session cache isn't running, we should blacklist this test from the logical session cache suites.
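
      For comparison, the conditional alternative mentioned above might look roughly like the following. This is a sketch under stated assumptions, not a proposed patch: the logicalSessionCacheRefreshEnabled flag, shouldDrainUntilEmpty, and runDrainStep are hypothetical names, and how a test would actually detect that the refresh is running is left open.

```javascript
// Hypothetical predicate: only drain when no session refresh is writing to the oplog.
function shouldDrainUntilEmpty(suiteOptions) {
    return !suiteOptions.logicalSessionCacheRefreshEnabled;
}

// Runs the drain loop only when the predicate allows it.
// getMoreFn is any function returning the next batch as an array.
function runDrainStep(suiteOptions, getMoreFn, firstBatch) {
    if (!shouldDrainUntilEmpty(suiteOptions)) {
        return {skipped: true, getMores: 0};
    }
    let batch = firstBatch;
    let getMores = 0;
    while (batch.length > 0) {
        batch = getMoreFn();
        getMores++;
    }
    return {skipped: false, getMores: getMores};
}

// Stub cursor that yields one more non-empty batch, then empty batches.
function makeStubGetMore() {
    const pendingBatches = [[{op: "i"}], []];
    return function() {
        return pendingBatches.length > 0 ? pendingBatches.shift() : [];
    };
}

const withRefresh = runDrainStep(
    {logicalSessionCacheRefreshEnabled: true}, makeStubGetMore(), [{op: "i"}]);
const withoutRefresh = runDrainStep(
    {logicalSessionCacheRefreshEnabled: false}, makeStubGetMore(), [{op: "i"}]);
console.log(withRefresh.skipped, withoutRefresh.getMores);
```

      The trade-off is the one the ticket already names: skipping the loop keeps the test in the suite but leaves that step unexercised there, whereas blacklisting removes the test from those suites entirely.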
