Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-42228

LoggerRuntimeConfigError exceptions can lead to background dbhash thread running until Evergreen task times out

    XMLWordPrintable

    Details

    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.2, v4.0, v3.6
    • Steps To Reproduce:
      Hide

      Apply the following patch and run a test suite which uses the CheckReplDBHashInBackground hook:

      python buildscripts/resmoke.py --suite=replica_sets_jscore_passthrough jstests/core/apitest_dbcollection.js --log=buildlogger
      

      diff --git a/buildscripts/resmoke.py b/buildscripts/resmoke.py
      index db33d8b95e..82e7d0aa7c 100755
      --- a/buildscripts/resmoke.py
      +++ b/buildscripts/resmoke.py
      @@ -64,6 +64,8 @@ class Resmoke(object):  # pylint: disable=too-many-instance-attributes
           def _exit_logging(self):
               if not self._interrupted:
                   logging.flush.stop_thread()
      +            # Sleep here to simulate logging being slow.
      +            time.sleep(60)
                   self._exit_on_incomplete_logging()
       
           def _exit_on_incomplete_logging(self):
      diff --git a/buildscripts/resmokelib/testing/report.py b/buildscripts/resmokelib/testing/report.py
      index 3365a17075..436554dfe1 100644
      --- a/buildscripts/resmokelib/testing/report.py
      +++ b/buildscripts/resmokelib/testing/report.py
      @@ -9,6 +9,7 @@ import time
       import unittest
       
       from .. import config as _config
      +from .. import errors
       from .. import logging
       
       
      @@ -107,6 +108,12 @@ class TestReport(unittest.TestResult):  # pylint: disable=too-many-instance-attr
                   if test.dynamic:
                       self.num_dynamic += 1
       
      +        if not test.test_name.endswith(("fixture_setup", "CheckReplDBHashInBackground")):
      +            # Raise an error.LoggerRuntimeConfigError exception to simulate logkeeper being
      +            # overwhelmed when running a test after the fixture has been set up. The goal is to have
      +            # the background dbhash thread run but not the actual test file.
      +            raise errors.LoggerRuntimeConfigError("Intentionally triggered!")
      +
               # Set up the test-specific logger.
               test_logger = self.job_logger.new_test_logger(test.short_name(), test.basename(), command,
                                                             test.logger)
      

      Show
      Apply the following patch and run a test suite which uses the CheckReplDBHashInBackground hook: python buildscripts/resmoke.py --suite=replica_sets_jscore_passthrough jstests/core/apitest_dbcollection.js --log=buildlogger diff --git a/buildscripts/resmoke.py b/buildscripts/resmoke.py index db33d8b95e..82e7d0aa7c 100755 --- a/buildscripts/resmoke.py +++ b/buildscripts/resmoke.py @@ -64,6 +64,8 @@ class Resmoke(object): # pylint: disable=too-many-instance-attributes def _exit_logging(self): if not self._interrupted: logging.flush.stop_thread() + # Sleep here to simulate logging being slow. + time.sleep(60) self._exit_on_incomplete_logging()   def _exit_on_incomplete_logging(self): diff --git a/buildscripts/resmokelib/testing/report.py b/buildscripts/resmokelib/testing/report.py index 3365a17075..436554dfe1 100644 --- a/buildscripts/resmokelib/testing/report.py +++ b/buildscripts/resmokelib/testing/report.py @@ -9,6 +9,7 @@ import time import unittest   from .. import config as _config +from .. import errors from .. import logging     @@ -107,6 +108,12 @@ class TestReport(unittest.TestResult): # pylint: disable=too-many-instance-attr if test.dynamic: self.num_dynamic += 1   + if not test.test_name.endswith(("fixture_setup", "CheckReplDBHashInBackground")): + # Raise an error.LoggerRuntimeConfigError exception to simulate logkeeper being + # overwhelmed when running a test after the fixture has been set up. The goal is to have + # the background dbhash thread run but not the actual test file. + raise errors.LoggerRuntimeConfigError("Intentionally triggered!") + # Set up the test-specific logger. test_logger = self.job_logger.new_test_logger(test.short_name(), test.basename(), command, test.logger)
    • Sprint:
      STM 2019-07-29
    • Linked BF Score:
      11
    • Story Points:
      2

      Description

      If TestReport.startTest() raises an exception, then Job._run_hooks_after_tests() won't be called. For test suites which use the CheckReplDBHashInBackground hook, this leads to the background thread continuing to spawn mongo shell processes and running the run_check_repl_dbhash_background.js hook. If logkeeper is overwhelmed, then an errors.LoggerRuntimeConfigError exception can also occur when attempting to tear the fixture down. This leads the Evergreen task to time out instead of failing with code 75 because the background dbhash check will continue to run while the fixture is still running and resmoke.py's flush thread will therefore never exit.

      We don't want to always run the after_test() method for a hook though. For example, if running a test crashes the server, then we shouldn't attempt to run any data consistency checks because they'll just fail to connect to the downed server.

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: