reshard_collection_basic.js fails in burn_in:sharded_retryable_writes_downgrade suites on implicit multiversion {A,UB}SAN variants


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL

      To reproduce the failures, remove the incompatible_aubsan tag from jstests/core_sharding/resharding/reshard_collection_basic.js.
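As a rough sketch of that reproduction step: jstests usually declare their tags in a leading `@tags: [...]` comment block. The header below is a made-up example (`hypothetical_other_tag` is a placeholder, and the real file's other tags are not shown in this ticket), and `removeTag` is a hypothetical helper, not part of any MongoDB tooling. Editing the file by hand works just as well.

```javascript
// Sketch: drop one tag line from a jstest-style "@tags: [...]" comment block.
// Assumes one tag per line, as in the made-up header below.
function removeTag(source, tag) {
  // A line names the tag if, after stripping the leading " * " and a
  // trailing comma, exactly the tag remains.
  const isTagLine = (line) =>
    line.replace(/^\s*\*\s*/, "").replace(/,\s*$/, "").trim() === tag;
  return source.split("\n").filter((line) => !isTagLine(line)).join("\n");
}

// Hypothetical header; only incompatible_aubsan is taken from the ticket.
const exampleHeader = [
  "/**",
  " * @tags: [",
  " *   hypothetical_other_tag,",
  " *   incompatible_aubsan,",
  " * ]",
  " */",
].join("\n");

const edited = removeTag(exampleHeader, "incompatible_aubsan");
console.log(edited);
```

With the tag removed, the test is no longer excluded from the {A,UB}SAN variants, which is what exposes the failures described here.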

      In the burn_in suites for sharded_retryable_writes_downgrade, reshard_collection_basic.js hits system failures and timeouts on two variants: [jstests_affected] * Shared Library {A,UB}SAN Enterprise RHEL 8 DEBUG Experimental (all feature flags) and [jstests_affected] * Shared Library {A,UB}SAN Enterprise RHEL 8 DEBUG Experimental. Many of the timeouts follow a similar pattern:

      • During one of the success cases in reshard_collection_basic.js, resharding proceeds while the ContinuousStepdown hook steps down replica set primaries in the background.
      • Resharding completes for that test case.
      • The thread running the ContinuousStepdown hook then throws an exception after timing out while waiting for a primary to step up on one of the replica sets.
      • After that, the remaining test cases in reshard_collection_basic.js keep running until the Evergreen timeout is hit. Resharding itself appears to complete normally.

      It seems likely that this behavior is triggered by the slowness of the variant: it runs with {A,UB}SAN, and the same test and suite pass on the RHEL 8 variant without {A,UB}SAN. Resharding itself may be working fine, with whatever causes the ContinuousStepdown hook to time out taking long enough that the test runs out of time to complete.
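The hypothesis above can be illustrated with a toy model. This is only a sketch: `awaitPrimary` and `runHook` are hypothetical names, not resmoke's ContinuousStepdown code, and every duration is a made-up illustrative number rather than a measurement from the failing patch. It shows how a fixed wait budget that is comfortable on a fast variant can be exceeded when the same election takes longer on a slow one.

```javascript
// Hypothetical model: poll (in simulated time) for a new primary until a deadline.
function awaitPrimary(electionDoneAtMs, timeoutMs, pollIntervalMs) {
  for (let now = 0; now <= timeoutMs; now += pollIntervalMs) {
    if (now >= electionDoneAtMs) {
      return now; // simulated time at which the primary was observed
    }
  }
  throw new Error(`timed out after ${timeoutMs} ms waiting for a primary`);
}

// Runs the wait with an assumed 1000 ms budget and 100 ms poll interval.
function runHook(electionDoneAtMs) {
  try {
    awaitPrimary(electionDoneAtMs, 1000, 100);
    return "hook ok";
  } catch (e) {
    return "hook failed: " + e.message;
  }
}

const fastVariant = runHook(300);  // election finishes within the wait budget
const slowVariant = runHook(5000); // election outlasts the wait budget
console.log(fastVariant);
console.log(slowVariant);
```

In this model the test body is unaffected by the hook failure, which matches the observation that resharding still appears to complete while the hook thread has already thrown.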

      Failing patch
      Filtered timeout logs

       

            Assignee:
            Unassigned
            Reporter:
            Natalie Hill
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: