Flaky sharded transactions mongos-pin-auto "remain pinned after non-transient" unified tests


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Unknown
    • Affects Version/s: None
    • Component/s: None
    • Ruby Drivers

      Summary

      The unified tests named "remain pinned after non-transient <error> on <op>" in spec/spec_tests/data/unified/valid-pass/poc-transactions-mongos-pin-auto.yml (sharded-only) fail intermittently on CI.

      Observed failures

      Patch 69e734ce… — remain pinned after non-transient Interrupted error on insertOne:

      • fle_mongodb-version~8.0_topology~sharded-cluster...ruby-3.4
      • mongo-recent__mongodb-version~8.0_topology~sharded-cluster_ruby~ruby-3.2

      The test configures a failCommand failpoint (errorCode 11601 / Interrupted) on insert, then expects the resulting error to carry neither the TransientTransactionError nor the UnknownTransactionCommitResult label. Under CI load the server may hit a transient transaction condition of its own (catalog change, primary step-down) around the same time, which changes the labels on the error or reorders the observed events.
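
To make the expectation concrete, here is a minimal Ruby sketch of the failpoint document and the label check described above. It uses plain hashes and a stand-in error struct rather than the driver. The `mode: { times: 1 }` setting, the `FakeError` struct, and the `labels_omitted?` helper are assumptions for illustration, not code from the spec runner.

```ruby
# Sketch only: no driver or server required.

# failCommand failpoint as described in this ticket.
FAIL_POINT = {
  configureFailPoint: 'failCommand',
  mode: { times: 1 },                  # assumption: fire on one command only
  data: {
    failCommands: ['insert'],
    errorCode: 11601                   # Interrupted
  }
}.freeze

# Labels the test expects the resulting error NOT to carry.
OMITTED_LABELS = %w[
  TransientTransactionError
  UnknownTransactionCommitResult
].freeze

# Hypothetical stand-in for a driver error: a code plus its error labels.
FakeError = Struct.new(:code, :labels)

# True when the error carries none of the labels that must be omitted.
def labels_omitted?(error, omitted = OMITTED_LABELS)
  (error.labels & omitted).empty?
end

# The error the test expects: Interrupted, with no transaction labels.
expected = FakeError.new(11601, [])

# The suspected flaky case: a transient condition adds a label.
racing = FakeError.new(11601, ['TransientTransactionError'])

puts labels_omitted?(expected)
puts labels_omitted?(racing)
```

The second case shows how a single extra label is enough to fail the expectation even though the injected error code is unchanged.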

      Mitigation

      spec/runners/unified/test.rb now matches the test description against the pattern "remain pinned after non-transient" in Unified::Test#retry?, so on CI the unified runner retries these tests up to 3 times via retry_test.
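
The mitigation can be sketched roughly as follows. This is not the code in spec/runners/unified/test.rb: `RETRY_PATTERN`, `retry_description?`, and `with_retries` are hypothetical names, and the real `Unified::Test#retry?` and `retry_test` helpers may be structured differently. The sketch only shows the idea of matching the description and re-running a failing block up to 3 times.

```ruby
# Sketch only: hypothetical helpers, not the driver's test runner.

# Description fragment shared by the flaky tests.
RETRY_PATTERN = /remain pinned after non-transient/

# True when a test description belongs to the flaky group.
def retry_description?(description)
  RETRY_PATTERN.match?(description)
end

# Runs the block, re-running it on failure until it has been tried
# `tries` times in total; the last failure is re-raised.
def with_retries(tries: 3)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    retry if attempt < tries
    raise
  end
end

description = 'remain pinned after non-transient Interrupted error on insertOne'

if retry_description?(description)
  result = with_retries(tries: 3) do |attempt|
    # Simulated flake: fail on the first two attempts only.
    raise 'unexpected error label' if attempt < 3
    :passed
  end
  puts result
end
```

A bounded retry hides an occasional race, but it cannot stabilize a test that fails on most attempts, which is why the follow-up below considers skipping the file.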

      Follow-up

      If retries don't stabilize these tests, skip the whole poc-transactions-mongos-pin-auto.yml file on sharded, following the approach used for poc-transactions-convenient-api.yml (see RUBY-3806).

            Assignee:
            Unassigned
            Reporter:
            Dmitry Rybakov