Core Server / SERVER-59686

Investigate failing parallel shell in jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL
    • Remove checkExitSuccess: false from jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js:159 and run the test.

      SERVER-58200 modified startParallelShell so that the cleanup function it returns must be called for every parallel shell that is started; otherwise the whole test fails.

      As part of that ticket, we modified jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js
      to call the cleanup function, which checks the exit code of the parallel shell and throws an error if it is not 0.

      It began to fail with the following assertion:

      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] uncaught exception: Error: [0] != [252] are not equal : encountered an error in the parallel shell :
      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] doassert@src/mongo/shell/assert.js:20:14
      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] assert.eq@src/mongo/shell/assert.js:179:9
      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] startParallelShell/<@src/mongo/shell/servers_misc.js:182:13
      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] @jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js:157:1
      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] @jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js:18:2
      

      The parallel shell is launched at jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js:157 (as of commit e502f2d3965ac4147d303e956a582b7c4eef8232). Here's the whole stack trace from the point where the parallel shell is launched:

      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] startParallelShell/<@src/mongo/shell/servers_misc.js:182:13
      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] @jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js:157:1
      [js_test:txn_two_phase_commit_coordinator_shutdown_and_restart] @jstests/sharding/txn_two_phase_commit_coordinator_shutdown_and_restart.js:18:2
      

      STM has set the checkExitSuccess flag to false on the cleanup function of the parallel shell to prevent the error from causing these tests to go red and to preserve existing semantics. We'd like someone to investigate whether the parallel shell failure is expected (in which case checkExitSuccess should remain false) or unexpected (in which case the test needs to be modified).
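
To make the two behaviors concrete, here is a minimal sketch of how a cleanup function with a checkExitSuccess option could behave. It is not the actual code in src/mongo/shell/servers_misc.js: makeCleanup is a hypothetical stand-in, the exit code is passed in directly instead of being read from a child process, and returning the exit code from the cleanup function is an assumption of this model.

```javascript
// Hypothetical stand-in for the cleanup function returned by startParallelShell.
// Assumption: the exit code is supplied directly so the sketch runs in any
// JavaScript runtime, without a real parallel shell.
function makeCleanup(exitCode) {
    return function cleanup(options = {checkExitSuccess: true}) {
        // With the check enabled, a non-zero exit code fails the test.
        if (options.checkExitSuccess && exitCode !== 0) {
            throw new Error(`[0] != [${exitCode}] are not equal : ` +
                            "encountered an error in the parallel shell");
        }
        // Assumption of this model: the exit code is handed back to the caller.
        return exitCode;
    };
}

// A parallel shell that exited with code 252, as reported in this ticket.
const awaitShell = makeCleanup(252);

// Current workaround: the non-zero exit code is tolerated.
const tolerated = awaitShell({checkExitSuccess: false});

// Behavior once checkExitSuccess: false is removed: the same exit code throws.
let failure = null;
try {
    awaitShell();
} catch (e) {
    failure = e.message;
}

console.log(tolerated, failure);
```

In this model, removing the flag changes nothing about the parallel shell itself; it only decides whether its non-zero exit code is surfaced as a test failure.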

            Assignee: backlog-server-cluster-scalability [DO NOT USE] Backlog - Cluster Scalability
            Reporter: richard.samuels@mongodb.com Richard Samuels (Inactive)
            Votes: 0
            Watchers: 1