[SERVER-60586] out_max_time_ms.js does not correctly enable "maxTimeNeverTimeOut" failpoint leading to spurious test failure Created: 08/Oct/21  Updated: 29/Oct/23  Resolved: 14/Oct/21

Status: Closed
Project: Core Server
Component/s: Query Execution
Affects Version/s: None
Fix Version/s: 5.2.0, 4.4.11, 5.0.4, 5.1.0-rc2

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: David Storch
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
is related to SERVER-45969 Tests for killOp and maxTimeMS with $... Closed
is related to SERVER-58855 Improve/Fix the Race Condition in out... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v5.1, v5.0, v4.4
Sprint: QE 2021-10-18
Participants:
Linked BF Score: 37

 Description   

The test out_max_time_ms.js runs its test logic against both a standalone node and a two-node replica set. The test also depends on enabling the "maxTimeNeverTimeOut" failpoint on the nodes involved in the test. This is done in order to prevent operations with a maxTimeMS from timing out prematurely.

In the case of testing against a replica set, however, there are scenarios in which the test can fail to enable "maxTimeNeverTimeOut" on all of the nodes in the replica set. In particular, this assertion passes the same connection twice to forceAggregationToHangAndCheckMaxTimeMsExpires(). The helper function then enables the failpoint only on the passed-in connections. If both of these connections are to the secondary node, for example, then the failpoint is left disabled on the primary. This can result in operations timing out prematurely, which in turn can cause the test to hang for 10 minutes and then fail on an assert.soon() here.

We should change the test logic so that it unconditionally enables/disables the failpoint for both nodes in the replica set, making it impossible to leave any failpoint in the wrong state.
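
A minimal sketch of what "unconditionally enable/disable on every node" could look like is below. It is not the committed fix. The helper names setMaxTimeNeverTimeOut and withMaxTimeNeverTimeOut are hypothetical, and the only server interface assumed is the configureFailPoint admin command issued through getDB("admin").runCommand() on each connection.

```javascript
// Hypothetical helper (not from the actual test): set the failpoint to the
// same mode on every node, regardless of which connections a given test case
// happens to use for the aggregation itself.
function setMaxTimeNeverTimeOut(nodes, mode) {
    for (const node of nodes) {
        const res = node.getDB("admin").runCommand(
            {configureFailPoint: "maxTimeNeverTimeOut", mode: mode});
        if (!res || res.ok !== 1) {
            throw new Error("configureFailPoint failed: " + JSON.stringify(res));
        }
    }
}

// Hypothetical wrapper: enable the failpoint on all nodes, run the test body,
// and always turn the failpoint off again, even if the body throws.
function withMaxTimeNeverTimeOut(nodes, testBody) {
    setMaxTimeNeverTimeOut(nodes, "alwaysOn");
    try {
        return testBody();
    } finally {
        setMaxTimeNeverTimeOut(nodes, "off");
    }
}
```

In the replica set variant of the test, nodes would presumably be the full list of replica set members (for example rst.nodes on a ReplSetTest), and in the standalone variant a one-element array holding the single connection. Because the list of nodes no longer depends on the connections passed to the helper, passing the same connection twice cannot leave the failpoint disabled on the other node.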



 Comments   
Comment by Githook User [ 19/Oct/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-60586 Fix out_max_time_ms.js to correctly enable 'maxTimeNeverTimeOut' failpoint

(cherry picked from commit fb5fc15108425209fe8b5fb4a33e45e7980214b3)
(cherry picked from commit 96ca0c19edd8673b7851192c33edc1400a23844e)
(cherry picked from commit 48d228aa83ce8c5f36725c5871337d9bc455bc69)
Branch: v4.4
https://github.com/mongodb/mongo/commit/0aa523559cd68b32f025a8127272ddfbf8a1a30f

Comment by Githook User [ 19/Oct/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-60586 Fix out_max_time_ms.js to correctly enable 'maxTimeNeverTimeOut' failpoint

(cherry picked from commit fb5fc15108425209fe8b5fb4a33e45e7980214b3)
(cherry picked from commit 96ca0c19edd8673b7851192c33edc1400a23844e)
Branch: v5.0
https://github.com/mongodb/mongo/commit/48d228aa83ce8c5f36725c5871337d9bc455bc69

Comment by Githook User [ 19/Oct/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-60586 Fix out_max_time_ms.js to correctly enable 'maxTimeNeverTimeOut' failpoint

(cherry picked from commit fb5fc15108425209fe8b5fb4a33e45e7980214b3)
Branch: v5.1
https://github.com/mongodb/mongo/commit/96ca0c19edd8673b7851192c33edc1400a23844e

Comment by Githook User [ 13/Oct/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-60586 Fix out_max_time_ms.js to correctly enable 'maxTimeNeverTimeOut' failpoint
Branch: master
https://github.com/mongodb/mongo/commit/fb5fc15108425209fe8b5fb4a33e45e7980214b3

Comment by David Storch [ 08/Oct/21 ]

It looks to me like this problem was first introduced by the changes to the test done under SERVER-45969. Therefore, the earliest branch that the problem exists in is 4.4. We should backport the fix to 5.1, 5.0, and 4.4.
