[SERVER-62217] inject_tenant_prefix.js override may not wait for newly elected primary after tenant migration commits
Created: 22/Dec/21 | Updated: 29/Oct/23 | Resolved: 21/Jan/22
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 5.3.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Hugh Tong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v5.2, v5.0 |
| Sprint: | Server Serverless 2022-01-10, Server Serverless 2022-01-24 |
| Linked BF Score: | 15 |
| Description |
|
The inject_tenant_prefix.js override replaces Mongo.prototype.runCommand() so that commands are sent over the reroutingMongo replica set connection. However, the file does not override Mongo.prototype._markNodeAsFailed(), so when a command sent over the reroutingMongo connection fails with a retryable error, _markNodeAsFailed() is still called on the original replica set connection. The replica set monitor (RSM) for the original connection ignores the error and does not attempt to discover a new primary, because the failed host is not a member of that replica set. As a result, the JavaScript test keeps retrying unsuccessfully against the stale primary of the rerouting replica set connection and ultimately fails.
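The forwarding idea described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: FakeMongo is a hypothetical stand-in for the shell's connection type, the (hostName, errorCode, errorReason) parameter list is assumed rather than taken from the shell source, and the host names and error values are made up. It is not the change that was committed for this ticket.

```javascript
// Hypothetical stand-in for the shell's Mongo connection type (assumption:
// the real class and its replica set monitor are far more involved).
class FakeMongo {
    constructor(members) {
        this.members = new Set(members);  // hosts that belong to this replica set
        this.failedHosts = [];            // hosts this connection's monitor accepted as failed
        this.reroutingMongo = null;       // set once commands are rerouted elsewhere
    }

    // Mimics the behavior in the description: a monitor ignores failure
    // reports for hosts that are not members of its own replica set.
    _markNodeAsFailed(hostName, errorCode, errorReason) {
        if (this.members.has(hostName)) {
            this.failedHosts.push(hostName);
        }
    }
}

// Keep a reference to the original method, then override it so that failure
// reports go to the connection the command was actually sent over.
const originalMarkNodeAsFailed = FakeMongo.prototype._markNodeAsFailed;
FakeMongo.prototype._markNodeAsFailed = function(hostName, errorCode, errorReason) {
    const target = this.reroutingMongo || this;
    return originalMarkNodeAsFailed.call(target, hostName, errorCode, errorReason);
};

// Illustrative host names and error values only.
const original = new FakeMongo(["rs0:n0", "rs0:n1"]);
const rerouting = new FakeMongo(["rs1:n0", "rs1:n2"]);
original.reroutingMongo = rerouting;

// The retry logic still calls the method on the original connection.
original._markNodeAsFailed("rs1:n2", 189, "primary stepped down");
```

With the override in place, the failure report reaches the monitor that actually tracks rs1:n2; without it, the original connection's monitor would silently drop the report.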
|
| Comments |
| Comment by Githook User [ 03/Feb/22 ] |
|
Author: {'name': 'Hugh Tong', 'email': 'hugh.tong@mongodb.com', 'username': 'cortrain'} Message: |
| Comment by Hugh Tong (Inactive) [ 18/Jan/22 ] |
|
Completed on 5.3.0 |
| Comment by Githook User [ 14/Jan/22 ] |
|
Author: {'name': 'Hugh Tong', 'email': 'hugh.tong@mongodb.com', 'username': 'cortrain'} Message: |
| Comment by Hugh Tong (Inactive) [ 13/Jan/22 ] |
|
Created an override in inject_tenant_prefix.js and ran Evergreen tests. One run of the tenant_migration_terminate_primary_jscore_passthrough test suite shows the following sequence of events:
In this test, an insert ("txnNumber":784) is applied on j0:rs1:n2 and then fails during a primary stepdown on rs1. If the _markNodeAsFailed() call succeeds, j0:rs1:n2 should be marked as failed and the operation retried on the new primary in the next attempt with the same txnNumber. After the failure, the same operation ("txnNumber":784) is applied on j0:rs1:n0, followed by the Slow Query response with "ninserted":1, which indicates a successful insert.
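The retry sequence described above can be modeled with a small simulation. Everything here is hypothetical: FakeReplSetConn, runWithRetry, the response shape, and the retry limit are invented for illustration, and only the host names and txnNumber are borrowed from the comment. The sketch shows one property only: once the failed host is marked, the next attempt carries the same txnNumber to the new primary.

```javascript
// Hypothetical stand-in for a replica set connection (not the shell's Mongo class).
class FakeReplSetConn {
    constructor(hosts) {
        this.hosts = hosts;            // candidate hosts, in discovery order
        this.primaryIndex = 0;         // index of the host currently believed to be primary
        this.steppedDown = new Set();  // hosts that have stepped down
        this.attempts = [];            // log of {host, txnNumber} for every send
    }

    get primary() {
        return this.hosts[this.primaryIndex];
    }

    // Sends the command to the believed primary and reports a retryable
    // failure if that host has stepped down.
    send(cmd) {
        const host = this.primary;
        this.attempts.push({host: host, txnNumber: cmd.txnNumber});
        if (this.steppedDown.has(host)) {
            return {ok: 0, retryable: true, host: host};
        }
        return {ok: 1, n: 1, host: host};
    }

    // Marking the believed primary as failed moves on to the next host,
    // standing in for primary rediscovery.
    _markNodeAsFailed(hostName) {
        if (this.primary === hostName) {
            this.primaryIndex = (this.primaryIndex + 1) % this.hosts.length;
        }
    }
}

// Retries the same command object, so the txnNumber never changes between attempts.
function runWithRetry(conn, cmd, maxAttempts) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const res = conn.send(cmd);
        if (res.ok === 1) {
            return res;
        }
        if (!res.retryable) {
            throw new Error("non-retryable failure on " + res.host);
        }
        conn._markNodeAsFailed(res.host);
    }
    throw new Error("retries exhausted");
}

const conn = new FakeReplSetConn(["j0:rs1:n2", "j0:rs1:n0"]);
conn.steppedDown.add("j0:rs1:n2");  // simulate the primary stepdown on rs1
const result = runWithRetry(conn, {insert: "coll", txnNumber: 784}, 3);
```

If _markNodeAsFailed() were a no-op here, as it effectively was when called on the wrong connection, every attempt would hit j0:rs1:n2 and the loop would end with "retries exhausted".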
|