Details
- Bug
- Resolution: Won't Do
- Major - P3
- Serverless
Description
In testUnblockBlockedReadsAfterMigrationAborted (and possibly other test cases) there is a race condition that can lead to a deadlock, so the test does not really work as intended. The intent is to run some commands on the primary and secondaries during the migration, then unblock the migration and check the results.
We start by getting the primary and starting a migration, with a failpoint to stall it. We then start a resume thread that unblocks the migration once the tenant access blocker (on the primary only) sees a read count of 1. The test then runs a command on each node in sequence. As the first node is normally the primary, the following happens:
- The read is issued on the primary. This blocks the test thread.
- The read count for the tenant ID increases by 1 on the primary.
- The resume thread sees the increase on the primary and unblocks the migration.
- The migration completes.
- The read command returns in the main thread, which checks the return value.
- The read command is then run on each secondary, where it does not block.
If, due to a stepdown or another issue, the first node is not the primary, the test will hang because the read count on the primary never increases. This can be reproduced with the following modification:
const db = nodes[1].getDB(dbName);
runCommand(db, command, null, testCase.isTransaction);
// nodes.forEach(node => {
//     const db = node.getDB(dbName);
//     runCommand(db, command, null, testCase.isTransaction);
// });
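One possible way to make the test less dependent on node order is to explicitly put the primary first before iterating. The following is only a minimal sketch, not the fix adopted for this ticket: orderPrimaryFirst is a hypothetical helper, and it assumes each node can be identified by a host property. Plain objects stand in for shell connections so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of the original test): returns the nodes with
// the primary first, so the read that the resume thread waits for is issued
// before any read on a secondary.
// Assumption: nodes are compared by their `host` property.
function orderPrimaryFirst(nodes, primary) {
    const isPrimary = node => node.host === primary.host;
    return [...nodes.filter(isPrimary), ...nodes.filter(node => !isPrimary(node))];
}

// Plain objects stand in for connections here.
const nodes = [{host: "a:1"}, {host: "b:2"}, {host: "c:3"}];
const ordered = orderPrimaryFirst(nodes, {host: "b:2"});
console.log(ordered.map(node => node.host).join(", "));  // prints "b:2, a:1, c:3"
```

In the test, the loop would then iterate over the reordered list instead of nodes. This only narrows the window: a stepdown that happens after the primary has been looked up could still leave the test waiting on a node that is no longer the primary.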