[SERVER-73129] Tenant migration hook retryability fails to select new servers after election Created: 20/Jan/23  Updated: 29/Oct/23  Resolved: 07/Apr/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Matt Broadstone Assignee: Matt Broadstone
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-73445 Use driver retryability in shard merg... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Server Serverless 2023-01-23, Server Serverless 2023-02-06, Server Serverless 2023-02-20, Server Serverless 2023-03-06, Server Serverless 2023-03-20, Server Serverless 2023-04-03, Server Serverless 2023-04-17
Participants:
Linked BF Score: 5

 Description   

The tenant migration hook and fixture implement retryability outside of the driver, and that implementation has been subtly broken since the PyMongo 4 upgrade.

def _wait_for_reroute_or_test_completion(self, migration_opts):
    start_time = time.time()
    donor_primary = migration_opts.get_donor_primary()
 
    while not self.__lifecycle.is_test_finished():
        try:
            donor_primary_client = self._create_client(donor_primary)
            ...
        except (pymongo.errors.AutoReconnect, pymongo.errors.NotPrimaryError):
            donor_primary = migration_opts.get_donor_primary()
            continue
        except pymongo.errors.PyMongoError:
            raise
        time.sleep(self.POLL_INTERVAL_SECS)

The code above selects the MongoDFixture of the current primary when the operation starts, creates a client to it, and then attempts to run find commands against it. Since SERVER-61794, a connection to that fixture uses a connection string containing directConnection=true, which disables the driver's automatic topology discovery, so the client stays pinned to the originally selected node even after an election. If that node fails with a non-retryable error, we rethrow the error instead of re-selecting the current primary.
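The control flow described above can be reproduced without a server. The following is a minimal sketch, not the hook's actual code: the exception classes are stand-ins for their pymongo.errors namesakes, and poll_once_with_custom_retry, run_find, and get_primary are hypothetical names. Which concrete error a stale direct connection raises is an assumption made for illustration.

```python
class PyMongoError(Exception):
    """Stand-in for pymongo.errors.PyMongoError."""


class AutoReconnect(PyMongoError):
    """Stand-in for pymongo.errors.AutoReconnect (handled as retryable)."""


class OperationFailure(PyMongoError):
    """Stand-in for pymongo.errors.OperationFailure (handled as non-retryable)."""


def poll_once_with_custom_retry(run_find, get_primary, max_attempts=3):
    """Mimic the hook's loop: re-select the primary only on AutoReconnect."""
    node = get_primary()
    for _ in range(max_attempts):
        try:
            return run_find(node)
        except AutoReconnect:
            node = get_primary()  # retryable: pick the new primary and retry
        except PyMongoError:
            raise  # anything else escapes without re-selecting the primary
    raise RuntimeError("gave up after max_attempts")


# Hypothetical scenario: node "a" was primary when the operation started,
# then node "b" won an election.
primaries = iter(["a", "b", "b"])


def get_primary():
    return next(primaries)


def run_find(node):
    # Assumption: the pinned connection to the old primary fails with an
    # error that the loop does not classify as retryable.
    if node == "a":
        raise OperationFailure("stale direct connection to old primary")
    return f"ok from {node}"
```

Calling poll_once_with_custom_retry(run_find, get_primary) in this scenario raises OperationFailure even though node "b" is available, which mirrors the failure mode in the description.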

We should remove the custom retryability implementation and instead depend on the driver's implementation (which we were already depending on by accident).
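One way to lean on the driver is to connect with a replica set connection string rather than a direct one, so that PyMongo discovers the topology and selects the new primary itself. The sketch below is an illustration under assumptions, not the change made in the linked commit: replica_set_uri, direct_uri, and make_client are hypothetical helpers, and the host names and set name in the usage note are made up. The URI options themselves (replicaSet, retryReads, retryWrites, directConnection) are standard MongoDB connection string options.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, **options):
    """Build a URI that lets the driver discover members and retry operations."""
    params = {
        "replicaSet": replica_set,
        # Spelled out explicitly so the reliance on driver retries is visible.
        "retryReads": "true",
        "retryWrites": "true",
    }
    params.update(options)
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(params)


def direct_uri(host):
    """Build a URI pinned to a single node; topology discovery is disabled."""
    return f"mongodb://{host}/?directConnection=true"


def make_client(uri):
    """Create a client for the URI (requires pymongo to be installed)."""
    import pymongo  # imported lazily so the URI helpers work without it

    return pymongo.MongoClient(uri)
```

For example, replica_set_uri(["h1:20000", "h2:20001"], "rs0") yields mongodb://h1:20000,h2:20001/?replicaSet=rs0&retryReads=true&retryWrites=true, whereas direct_uri("h1:20000") yields a string that keeps the client on h1 regardless of elections.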



 Comments   
Comment by Githook User [ 06/Apr/23 ]

Author: Matt Broadstone <mbroadst@mongodb.com> (mbroadst)

Message: SERVER-73129 Use driver retryability in tenant migration hook
Branch: master
https://github.com/mongodb/mongo/commit/579a14858a79058e9ed6adeac98274b3423a4d4f
