[SERVER-85857] Avoid closing the connections with ARS destruction by waiting for all responses Created: 29/Jan/24  Updated: 02/Feb/24  Resolved: 02/Feb/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 8.0.0-rc0

Type: Task Priority: Major - P3
Reporter: Abdul Qadeer Assignee: Abdul Qadeer
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Sprint: Cluster Scalability 2024-2-5
Participants:
Linked BF Score: 2

 Description   

SERVER-80838 made it so that any pending child writes of a parent write of type WriteWithoutShardKeyWithId (WWSKWID) are canceled (by breaking out of the ARS execution loop) with ErrorCodes::CallbackCanceled, as per the default behavior of mongo RPC. This causes the connections to those shards to be closed in the ConnectionPool, reducing the number of active connections. When a new request needs to connect to those shards, additional CPU time is spent establishing SSL/TLS connections, which hurts performance at higher write rates.

One approach is to cancel the requests out of band by sending a killOps and canceling any pending write operation at mongod, but even that takes a network RTT to complete. It would be better to wait for the pending shards' responses and discard them without processing. In most cases, where the shards are co-located in the same zone, we would still be faster. In the cases where the shards are in other zones and take longer to respond, we would perform no worse than we did earlier.
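The "wait and discard" idea can be illustrated with a minimal sketch. This is not the mongos implementation (which is C++ and built around the ARS); it is a Python asyncio analogy in which every name (`fake_shard_write`, `broadcast_write`, the reply shape with an `n` field) is hypothetical. It only shows the control flow: once one shard reports a match, the remaining replies are still awaited so that nothing is canceled, but their contents are never inspected.

```python
import asyncio


async def fake_shard_write(shard, delay, n_matched):
    # Hypothetical stand-in for a child write sent to one shard.
    await asyncio.sleep(delay)
    return {"shard": shard, "n": n_matched}


async def broadcast_write(targets):
    # targets: list of (shard_name, simulated_delay_seconds, n_matched).
    tasks = [asyncio.create_task(fake_shard_write(*t)) for t in targets]
    result = None
    discarded = 0
    for next_reply in asyncio.as_completed(tasks):
        reply = await next_reply
        if result is None:
            # Still looking for the shard that owns the document.
            if reply["n"] > 0:
                result = reply
        else:
            # A match was already found: wait for this reply so the request
            # is never canceled, but drop it without processing.
            discarded += 1
    return result, discarded


if __name__ == "__main__":
    targets = [("shardA", 0.01, 0), ("shardB", 0.02, 1), ("shardC", 0.05, 0)]
    print(asyncio.run(broadcast_write(targets)))
```

In this sketch the overall latency is bounded by the slowest shard, which mirrors the trade-off described above: no request is abandoned, so no connection has to be torn down and re-established.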



 Comments   
Comment by Githook User [ 02/Feb/24 ]

Author:

{'name': 'Abdul Qadeer', 'email': 'abdul.qadeer@mongodb.com', 'username': 'zorro786'}

Message: SERVER-85857 Wait for ARS responses without processing them for WriteWithoutShardKeyWithId (#18499)

GitOrigin-RevId: abb4172161ca73618e9489d70a04476e2519779f
Branch: master
https://github.com/mongodb/mongo/commit/217261c88f10bf6d1ec99f77e260d712ce378122
