[SERVER-48264] ShardServerCatalogCacheLoader doesn't handle threadpool shutdown Created: 18/May/20  Updated: 29/Oct/23  Resolved: 03/Aug/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Bug Priority: Major - P3
Reporter: Kevin Pulo Assignee: Randolph Tan
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-47972 maxTimeMS set on hedged requests does... Closed
is related to SERVER-39965 Make OutOfLineExecutor return a Statu... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2020-08-10
Participants:
Linked BF Score: 20

 Description   

When scheduling work on the _threadPool, the ShardServerCatalogCacheLoader lambdas immediately invariant that the status is OK. This is incorrect, because it means they can't handle the situation where the thread pool is shutting down, which results in the lambda being called in-line immediately with a status of ShutdownInProgress.

These lambdas should instead handle this situation properly, e.g. by changing the invariant to a uassert and moving it inside the try block, or by following how other ThreadPool users do it.
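A minimal sketch of the suggested pattern, not the actual server code. Everything here is a hypothetical stand-in: Status, ErrorCode, ToyThreadPool and scheduleRefresh are invented for illustration, and a thrown std::runtime_error plays the role of a failed uassert. The toy pool runs tasks synchronously so the example stays self-contained; the only behaviour it copies from the description is that a task scheduled after shutdown is called in-line with ShutdownInProgress.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for the server's Status and error codes.
enum class ErrorCode { OK, ShutdownInProgress };

struct Status {
    ErrorCode code = ErrorCode::OK;
    std::string reason;
    bool isOK() const { return code == ErrorCode::OK; }
};

// Toy pool (assumption): once shut down, schedule() calls the task in-line
// with a ShutdownInProgress status instead of running it normally.
class ToyThreadPool {
public:
    using Task = std::function<void(Status)>;

    void shutdown() { _shutdown = true; }

    void schedule(Task task) {
        if (_shutdown) {
            task(Status{ErrorCode::ShutdownInProgress, "pool is shutting down"});
            return;
        }
        // A real pool would hand the task to a worker thread; the toy pool
        // runs it synchronously so the result can be inspected right away.
        task(Status{});
    }

private:
    bool _shutdown = false;
};

// Schedules a pretend refresh and reports what happened to it.
// Capturing 'outcome' by reference is only safe because the toy pool is synchronous.
std::string scheduleRefresh(ToyThreadPool& pool) {
    std::string outcome = "not run";
    pool.schedule([&outcome](Status status) {
        try {
            // The status check lives inside the try block and throws
            // (uassert-style) rather than aborting the process (invariant-style).
            if (!status.isOK()) {
                throw std::runtime_error(status.reason);
            }
            outcome = "refreshed";
        } catch (const std::exception& ex) {
            // The failure goes through the same error path as any other refresh error.
            outcome = std::string("failed: ") + ex.what();
        }
    });
    return outcome;
}
```

With this shape, a task scheduled after shutdown reports an error through the normal failure path instead of terminating the process. Whether the real error path is safe to enter from an in-line call is a separate question that the sketch does not model.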



 Comments   
Comment by Githook User [ 01/Aug/20 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-48264 Don't invariant when task is cancelled by the executor
Branch: master
https://github.com/mongodb/mongo/commit/b9353930bcb5387857620f1d45fb87b79f4a0064

Comment by Benjamin Caimano (Inactive) [ 01/Jun/20 ]

kaloian.manassiev, I think I'm with kevin.pulo here. Why can't we quick return before the try block here if the error code is a ShutdownError?

Comment by Kaloian Manassiev [ 25/May/20 ]

kevin.pulo, if swCollAndChunks is not-OK, then the refreshCollectionRoutingInfo call will throw, which will result in the mutex acquisition. This will be circular with this mutex acquisition.

So the only proper way to fix this would be to convert the CatalogCacheLoader to use Futures, but if we do that, we should just wait for SERVER-46199.

Comment by Esha Maharishi (Inactive) [ 19/May/20 ]

Note that we encountered the same issue of scheduling onto the ShardServerCatalogCacheLoader after it was shut down under SERVER-47482.

Generated at Thu Feb 08 05:16:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.