[SERVER-35755] CollectionLock acquisition in shard_filtering_metadata_refresh.cpp can cause server to terminate on stepdown Created: 22/Jun/18  Updated: 29/Oct/23  Resolved: 27/Aug/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.0.0, 4.1.1
Fix Version/s: 4.0.3, 4.1.3

Type: Bug Priority: Major - P3
Reporter: Esha Maharishi (Inactive) Assignee: Esha Maharishi (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-35923 onShardVersionMismatch must catch exc... Closed
is duplicated by SERVER-35924 AutoGetCollection can throw an unhand... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0
Sprint: Sharding 2018-08-13, Sharding 2018-08-27, Sharding 2018-09-10
Participants:
Linked BF Score: 25

 Description   

The CollectionLock acquisition in onShardVersionMismatch() can throw (for example, due to an interrupt on stepdown). However, onShardVersionMismatch() is called from the catch block in service_entry_point_common.cpp, and there is no try/catch above this point, so if the acquisition throws, the exception will terminate the server.

We could put either an UninterruptibleLockGuard or a try/catch around the lock acquisition. A try/catch may be better, since we probably don't want to block stepdown for this.
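
To illustrate the try/catch option, here is a minimal sketch using stand-in types. FakeCollectionLock, InterruptedError, refreshFilteringMetadata() and handleCommand() are hypothetical names made up for this example and are not the real server code; the point is only that an exception thrown while already inside a catch handler needs its own handler, otherwise it escapes and ends in std::terminate.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an interruption error; NOT a real MongoDB type.
struct InterruptedError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical lock whose acquisition throws if the operation was
// interrupted (for example, by a stepdown).
struct FakeCollectionLock {
    explicit FakeCollectionLock(bool interrupted) {
        if (interrupted) {
            throw InterruptedError("interrupted on stepdown");
        }
    }
};

// Stand-in for the refresh routine: acquiring the lock may throw.
void refreshFilteringMetadata(bool interrupted) {
    FakeCollectionLock lock(interrupted);
    // ... the refresh would run here while the lock is held ...
}

// Stand-in for the error-handling path. The refresh runs inside a catch
// handler, so it is wrapped in its own try/catch. Without the inner
// handler, the exception would leave the outer handler and, with nothing
// above to catch it, end in std::terminate.
std::string handleCommand(bool interrupted) {
    std::string outcome = "ok";
    try {
        throw std::runtime_error("stale shard version");
    } catch (const std::runtime_error&) {
        try {
            refreshFilteringMetadata(interrupted);
            outcome = "refreshed";
        } catch (const InterruptedError&) {
            outcome = "refresh skipped";
        }
    }
    return outcome;
}
```

In this sketch an interrupted refresh is simply skipped, so the interrupt is honored right away instead of being delayed by the lock wait.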



 Comments   
Comment by Githook User [ 28/Aug/18 ]

Author: Esha Maharishi <esha.maharishi@mongodb.com> (username: EshaMaharishi)

Message: SERVER-35755 CollectionLock acquisition in shard_filtering_metadata_refresh.cpp can cause server to terminate on stepdown
Branch: v4.0
https://github.com/mongodb/mongo/commit/831a5f61131331cbc259efab66200d2872ad08ec

Comment by Esha Maharishi (Inactive) [ 27/Aug/18 ]

Author: Esha Maharishi <esha.maharishi@mongodb.com> (username: EshaMaharishi)

Message: SERVER-36755 CollectionLock acquisition in shard_filtering_metadata_refresh.cpp can cause server to terminate on stepdown (2/2)
Branch: master
https://github.com/mongodb/mongo/commit/e338bbd75f4697a9ec29f097436358282d75c5b3

Comment by Esha Maharishi (Inactive) [ 27/Aug/18 ]

Author: Esha Maharishi <esha.maharishi@mongodb.com> (username: EshaMaharishi)

Message: SERVER-36755 CollectionLock acquisition in shard_filtering_metadata_refresh.cpp can cause server to terminate on stepdown (1/2)
Branch: master
https://github.com/mongodb/mongo/commit/2aede7ad2fce2616e4140f2ae398e4e570c84703

Comment by Esha Maharishi (Inactive) [ 27/Aug/18 ]

Code review url (2/2): http://mongodbcr.appspot.com/233440001

Comment by Esha Maharishi (Inactive) [ 27/Aug/18 ]

Code review url (1/2): http://mongodbcr.appspot.com/237470001

Comment by Esha Maharishi (Inactive) [ 22/Jun/18 ]

After speaking offline with schwerin: we can just put an UninterruptibleLockGuard around the lock acquisition. Putting the guard around lock acquisitions in a cleanup path is a typical pattern.
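
The guard pattern can be sketched roughly as follows. This is a mock under stated assumptions, not the server's implementation: FakeOpCtx, FakeUninterruptibleGuard, acquireFakeLock() and refreshInCleanupPath() are hypothetical names, and the constructor arguments of the real UninterruptibleLockGuard are not shown here. The sketch only models the idea that, while the guard is alive, a lock acquisition ignores a pending interrupt and therefore does not throw.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for an operation context; NOT the real MongoDB class.
struct FakeOpCtx {
    bool interruptRequested = false;  // e.g. set when a stepdown begins
    int uninterruptibleDepth = 0;     // > 0 means lock waits ignore interrupts
};

// RAII guard: while an instance is alive, lock acquisitions on the given
// context ignore interrupts. The previous state is restored on destruction.
class FakeUninterruptibleGuard {
public:
    explicit FakeUninterruptibleGuard(FakeOpCtx& ctx) : _ctx(ctx) {
        ++_ctx.uninterruptibleDepth;
    }
    ~FakeUninterruptibleGuard() {
        --_ctx.uninterruptibleDepth;
    }
    FakeUninterruptibleGuard(const FakeUninterruptibleGuard&) = delete;
    FakeUninterruptibleGuard& operator=(const FakeUninterruptibleGuard&) = delete;

private:
    FakeOpCtx& _ctx;
};

// Simulated lock acquisition: throws on interrupt unless a guard is active.
void acquireFakeLock(const FakeOpCtx& ctx) {
    if (ctx.interruptRequested && ctx.uninterruptibleDepth == 0) {
        throw std::runtime_error("interrupted");
    }
}

// Cleanup-path routine: the guard keeps the acquisition from throwing.
bool refreshInCleanupPath(FakeOpCtx& ctx) {
    FakeUninterruptibleGuard guard(ctx);
    acquireFakeLock(ctx);
    // ... the refresh would run here while the lock is held ...
    return true;
}
```

Because the guard is scoped to the cleanup routine, interruptibility is restored as soon as the routine returns. The trade-off raised in the description still applies: while the guard is active, the lock wait does not yield to a stepdown.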
