[SERVER-39498] ShardRegistry reload inside onReplicationRollback can get stuck Created: 11/Feb/19  Updated: 29/Oct/23  Resolved: 08/May/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.1.7
Fix Version/s: 4.1.11, 4.0.19

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Matthew Saltz (Inactive)
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-37929 ShardRegistry in config servers can k... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Sprint: Sharding 2019-04-22, Sharding 2019-05-06, Sharding 2019-05-20
Participants:
Linked BF Score: 23

 Description   

Repro scenario:
1. A rollback occurs.
2. The periodic shard registry reload thread starts a reload. The reload reads with majority readConcern at the latest configOpTime. Because a rollback just occurred, configOpTime > lastAppliedOpTime, so the reload blocks.
3. Rollback finishes fixing the oplog and record store, then calls OpObserverImpl::onReplicationRollback.
4. The rollback thread tries to reload the ShardRegistry, but because the periodic reload thread is already in the middle of a reload, it simply waits for that reload to finish. This creates a cyclic dependency, since the opTime won't advance until the rollback thread finishes.
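
The cycle in the steps above can be modelled with a small toy. This is only an illustrative sketch in Python, not the server's C++ code: ToyRegistry, run_rollback, and every method name are hypothetical, and the timeout exists only so that the demo terminates where the real blocking wait would hang. It contrasts a blocking reload from the rollback hook with a lazy one that merely marks the registry as needing a reload, which is the approach named in the fix's commit message.

```python
import threading


class ToyRegistry:
    """Hypothetical, heavily simplified stand-in for the ShardRegistry."""

    def __init__(self):
        # Set once rollback has fully finished and the opTime can advance.
        self.optime_caught_up = threading.Event()
        # Set when the in-flight periodic reload completes.
        self.reload_finished = threading.Event()
        # Flag a later reload pass would consume (lazy variant only).
        self.needs_reload = False

    def periodic_reload(self):
        # Models the majority read at configOpTime: it cannot return
        # until lastAppliedOpTime has caught up.
        self.optime_caught_up.wait()
        self.reload_finished.set()

    def blocking_reload_from_rollback(self, timeout):
        # Joins the in-flight reload. Returns False if it did not finish
        # within the timeout (the real code had no timeout and would hang).
        return self.reload_finished.wait(timeout)

    def lazy_reload_from_rollback(self):
        # Only records that a reload is needed; never waits.
        self.needs_reload = True


def run_rollback(lazy, timeout=0.2):
    """Return True if the rollback hook completed without stalling."""
    reg = ToyRegistry()
    reloader = threading.Thread(target=reg.periodic_reload, daemon=True)
    reloader.start()  # step 2: periodic reload is now blocked

    if lazy:
        reg.lazy_reload_from_rollback()
        completed = True
    else:
        # step 4: rollback waits on a reload that is waiting on rollback
        completed = reg.blocking_reload_from_rollback(timeout)

    # Only after the rollback hook returns can the opTime advance.
    reg.optime_caught_up.set()
    reloader.join(timeout=1.0)
    return completed
```

Calling run_rollback(lazy=False) models the stuck case: the hook gives up after the timeout because the reload it waits on cannot finish until the hook itself returns. Calling run_rollback(lazy=True) returns immediately, and the periodic reload completes once the opTime advances.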



 Comments   
Comment by Githook User [ 04/May/20 ]

Author:

{'name': 'Matthew Saltz', 'email': 'matthew.saltz@mongodb.com', 'username': 'saltzm'}

Message: SERVER-39498 Make rollback trigger a lazy (rather than blocking) reload of the ShardRegistry

(cherry picked from commit d3ee35d6e3ac7a42cd5ad106c3ecb9fb554900c7)
Branch: v4.0
https://github.com/mongodb/mongo/commit/cc923b4e66e0c76714f877b33ce9fd5f28e17f62

Comment by Jack Mulrow [ 27/Apr/20 ]

Requesting 4.0 backport because SERVER-37929 was backported to that branch, which caused the BF this ticket fixed.

Comment by Githook User [ 08/May/19 ]

Author:

{'name': 'Matthew Saltz', 'username': 'saltzm', 'email': 'matthew.saltz@mongodb.com'}

Message: SERVER-39498 Make rollback trigger a lazy (rather than blocking) reload of the ShardRegistry
Branch: master
https://github.com/mongodb/mongo/commit/d3ee35d6e3ac7a42cd5ad106c3ecb9fb554900c7

Generated at Thu Feb 08 04:52:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.