[SERVER-67078] Advancing just the minor version on the primary of a shard should not stall the secondaries Created: 07/Jun/22  Updated: 16/Nov/23  Resolved: 16/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.6.23, 4.0.28, 4.4.13, 4.2.20, 5.0.9, 6.0.0-rc8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Backlog - Catalog and Routing
Resolution: Won't Do Votes: 0
Labels: oldshardingemea
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-62698 Do not clear filtering metadata on se... Closed
Assigned Teams:
Catalog and Routing
Operating System: ALL
Participants:

 Description   

The following sequence is possible:

  1. The primary of a shard performs a split or commits a chunk migration. (For the purposes of this ticket we will only consider splits, because they are more problematic: they are more frequent and they only bump the minor version.)
  2. As part of the split, we bump only the minor version of the collection, but we still advance the filtering metadata (shardVersion) on the primary.
  3. This causes the newly split chunks to be written to the config.cache.chunks.XXX collection, but since this write is not atomic, we first write a refreshing:true entry and then clear it once all the changes have been written (see the sketch after this list).
  4. Upon seeing the first write from the previous step, the secondary throws out its filtering metadata (shardVersion), which means that any read arriving at that secondary will stall until the primary completes at least one refresh from the CSRS and clears the refreshing flag.
  5. The secondary remains stalled until the primary completes one round of refresh from the CSRS.
  6. By the time the secondary loops around to read the new metadata, however, the primary might have committed another split, which will have generated yet another refreshing:true entry.
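
The sketch below is one minimal way to observe steps 3 and 4 from outside the server. It assumes the shard persists its cached routing table in config.cache.collections and config.cache.chunks.<ns> with a boolean refreshing field; the host, port, and the test.coll namespace are hypothetical, and the chunk cache collection name varies across the affected versions (namespace-based in older releases, UUID-based in newer ones).

    # Minimal observation sketch (assumptions noted above): connect directly to a
    # shard node and inspect its persisted routing metadata for one namespace.
    from pymongo import MongoClient

    NS = "test.coll"  # hypothetical sharded namespace
    shard_node = MongoClient("mongodb://shard0-secondary.example.net:27018",
                             directConnection=True)

    cache_entry = shard_node["config"]["cache.collections"].find_one({"_id": NS})
    if cache_entry is None:
        print("no persisted routing metadata for", NS)
    elif cache_entry.get("refreshing"):
        # Step 4: while this marker is set, the node has discarded its filtering
        # metadata and versioned reads against it will wait.
        print("refresh in progress; versioned reads for", NS, "will stall")
    else:
        # In the older, namespace-based layout the cached chunks live in
        # config.cache.chunks.<db>.<coll>; newer releases key this by UUID.
        chunks = shard_node["config"]["cache.chunks." + NS].count_documents({})
        print("routing table complete:", chunks, "cached chunks for", NS)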

This loop has a potential liveness problem: if too many splits (or merges and moves) happen on the primary back to back, the secondary's refresh might never complete. Moves happen much less frequently, so this has normally not been a problem for them, but for splits it definitely is.
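
As a rough illustration of that back-to-back scenario (not a test attached to this ticket), the sketch below drives manual splits through mongos while issuing time-bounded secondary reads; the hosts, the test.coll namespace, the integer shard key x, and the 2-second bound are all assumptions.

    from pymongo import MongoClient, ReadPreference
    from pymongo.errors import ExecutionTimeout

    mongos = MongoClient("mongodb://mongos.example.net:27017")
    # Versioned reads have to go through mongos; a secondary read preference
    # routes them to the shard secondaries, which is where the stall shows up.
    coll = mongos.get_database(
        "test", read_preference=ReadPreference.SECONDARY)["coll"]

    for split_point in range(100, 1100, 100):
        # Each split bumps only the minor version of test.coll, yet (per the
        # description) forces the secondary to wait for the primary to complete
        # a refresh from the CSRS before it can serve versioned reads again.
        mongos.admin.command("split", "test.coll", middle={"x": split_point})
        try:
            coll.find_one({"x": split_point}, max_time_ms=2000)
        except ExecutionTimeout:
            # The read did not finish within 2 seconds, i.e. the secondary is
            # still parked behind a refreshing:true marker.
            print("secondary read stalled after split at x =", split_point)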



 Comments   
Comment by Matt Panton [ 16/Nov/23 ]

Executing the ticket as currently described is not aligned with the long-term vision of the server, and there are a couple of mitigations that should lessen the likelihood of encountering this issue until a long-term solution is found:

  • The removal of the autosplitter in MongoDB 6.0.3

 

 

Comment by Matt Panton [ 16/Nov/23 ]

dinesh.chander@mongodb.com - is the customer still on 4.2? If they've upgraded to 4.4.25, that release includes SERVER-71627, which makes refreshes orders of magnitude faster for large sharded collections with a large number of chunks.
