[SERVER-60609] Sharding collection with UnoptimizedSplitPolicy must not result in long writes unavailability
Created: 11/Oct/21  Updated: 13/Oct/21  Resolved: 13/Oct/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.3, 5.1.0-rc0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Pierlauro Sciarelli
Assignee: Marcos José Grillo Ramirez
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
duplicates SERVER-60420 The slow 'shardCollection' path perfo... Closed
Problem/Incident
Sprint: Sharding EMEA 2021-11-15
Participants:

 Description   

Starting from v5.0, sharding a non-empty collection blocks writes for a time proportional to the number of documents in the collection. As a result, a collection containing many documents can only be read, not written, for a very long time while it is being sharded.

 

The relevant difference between v4.4 and v5.0 is the context in which selectChunkSplitPoints is called in UnoptimizedSplitPolicy::createFirstChunks: in v4.4 the call happens outside the critical section, while in v5.0 it happens after the critical section has been acquired. The splitVector command it invokes is quite slow because it scans the whole index to find the initial split points, so in v5.0 writes stay blocked for the full duration of that scan.
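
To make the ordering difference concrete, here is a minimal toy model in Python. It is not the server code: CriticalSection, commit_chunks, shard_scan_outside and shard_scan_inside are hypothetical names invented for this sketch, and select_chunk_split_points is only a stand-in for the real scan. It counts "units of work" performed while writes would be blocked, assuming one unit per scanned index key and one unit per chunk written.

```python
from dataclasses import dataclass


@dataclass
class CriticalSection:
    """Toy model: counts units of work done while writes are blocked."""
    held: bool = False
    blocked_work: int = 0

    def __enter__(self):
        self.held = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self.held = False
        return False


def select_chunk_split_points(index_keys, docs_per_chunk, critical_section):
    """Stand-in for the splitVector scan: visits every index key once."""
    split_points = []
    for position, key in enumerate(index_keys):
        if critical_section.held:
            critical_section.blocked_work += 1  # writes wait on this key
        if position > 0 and position % docs_per_chunk == 0:
            split_points.append(key)
    return split_points


def commit_chunks(split_points, critical_section):
    """Stand-in for persisting the initial chunks; requires the section."""
    assert critical_section.held
    chunk_count = len(split_points) + 1
    critical_section.blocked_work += chunk_count  # one unit per chunk
    return chunk_count


def shard_scan_outside(index_keys, docs_per_chunk):
    """Ordering described for v4.4: scan first, then take the section."""
    cs = CriticalSection()
    split_points = select_chunk_split_points(index_keys, docs_per_chunk, cs)
    with cs:
        commit_chunks(split_points, cs)
    return cs.blocked_work


def shard_scan_inside(index_keys, docs_per_chunk):
    """Ordering described for v5.0: take the section, then scan."""
    cs = CriticalSection()
    with cs:
        split_points = select_chunk_split_points(index_keys, docs_per_chunk, cs)
        commit_chunks(split_points, cs)
    return cs.blocked_work


if __name__ == "__main__":
    keys = list(range(1000))
    print(shard_scan_outside(keys, 250))  # blocked work: chunks only
    print(shard_scan_inside(keys, 250))   # blocked work: full scan + chunks
```

With 1,000 keys and 250 documents per chunk, the first ordering blocks writes for 4 units of work and the second for 1,004. In this model the blocked work of the second ordering grows linearly with the number of keys, which matches the proportionality reported above.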



 Comments   
Comment by Kaloian Manassiev [ 12/Oct/21 ]

pierlauro.sciarelli, the way I read it, this ticket is complementary to SERVER-60420 and we need to address both, right? For the unoptimised path, we need to move both the index creation and the initial chunks creation out of the critical section.

Assigning it to marcos.grillo, because he worked on the createCollection path and because both issues probably have to be considered together when fixing them.

Generated at Thu Feb 08 05:50:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.