[SERVER-58413] Create/Find a workload that does CRUD ops while moving chunks around Created: 12/Jul/21  Updated: 29/Jul/21  Resolved: 29/Jul/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Sergi Mateo Bellido Assignee: Antonio Fuschetto
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-58414 Evaluate the performance of the new w... Closed
is depended on by SERVER-58415 Evaluate the performance of the new w... Closed
Sprint: Sharding 2021-07-12, Sharding EMEA 2021-07-26, Sharding EMEA 2021-08-09
Participants:

 Description   

This workload should try to take advantage of the Fine-grained Collection Critical Sections that would be implemented as part of PM-2098. This means that the CRUD operations should target a set of chunks that doesn't include chunks being migrated.
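
A minimal sketch of that targeting rule, in plain JavaScript with no server dependency. The chunk model (half-open `{min, max}` ranges over a numeric shard key) and the helper names `chunksSafeForCrud` and `pickKey` are assumptions made for illustration; they are not taken from PM-2098 or from the workloads repo.

```javascript
// Hypothetical chunk model: half-open ranges [min, max) over a numeric shard key.
// Returns the chunks that are not part of any ongoing migration.
function chunksSafeForCrud(chunks, migratingIds) {
    const migrating = new Set(migratingIds);
    return chunks.filter((c) => !migrating.has(c.id));
}

// Picks a shard key value that falls inside one of the safe chunks.
// `rand` is injectable so the choice can be made deterministic if needed.
function pickKey(safeChunks, rand = Math.random) {
    if (safeChunks.length === 0) {
        throw new Error("no chunk available outside the migration");
    }
    const chunk = safeChunks[Math.floor(rand() * safeChunks.length)];
    return chunk.min + Math.floor(rand() * (chunk.max - chunk.min));
}

// Illustrative layout only: two large chunks plus one small chunk being moved.
const chunks = [
    {id: "A", min: 0, max: 25000},
    {id: "B", min: 25000, max: 50000},
    {id: "C", min: 50000, max: 50001},  // assumed to be the migrating chunk
];
const safe = chunksSafeForCrud(chunks, ["C"]);
const key = pickKey(safe);  // a key for the next CRUD operation
```

Keeping the CRUD keys outside the migrating range is what should let the workload show whether operations on other chunks still stall during the critical section.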

There are some chunk-related workloads on the workloads repo. This one looks promising.



 Comments   
Comment by Antonio Fuschetto [ 26/Jul/21 ]

The previously identified test was rewritten to run very quickly on personal workstations and without the heavy infrastructure required by the workload framework.

It has also been changed to:

  1. Set up 2 shards and 1 sharded collection split into 3 chunks: 2 chunks contain 25K documents and are spread over the 2 shards, while 1 chunk contains 1 document and is stored on the first shard
  2. Start "n" background threads that continuously execute insert, remove, update and findOne operations on the sharded collection
  3. Move the chunk containing 1 document back and forth between the first and second shards
  4. Record the throughput of CRUD and moveChunk operations, and the duration of the moveChunk's critical section at each step
  5. Repeat the tests with different thread counts (i.e., 0, 8, 16, 32, 64, 128, 256)
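
Steps 3 to 5 can be sketched in plain JavaScript as follows. This is not the code of move_chunk_with_load.js: the names `runMoves` and `throughput`, the injected `moveChunk` callback, and the assumption that the callback returns the critical section duration in milliseconds are all hypothetical. In a real test the callback would presumably wrap the `moveChunk` admin command.

```javascript
// Thread counts listed in step 5.
const THREAD_COUNTS = [0, 8, 16, 32, 64, 128, 256];

// Operations per second for `opCount` operations completed in `elapsedMs`.
function throughput(opCount, elapsedMs) {
    return elapsedMs > 0 ? (opCount * 1000) / elapsedMs : 0;
}

// Moves the small chunk `moves` times, alternating the destination shard.
// The chunk starts on shards[0], so the first move goes to shards[1].
// `moveChunk(to)` is an injected callback (assumption) that performs the move
// and returns the duration of its critical section in milliseconds.
function runMoves(shards, moves, moveChunk, now = Date.now) {
    const start = now();
    const criticalSectionMs = [];
    for (let i = 0; i < moves; i++) {
        const to = shards[(i + 1) % shards.length];
        criticalSectionMs.push(moveChunk(to));
    }
    return {
        moves,
        movesPerSec: throughput(moves, now() - start),
        criticalSectionMs,
    };
}

// Outer loop of step 5: one measurement per thread count.
// `startLoad(n)` is a hypothetical hook that starts n CRUD threads and
// returns a function that stops them.
function runAll(shards, movesPerStep, moveChunk, startLoad) {
    return THREAD_COUNTS.map((threads) => {
        const stopLoad = startLoad(threads);
        const result = runMoves(shards, movesPerStep, moveChunk);
        stopLoad();
        return {threads, ...result};
    });
}
```

Injecting the move and load callbacks keeps the measurement logic independent of any particular cluster setup.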

The implementation of the test is available at https://github.com/afuschetto/mongo-scripts/blob/main/performance/move_chunk_with_load.js (read permissions are required).

Comment by Antonio Fuschetto [ 14/Jul/21 ]

As initially assumed, move_chunk_with_load.js in the workloads repository is a very good candidate for testing the performance of the migration protocol under concurrent CRUD operations.

This test measures the throughput of multiple CRUD operations while a chunk continuously migrates from one shard to another, thereby highlighting the performance penalties introduced by the critical section of the migration protocol and measuring the overall impact on ordinary business operations.

Some details on the test implementation follow:

  1. Set up "n" shards and 1 sharded collection split into "n + 1" chunks, each containing 25K documents, and distribute 1 chunk per shard except the first one, which hosts 2 chunks
  2. Start "n" background threads that continuously execute insert, remove, update and findOne operations on the sharded collection
  3. Move the first chunk (originally hosted by the first shard) through the following shards
  4. Record the throughput of CRUD and moveChunk operations at different stages for a final report
  5. Repeat the tests with different thread counts
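
The initial layout from step 1 and the migration path from step 3 can be illustrated with a small JavaScript sketch. The function names are hypothetical, shards and chunks are identified by zero-based indexes for simplicity, and the path assumes a single pass over the remaining shards without wrapping back to the first one, which the description above does not confirm.

```javascript
// Builds the initial placement for `shardCount` shards: shardCount + 1 chunks
// of 25K documents each, with the first shard hosting chunks 0 and 1 and
// every other shard hosting exactly one chunk.
function initialLayout(shardCount) {
    const layout = [];
    for (let chunk = 0; chunk <= shardCount; chunk++) {
        layout.push({chunk, shard: Math.max(0, chunk - 1), docs: 25000});
    }
    return layout;
}

// Shards visited, in order, by the first chunk as it leaves shard 0.
// Assumption: one pass through shards 1..shardCount-1, no wrap-around.
function migrationPath(shardCount) {
    const path = [];
    for (let shard = 1; shard < shardCount; shard++) {
        path.push(shard);
    }
    return path;
}
```

Since the first shard starts with two chunks, moving one of them away still leaves every shard with at least one chunk.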

The logic seems more than sufficient for the purposes of this activity; however, we could refine the measurements and final reports by working on SERVER-58414.
