[SERVER-60983] Evaluate the performance of the new way of filtering writes to orphaned documents Created: 26/Oct/21 Updated: 30/Dec/21 Resolved: 30/Dec/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Sergi Mateo Bellido | Assignee: | Antonio Fuschetto |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Sprint: | Sharding EMEA 2021-11-01, Sharding EMEA 2021-11-15, Sharding EMEA 2021-11-29, Sharding EMEA 2021-12-13, Sharding EMEA 2021-12-27, Sharding EMEA 2022-01-10 |
| Participants: |
| Description |
|
The goal of this task is to evaluate the performance impact of the new way of filtering writes to orphaned documents, introduced as part of PM-2423. The first step is to check whether we already have a benchmark measuring write throughput. Ultimately, we want to measure the overhead introduced by this new way of filtering writes compared to the previous implementation.

1st workload: without orphaned documents
2nd workload: with orphaned documents |
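For context, the behavior being benchmarked can be illustrated with a minimal sketch (hypothetical names and shapes, not the actual mongod implementation): a shard applies a write only to documents whose shard-key value falls inside a chunk range it owns, and skips orphans.

```python
# Hypothetical sketch (NOT the mongod implementation) of filtering writes
# to orphaned documents: a shard applies a write only if the document's
# shard-key value falls inside a chunk range it owns.

def owns(owned_ranges, shard_key_value):
    """Return True if shard_key_value lies in any [lo, hi) range owned by this shard."""
    return any(lo <= shard_key_value < hi for lo, hi in owned_ranges)

def filtered_update(docs, owned_ranges, predicate, update):
    """Apply `update` to matching documents, skipping orphans."""
    updated = 0
    for doc in docs:
        if not owns(owned_ranges, doc["x"]):  # orphan: owned by another shard
            continue
        if predicate(doc):
            update(doc)
            updated += 1
    return updated

# Example: this shard owns x in [0, 100); the document with x=150 is an orphan.
docs = [{"x": 10, "n": 0}, {"x": 50, "n": 0}, {"x": 150, "n": 0}]
count = filtered_update(docs, [(0, 100)], lambda d: True, lambda d: d.update(n=1))
print(count)  # 2: the orphan with x=150 is left untouched
```

The per-write ownership check is exactly the extra work whose cost the two workloads (with and without orphans) are meant to expose.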
| Comments |
| Comment by Antonio Fuschetto [ 29/Dec/21 ] | ||||||||||
Introduction

To measure the performance degradation introduced by the new mechanism to filter out write operations on orphaned documents, a new test was developed. The new test measures the execution time of a set of write use-cases using sharded collections of different cardinalities and sizes.
Update and delete operations were re-executed using both empty and non-empty queries to evaluate the performance penalty in scenarios where the evaluation of the query was assumed to have a non-negligible cost relative to the total execution time. Each operation was executed 5 times on 5 different collections (of the same type); the average value was taken, and the standard deviation was used to discard potentially distorted samples (e.g. caused by system processes running on the dedicated test machine). The experiment was repeated using collections with 100, 1K, 10K, 100K and 1M documents, and with document sizes of 128B, 512B, 1KB, 2KB and 1MB.

Results

The obtained results highlight that the current filtering logic on orphaned documents introduces a penalty in execution time of about 5-6% for update operations and 7-8% for delete operations. This value does not change significantly as the number of documents in the collection or their size varies.
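The sampling procedure described above (repeated runs, averaging, discarding samples that deviate from the mean) can be sketched as follows. The 1.5-standard-deviation cutoff is an assumption for illustration; the ticket does not state the exact criterion used.

```python
from statistics import mean, stdev

def robust_average(samples, k=1.5):
    """Average the samples after discarding those more than k standard
    deviations from the mean (k=1.5 is an assumed cutoff, not from the ticket)."""
    if len(samples) < 2:
        return mean(samples)
    m, s = mean(samples), stdev(samples)
    kept = [x for x in samples if s == 0 or abs(x - m) <= k * s]
    return mean(kept)

# Example: one distorted sample (e.g. a background process on the test machine)
runs_ms = [101.0, 99.5, 100.2, 100.8, 340.0]
print(round(robust_average(runs_ms), 1))  # 100.4: the 340 ms sample is discarded
```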
Further experiments also showed that having a huge number of chunks (e.g. 100K) affects performance; however, since the Sharding team is actively working to avoid this type of scenario (i.e. PM-2321), the analysis did not focus on it. Detailed information on the different test cases and results is available in SERVER-59832 - Performance tests.

Conclusion

The current implementation to filter out write operations on orphaned documents introduces a limited but measurable penalty (about 5-8% in execution time). In order to minimize the computational cost of this logic, several areas for improvement have been identified. Dedicated tasks will be created accordingly.
1 The CRUD workloads test uses the same collection to measure operation throughput. It runs different types of operations (e.g. delete and insert) to preserve the state of the collection for subsequent test cases, making the measurement of the execution time for each single type of operation cumbersome and imprecise for our purposes. |
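As a side note on the chunk-count sensitivity mentioned above: the cost of deciding ownership depends on how the chunk ranges are searched. The following is a hypothetical sketch (not the server's actual routing-table code) contrasting a linear scan with a binary search over sorted, disjoint chunk boundaries, which keeps the per-write check logarithmic in the number of chunks.

```python
import bisect

# Hypothetical sketch: deciding whether a shard key belongs to an owned chunk
# by binary search over sorted, disjoint [lo, hi) chunk ranges, instead of
# scanning every range linearly.

def owns_linear(ranges, key):
    return any(lo <= key < hi for lo, hi in ranges)

def owns_bisect(starts, ends, key):
    """`starts`/`ends` are parallel sorted lists derived from the ranges."""
    i = bisect.bisect_right(starts, key) - 1
    return i >= 0 and key < ends[i]

ranges = [(i * 10, i * 10 + 10) for i in range(0, 100, 2)]  # owns [0,10), [20,30), ...
starts = [lo for lo, _ in ranges]
ends = [hi for _, hi in ranges]

assert owns_linear(ranges, 25) == owns_bisect(starts, ends, 25) == True
assert owns_linear(ranges, 35) == owns_bisect(starts, ends, 35) == False
```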