[SERVER-75594] Make analyzeShardKey command only use one config collection to store split points Created: 03/Apr/23 Updated: 29/Oct/23 Resolved: 07/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.0.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Cheahuychou Mao | Assignee: | Cheahuychou Mao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Sprint: | Sharding NYC 2023-04-17 | ||||||||||||
| Participants: | |||||||||||||
| Linked BF Score: | 25 | ||||||||||||
| Description |
|
Currently, the step for calculating the read and write distribution metrics in the analyzeShardKey command works as follows:
Making each command have its own split point collection is clean in a way but it has some notable downsides:
One can try to set up a periodic job to drop collections with the "config.analyzeShardKey.splitPoints.*" prefix. However, it is hard to differentiate between dangling collections and collections that are still being used by an in-progress analyzeShardKey command.
Given this, the analyzeShardKey command should instead use only one config collection for storing split points, and rely on a TTL index to automatically clean up the documents for a command that has already returned. That is, the stage should have the following spec, where 'splitPointsFilter' is the filter that matches only the split point documents generated by a particular command.
Users are unlikely to run a lot of analyzeShardKey commands back to back, so there shouldn't be a lot of documents from other commands to filter out during the read. |
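The single-collection approach described above can be sketched in plain Python, with no server involved. This is only an illustration under stated assumptions: the collection name, the `commandId` / `splitPointId` / `expireAt` field names, and the helper functions are all hypothetical and are not taken from the actual patch. The 15-minute TTL is the value mentioned in the discussion on this ticket. The index spec relies on the standard MongoDB behavior that a TTL index with `expireAfterSeconds: 0` on a date field expires each document at the time stored in that field.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Hypothetical names: the ticket does not spell out the collection or field names.
SPLIT_POINTS_COLLECTION = "config.analyzeShardKeySplitPoints"
TTL = timedelta(minutes=15)  # value taken from the ticket discussion

# TTL index spec: with expireAfterSeconds = 0, each document expires at the
# clock time stored in its own "expireAt" field.
TTL_INDEX_SPEC = {"key": {"expireAt": 1}, "expireAfterSeconds": 0}


def make_split_point_docs(command_id, split_points, now):
    """Tag every split point with the owning command's id and an expiry time."""
    return [
        {
            "_id": {"commandId": command_id, "splitPointId": uuid.uuid4()},
            "splitPoint": split_point,
            "expireAt": now + TTL,
        }
        for split_point in split_points
    ]


def split_points_filter(command_id):
    """Filter that matches only the documents written by one command."""
    return {"_id.commandId": command_id}


def matches(doc, query):
    """Minimal matcher: dotted-path equality only, enough for this sketch."""
    for path, expected in query.items():
        value = doc
        for key in path.split("."):
            value = value[key]
        if value != expected:
            return False
    return True


def ttl_sweep(collection, now):
    """Mimic the TTL monitor: keep only documents that have not yet expired."""
    return [doc for doc in collection if doc["expireAt"] > now]
```

Because every document carries its own command id, two concurrent commands can share the collection without reading each other's split points, and nothing has to decide whether a whole collection is still in use: expired documents simply disappear on their own schedule.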
| Comments |
| Comment by Githook User [ 07/Apr/23 ] |
|
Author: {'name': 'Cheahuychou Mao', 'email': 'mao.cheahuychou@gmail.com', 'username': 'cheahuychou'} Message: |
| Comment by Adi Zaimi [ 07/Apr/23 ] |
|
Thanks, I went back to the design doc to get more clarity as well:
|
| Comment by Adi Zaimi [ 07/Apr/23 ] |
|
Right, so the documents have a TTL of 15 minutes; I don't understand in what way that is enough time. Could you remind me: are the split points created by an analyzeShardKey command needed only briefly, and not for the whole time we are analyzing the keys? |
| Comment by Adi Zaimi [ 07/Apr/23 ] |
|
Some questions when reading the above:
- Cleanup to drop the unified collection will be left to the user to do manually? |