[SERVER-65281] AutoSplitVector should never choose null fields as split points Created: 06/Apr/22 Updated: 27/Oct/23 Resolved: 07/Apr/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Allison Easton | Assignee: | Allison Easton |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Operating System: | ALL | ||||
| Sprint: | Sharding EMEA 2022-04-18 | ||||
| Participants: | |||||
| Linked BF Score: | 136 | ||||
| Description |
|
The AutoSplitVector command chooses split points by scanning the documents in a collection and taking, as each new split point, the shard key values of the first document that makes the chunk too large. If any shard key field in this document is unset or set to null, the new chunk ends up with a max key that doesn't have the right number of key values. We should modify AutoSplitVector so that it never chooses null values as a split point. |
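For illustration, a minimal sketch of the scanning behavior described here, written in plain Python rather than taken from the server code. The function name `naive_split_points`, the per-document `size` entry, and the `max_chunk_size` parameter are all hypothetical, and the documents are assumed to be already sorted by shard key. It only shows how a document with an unset shard key field could yield a split point with too few key values.

```python
def naive_split_points(docs, shard_key_fields, max_chunk_size):
    """Hypothetical illustration only; not the real autoSplitVector logic.

    `docs` is assumed to be sorted by shard key, and each doc carries an
    illustrative "size" entry standing in for its size in bytes.
    """
    split_points = []
    chunk_size = 0
    for doc in docs:
        chunk_size += doc["size"]
        if chunk_size > max_chunk_size:
            # Take the shard key values of the first document that makes
            # the chunk too large. Fields absent from the document are
            # simply skipped, which is the problem being illustrated.
            key = {f: doc[f] for f in shard_key_fields if f in doc}
            split_points.append(key)
            # The document at the split point starts the next chunk.
            chunk_size = doc["size"]
    return split_points


docs = [
    {"key1": 1, "key2": "a", "size": 40},
    {"key1": 2, "key2": "b", "size": 40},
    {"key1": 3, "size": 40},  # key2 is unset
    {"key1": 4, "key2": "d", "size": 40},
]
points = naive_split_points(docs, ["key1", "key2"], max_chunk_size=100)
print(points)  # [{'key1': 3}]
```

With a two-field shard key, the split point produced in this sketch carries only one value, which is the "wrong number of key values" situation the description is concerned about.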
| Comments |
| Comment by Allison Easton [ 07/Apr/22 ] | |
|
Closing this since it isn't the cause of the BF, and the actual cause has not yet been identified. Will reopen the BF ticket to continue investigation. | |
| Comment by Allison Easton [ 06/Apr/22 ] | |
|
The error happening in the test is this
I had thought that committing a chunk split where the value of key2 was null was causing this, but in writing a test to reproduce the error, this doesn't seem to be the case. I will update the ticket's summary once I figure out the actual cause. | |
| Comment by Max Hirschhorn [ 06/Apr/22 ] | |
Is this because autoSplitVector isn't treating a missing shard key field as if it were an explicit null value? I'm a little confused by this ticket's description because I don't see how an explicit null value is any more special than an explicit boolean true value (or something like that) to the chunk-splitting logic. |
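
To make the question concrete, a small Python sketch of the alternative being asked about: treating a missing shard key field as an explicit null, so the extracted key always has one value per shard key field. The helper `extract_shard_key` is hypothetical and is not the server's implementation; Python's `None` stands in for a BSON null.

```python
def extract_shard_key(doc, shard_key_fields):
    """Hypothetical helper: missing shard key fields become explicit None."""
    return {field: doc.get(field) for field in shard_key_fields}


fields = ["key1", "key2"]
from_missing = extract_shard_key({"key1": 3}, fields)
from_explicit = extract_shard_key({"key1": 3, "key2": None}, fields)

print(from_missing)                   # {'key1': 3, 'key2': None}
print(from_missing == from_explicit)  # True
```

Under this assumption, a document with key2 missing and a document with key2 set to null produce the same full-width key, so neither would shorten a chunk's max key.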