[SERVER-36872] Comment out $sample tests in testshard1.js temporarily Created: 24/Aug/18 Updated: 29/Oct/23 Resolved: 27/Aug/18

| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.3 |
| Type: | Bug |
| Priority: | Major - P3 |
| Reporter: | Matthew Saltz (Inactive) |
| Assignee: | Matthew Saltz (Inactive) |
| Resolution: | Fixed |
| Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Sharding 2018-09-10 |
| Participants: | |
| Linked BF Score: | 0 |
Description

The $sample tests for this particular suite are currently causing non-deterministic hangs in Evergreen. We should remove them temporarily pending a fix to
Comments
Comment by Matthew Saltz (Inactive) [ 22/Oct/18 ]

I confirmed that we do intentionally create empty chunks and distribute them across shards when a collection is sharded with hashed or zoned sharding.
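The pre-splitting behavior described above can be seen directly when sharding on a hashed key. A minimal mongosh sketch (not runnable without a live sharded cluster; the namespace `test.coll` and the chunk count are illustrative):

```javascript
// Hypothetical sketch (mongosh), assuming sharding is already enabled on "test".
// Sharding on a hashed key pre-creates empty chunks and distributes them
// across shards before any documents are inserted.
sh.shardCollection("test.coll", { _id: "hashed" }, false, { numInitialChunks: 4 })

// sh.status() would now show the initial (empty) chunks spread across the
// shards, e.g. 2 chunks per shard in a 2-shard cluster.
```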
Comment by Matthew Saltz (Inactive) [ 19/Oct/18 ]

There are a lot of tickets related to the behavior change, but the change simply surfaced an existing issue by exercising a new scenario in our testing. I don't think this is what specifically happened in this test, but one way to end up with an empty chunk is to delete all documents belonging to the range specified by a chunk. The chunk itself doesn't get removed, so a shard can still be targeted even if no documents remain on it. Alternatively, it's possible to manually split a chunk (even an empty one) so that one of the resulting chunks is empty and can be migrated to a shard holding no other chunks for that collection. In fact, I believe we sometimes recommend doing this when creating a new sharded collection so that you can spread out your writes to different shards, so you could easily end up with a single empty chunk on a shard in that case.
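Both routes to an empty chunk described above can be sketched in mongosh. This is a hedged illustration only (it assumes a running sharded cluster; the namespace `test.coll`, shard key field `x`, split point, and shard name `shardB` are all hypothetical):

```javascript
// Route 1: delete every document in a chunk's range. The chunk metadata
// itself is not removed, so the shard owning [0, 100) can still be targeted
// even though it holds no documents in that range.
db.coll.deleteMany({ x: { $gte: 0, $lt: 100 } })

// Route 2: manually split a chunk so one resulting chunk is empty, then
// migrate it. The destination shard may then own only an empty chunk for
// this collection.
sh.splitAt("test.coll", { x: 1000 })              // [1000, MaxKey) may be empty
sh.moveChunk("test.coll", { x: 1000 }, "shardB")  // shardB now holds only the empty chunk
```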
Comment by Charlie Swanson [ 11/Oct/18 ]

matthew.saltz can you point me to the ticket or tickets which included the change you describe? I am still confused about how we would end up with an empty chunk as the only chunk on a shard, and would like to investigate further to see whether this is expected, or can at least be prevented.
Comment by Githook User [ 27/Aug/18 ]

Author: {'name': 'Matthew Saltz', 'email': 'matthew.saltz@mongodb.com', 'username': 'saltzm'}
Message:
Comment by Matthew Saltz (Inactive) [ 27/Aug/18 ]

Yes. The autosplitter was moved from triggering splits synchronously on mongos to asynchronously on mongod. Before, the test would do some inserts and block on the autosplitter, which would move a chunk from one shard to the other due to the top-chunk optimization, and then continue inserting. Now the inserts never block, so the splitting and the chunk move happen concurrently with other operations (including $sample). The chunk move causes $sample to try to read documents not owned by that shard, which causes the test to hit
Comment by David Storch [ 27/Aug/18 ]

matthew.saltz was there a change which caused this test to hit