[SERVER-65201] De-emphasize errors which are ignored by the ReshardingTest fixture when another error has already occurred Created: 01/Apr/22 Updated: 29/Oct/23 Resolved: 13/Jun/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 6.1.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Max Hirschhorn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-nyc-subteam1 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Sharding 2022-06-27 |
| Participants: | |
| Linked BF Score: | 166 |
| Story Points: | 2 |
| Description |
|
To avoid hanging or crashing the mongo shell process, the ReshardingTest fixture goes to some lengths to interrupt the reshardCollection command on mongos and join the background thread in the mongo shell which was running the reshardCollection command. The error from the background thread is still logged in case there's a bug in the ReshardingTest fixture and the message happens to be useful. However, the logged error has led to some confusion about what truly caused the test to fail. We should find a way to elide the self-induced interruption error or separate it more clearly from the assertion failure error message which immediately follows it. |
| Comments |
| Comment by Githook User [ 13/Jun/22 ] |
|
Author: {'name': 'Max Hirschhorn', 'email': 'max.hirschhorn@mongodb.com', 'username': 'visemet'}
Message: Moves the check for whether the resharding operation run by the |
| Comment by Max Hirschhorn [ 04/Apr/22 ] |
|
After some discussion within the team, we would rather not special-case the Interrupted error code. Having a "compact error reporting" mode for assert.commandWorked() was proposed by Brett and may be the preferred approach here. |
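The ticket does not specify what a "compact error reporting" mode would look like. As a rough illustration only, the sketch below assumes it means emitting the failed response on one line instead of pretty-printing it. `formatCommandFailure` and its `compact` option are hypothetical names, and `JSON.stringify` stands in for the shell's `tojson()` and `tojsononeline()`; none of this is the actual `assert.commandWorked()` implementation.

```javascript
// Hypothetical formatter; NOT part of the shell's assert.commandWorked().
// JSON.stringify is a stand-in for the shell's tojson()/tojsononeline().
function formatCommandFailure(res, {compact = false} = {}) {
    // Compact mode keeps the whole response on one line; the default
    // pretty-prints it across several lines.
    const body = compact ? JSON.stringify(res) : JSON.stringify(res, null, 4);
    return "command failed: " + body;
}

// Example response shaped like an interrupted command reply (illustrative values).
const sampleResponse = {ok: 0, code: 11601, codeName: "Interrupted", errmsg: "operation was interrupted"};

console.log(formatCommandFailure(sampleResponse));
console.log(formatCommandFailure(sampleResponse, {compact: true}));
```

With a one-line message for the ignored error, the JavaScript error and stacktrace that follow would stand out more as the real cause of the failure.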
| Comment by Max Hirschhorn [ 01/Apr/22 ] |
|
My initial thought was to use tojsononeline() instead of tojson() within the error messages being ignored, so that they take up less visual space and the JavaScript error and stacktrace which immediately follow are understood as the true reason for the failure. However, tojson() is really being called by assert.commandWorked() in the background thread running the reshardCollection command, so it won't be possible to condense the output with that approach. My new thought would be to use a CountDownLatch to signal to the background thread running the reshardCollection command that the main thread has issued a killOp command, and then to have that background thread use assert.commandWorkedOrFailedWithCode(res, [ErrorCodes.Interrupted]).
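
The latch idea above can be sketched in plain JavaScript. This is a single-process illustration under stated assumptions, not the fixture's code: `SketchLatch` is a hypothetical stand-in for the shell's CountDownLatch, `checkReshardResult` is a hypothetical helper name, and the local `ErrorCodes` object only mirrors the server's Interrupted code (11601). The sketch also chooses to tolerate the Interrupted error only after the latch has been counted down, which is one possible reading of the proposal.

```javascript
// Local stand-in for the shell's ErrorCodes table (11601 is the server's Interrupted code).
const ErrorCodes = {Interrupted: 11601};

// Hypothetical single-process stand-in for a count-down latch.
class SketchLatch {
    constructor(count) {
        this.count = count;
    }
    countDown() {
        if (this.count > 0) {
            this.count--;
        }
    }
    getCount() {
        return this.count;
    }
}

// Hypothetical helper the background thread would call with the
// reshardCollection response. Interrupted is tolerated only once the
// main thread has signalled (via the latch) that it issued killOp.
function checkReshardResult(res, killOpIssuedLatch) {
    if (res.ok === 1) {
        return res;
    }
    const selfInduced =
        killOpIssuedLatch.getCount() === 0 && res.code === ErrorCodes.Interrupted;
    if (selfInduced) {
        // Expected interruption: return quietly so no noisy error is logged.
        return res;
    }
    throw new Error("reshardCollection failed: " + JSON.stringify(res));
}

// Usage: the main thread counts the latch down right after issuing killOp.
const killOpIssued = new SketchLatch(1);
const interruptedReply = {ok: 0, code: ErrorCodes.Interrupted, codeName: "Interrupted"};
killOpIssued.countDown();
checkReshardResult(interruptedReply, killOpIssued);  // does not throw
```

Gating on the latch means an Interrupted error that arrives before the fixture has issued killOp still surfaces as a failure, which keeps genuinely unexpected interruptions visible.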