[SERVER-28920] Increase verbosity on sharding/remove2.js and make timeout so high that we get a stack dump -- temporarily to help diagnose test failure. Created: 21/Apr/17 Updated: 27/Oct/23 Resolved: 24/Apr/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dianna Hohensee (Inactive) | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Sharding 2017-05-08 |
| Participants: |
| Description |
|
Increase the verbosity to try to get more information about what's happening in the sharding system: BF-5335 is logging almost nothing, so we can't see what's happening, if anything. Also increase the assert.soon timeout to more than 2 hours so that we get stack dumps and can check that we didn't recently introduce a deadlock somewhere. |
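For illustration only, here is a minimal sketch of the timeout arithmetic behind "greater than 2 hours". The shell's assert.soon takes a function, a message, a timeout in milliseconds, and an optional polling interval. The `soon` helper below is a hypothetical stand-in written so the sketch runs outside the mongo shell, and the 10-minute margin and the polled condition are assumptions, not what remove2.js actually checks.

```javascript
// Hypothetical stand-in for the mongo shell's assert.soon(), so this sketch
// can run outside the shell. Polls `predicate` until it returns true, and
// throws once `timeoutMs` has elapsed without success.
function soon(predicate, msg, timeoutMs, intervalMs) {
    const start = Date.now();
    while (true) {
        if (predicate()) {
            return;
        }
        if (Date.now() - start >= timeoutMs) {
            throw new Error("assert.soon failed: " + msg);
        }
        // Busy-wait for one polling interval (the shell would call sleep()).
        const wakeAt = Date.now() + intervalMs;
        while (Date.now() < wakeAt) {
        }
    }
}

// "Greater than 2 hours", expressed in milliseconds.
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;
// Assumed 10-minute margin on top of the 2 hours.
const DIAGNOSTIC_TIMEOUT_MS = TWO_HOURS_MS + 10 * 60 * 1000;

// Assumed example condition: becomes true on the third poll.
let polls = 0;
soon(function() {
    polls += 1;
    return polls >= 3;
}, "condition never became true", DIAGNOSTIC_TIMEOUT_MS, 10);
```

In the real test, the same timeout value would be passed as the third argument to assert.soon in place of the default.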
| Comments |
| Comment by Dianna Hohensee (Inactive) [ 24/Apr/17 ] |
|
Thanks for the suggestion/tips, Max! Kal thinks he figured out what was going on, so I'm going to close this. |
| Comment by Max Hirschhorn [ 21/Apr/17 ] |
dianna.hohensee, I would recommend running a patch build with the "timeout_secs" property of the sharding task set to a lower value to make the hang_analyzer.py script trigger sooner (or trying to reproduce the timeout locally) rather than increasing the assert.soon() timeout to greater than 2 hours. See where we do this for the jstestfuzz* tasks as an example. By the way, you'll probably also want to go and vote for |