[SERVER-33942] getMore with maxTimeMS returns "operation exceeded time limit" if concurrent blocking operation is running Created: 16/Mar/18 Updated: 29/Oct/23 Resolved: 05/Apr/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 3.7.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Nicholas Zolnierz | Assignee: | Charlie Swanson |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | todo_in_code |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v3.6 |
| Sprint: | Query 2018-03-26, Query 2018-04-09 |
| Participants: | |
| Case: | (copied to CRM) |
| Linked BF Score: | 23 |
| Description |
|
Came from the investigation of BF-8367, where awaitdata_getmore_cmd.js fails when run in parallel with compact_keeps_indexes.js. The latter performs slow operations that hold the global DB lock, while the former expects a getMore to time out and return an empty batch (batch size 0). Instead, the getMore fails with "exceeded time limit". A reproducible test is attached. |
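The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's code: `global_lock`, `get_more`, and `TimeLimitExceeded` are hypothetical names, and the model assumes that time spent waiting for the lock counts against the same maxTimeMS budget as time spent awaiting data.

```python
import threading


class TimeLimitExceeded(Exception):
    """Stand-in for the server's "exceeded time limit" error."""


# Hypothetical stand-in for the global DB lock held by the slow operation.
global_lock = threading.Lock()


def get_more(max_time_ms):
    """Toy getMore: waiting for the lock uses up the maxTimeMS budget (assumption)."""
    if not global_lock.acquire(timeout=max_time_ms / 1000.0):
        # The budget ran out while waiting for the lock, so the caller
        # sees an error instead of the empty batch it expected.
        raise TimeLimitExceeded("operation exceeded time limit")
    try:
        # No new data arrives in this toy model, so the batch is empty.
        return {"ok": 1, "cursor": {"nextBatch": []}}
    finally:
        global_lock.release()
```

With the lock free, `get_more(50)` returns an empty batch. If another operation holds `global_lock` for longer than the budget, the same call raises `TimeLimitExceeded`, which mirrors the behavior reported here.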
| Comments |
| Comment by Githook User [ 05/Apr/18 ] |
|
Author: Charlie Swanson <charlie.swanson@mongodb.com> (username: cswanson310) Message: |
| Comment by Charlie Swanson [ 30/Mar/18 ] |
|
Bringing this into the sprint to try to prevent/mitigate build failures. |
| Comment by Nicholas Zolnierz [ 23/Mar/18 ] |
|
Note as part of this fix, the blacklisted tests from |
| Comment by Charlie Swanson [ 23/Mar/18 ] |
|
Assigning to Nick to resolve the BF, we can file a separate ticket for the real bug fix. |
| Comment by Andy Schwerin [ 17/Mar/18 ] |
|
Sounds like fallout from the ticket to make lock acquisition interruptible, maybe? I wonder if await data cursors should always convert time limit exceeded errors into ok responses with whatever data has been pulled off the cursor? |
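One way to read the suggestion above is sketched below. It is a minimal illustration, not the fix that was committed: `make_source`, `await_data_get_more`, and `TimeLimitExceeded` are hypothetical names, and the timeout is simulated by running out of documents rather than by a real clock.

```python
class TimeLimitExceeded(Exception):
    """Stand-in for the server's "operation exceeded time limit" error."""


def make_source(docs):
    """Hypothetical cursor source: yields docs, then 'times out' waiting for more."""
    it = iter(docs)

    def pull_next():
        try:
            return next(it)
        except StopIteration:
            # In this toy model, running out of data means the wait hit the deadline.
            raise TimeLimitExceeded("operation exceeded time limit") from None

    return pull_next


def await_data_get_more(pull_next, batch_size=10):
    """Toy awaitData getMore; batch_size=10 is an arbitrary illustrative default."""
    batch = []
    try:
        while len(batch) < batch_size:
            batch.append(pull_next())
    except TimeLimitExceeded:
        # Swallow the timeout and return whatever was pulled so far, possibly nothing.
        pass
    return {"ok": 1, "cursor": {"nextBatch": batch}}
```

Under this reading, a caller such as awaitdata_getmore_cmd.js would always receive an ok response, with an empty `nextBatch` when no data arrived before the time limit.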