[SERVER-44570] Create a non process-fatal variant of invariant() Created: 11/Nov/19 Updated: 29/Oct/23 Resolved: 02/Nov/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Ian Boros | Assignee: | Kevin Pulo |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Minor Change | ||||||||||||||||||||||||||||||||||||
| Backport Requested: |
v4.4, v4.2
|
||||||||||||||||||||||||||||||||||||
| Sprint: | Service Arch 2019-11-18, Service Arch 2019-12-02, Service Arch 2019-12-16, Service Arch 2019-12-30, Service Arch 2020-01-13, Service Arch 2020-02-24, Service Arch 2020-03-09, Service Arch 2020-03-23, Service Arch 2020-04-06, Service arch 2020-04-20, Service arch 2020-05-04 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Description |
|
There have been a number of times where an invariant() in the query planner makes it easy for a client to crash a stable version of the server (see below). While detection of a bug in the query planner should definitely be operation-fatal, it does not need to be process-fatal. In fact, making the exposure of such bugs process-fatal creates security vulnerabilities, especially for systems like the atlas free tier, where any user can read and write data (which makes it quite easy to trigger these kinds of invariants). None of the assertion macros available in the server today fit into the right box for the query planner on the two dimensions of "indicates a bug OR indicates user error/unexpected situation" and "is process fatal OR operation fatal." That is: invariant() indicates a bug, but is process fatal. We should consider adding a nonFatalInvariant() to the server which, when triggered, indicates that there is a bug, but is only operation-fatal. Instead of calling abort(), it could throw an exception with a special code (something like ErrorCodes::ThereIsABug). This would require changing our various fuzzers to be aware of this new error code, and to fail when they encounter it. It's worth mentioning that we're not suggesting all invariants() in the query system should be non-fatal. We're only arguing that a nonFatalInvariant() should be available to the query system and used for logic checks (e.g. were these index bounds built correctly?) and not for checks about the state of the system (e.g. is this lock held?). Examples of invariants in the query system which are fairly trivial to reproduce:
|
| Comments |
| Comment by Benjamin Caimano (Inactive) [ 30/Nov/20 ] |
|
Alright, I've filed |
| Comment by Githook User [ 02/Nov/20 ] |
|
Author: {'name': 'Kevin Pulo', 'email': 'kevin.pulo@mongodb.com', 'username': 'devkev'}Message: |
| Comment by Mira Carey [ 12/Nov/19 ] |
|
I'd considered making something like this that was fatal under enableTestCommands, exception throwing elsewhere. That catches all of our testing, but presumably no production deployements |
| Comment by James Wahlin [ 12/Nov/19 ] |
|
Maybe we could consider making this process fatal for certain test environments, maybe via setParameter? I think it would be useful to maintain for the fuzzer and maybe the concurrency test suites. |
| Comment by Ian Boros [ 11/Nov/19 ] |