[SERVER-44570] Create a non process-fatal variant of invariant() Created: 11/Nov/19  Updated: 29/Oct/23  Resolved: 02/Nov/20

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Improvement Priority: Major - P3
Reporter: Ian Boros Assignee: Kevin Pulo
Resolution: Fixed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Duplicate
is duplicated by SERVER-44588 Switch invariant about inverting INEX... Closed
is duplicated by SERVER-47926 Less verbose uassert Closed
Related
related to SERVER-53002 Complete TODO listed in SERVER-44570 Closed
related to SERVER-53135 Tassert should only exit with error i... Closed
related to SERVER-55699 Add an explicit write concern to inte... Closed
is related to SERVER-52532 Investigate if code during signal han... Closed
Backwards Compatibility: Minor Change
Backport Requested:
v4.4, v4.2
Sprint: Service Arch 2019-11-18, Service Arch 2019-12-02, Service Arch 2019-12-16, Service Arch 2019-12-30, Service Arch 2020-01-13, Service Arch 2020-02-24, Service Arch 2020-03-09, Service Arch 2020-03-23, Service Arch 2020-04-06, Service arch 2020-04-20, Service arch 2020-05-04
Participants:

 Description   

There have been a number of times where an invariant() in the query planner makes it easy for a client to crash a stable version of the server (see below). While detection of a bug in the query planner should definitely be operation-fatal, it does not need to be process-fatal. In fact, making the exposure of such bugs process-fatal creates security vulnerabilities, especially for systems like the atlas free tier, where any user can read and write data (which makes it quite easy to trigger these kinds of invariants).

None of the assertion macros available in the server today fit into the right box for the query planner on the two dimensions of "indicates a bug OR indicates user error/unexpected situation" and "is process fatal OR operation fatal." That is:

invariant() indicates a bug, but is process fatal.
fassert() indicates a situation the server is not prepared for (not a bug) and is process fatal.
uassert() indicates a user error and is operation fatal.

We should consider adding a nonFatalInvariant() to the server which, when triggered, indicates that there is a bug, but is only operation-fatal. Instead of calling abort(), it could throw an exception with a special code (something like ErrorCodes::ThereIsABug). This would require changing our various fuzzers to be aware of this new error code, and to fail when they encounter it.

It's worth mentioning that we're not suggesting all invariants() in the query system should be non-fatal. We're only arguing that a nonFatalInvariant() should be available to the query system and used for logic checks (e.g. were these index bounds built correctly?) and not for checks about the state of the system (e.g. is this lock held?).

Examples of invariants in the query system which are fairly trivial to reproduce:



 Comments   
Comment by Benjamin Caimano (Inactive) [ 30/Nov/20 ]

Alright, I've filed SERVER-53135 to make the unclean exit test only. With that piece in place, we should be able to backport this to v4.4

Comment by Githook User [ 02/Nov/20 ]

Author:

{'name': 'Kevin Pulo', 'email': 'kevin.pulo@mongodb.com', 'username': 'devkev'}

Message: SERVER-44570 Add tripwire assertions (tassert)
Branch: master
https://github.com/mongodb/mongo/commit/7d8e64df2d2d56a821f638ef88aa619403d03d31

Comment by Mira Carey [ 12/Nov/19 ]

I'd considered making something like this that was fatal under enableTestCommands, exception throwing elsewhere.

That catches all of our testing, but presumably no production deployements

Comment by James Wahlin [ 12/Nov/19 ]

Maybe we could consider making this process fatal for certain test environments, maybe via setParameter? I think it would be useful to maintain for the fuzzer and maybe the concurrency test suites.

Comment by Ian Boros [ 11/Nov/19 ]

CC charlie.swanson david.storch

Generated at Thu Feb 08 05:06:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.