[SERVER-41353] Fix late calls to errnoWithDescription() Created: 29/May/19  Updated: 29/Oct/23  Resolved: 28/Apr/22

Status: Closed
Project: Core Server
Component/s: Diagnostics, Internal Code
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Bug Priority: Minor - P4
Reporter: Kevin Pulo Assignee: Billy Donahue
Resolution: Fixed Votes: 0
Labels: logging, move-sec, sa-groomed
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-51770 generalize errnoWithDescription to ha... Closed
Related
related to SERVER-50253 errno-style handlers conflate error c... Closed
related to SERVER-40953 Shell history file accesses should ca... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.0, v4.4
Sprint: Service Arch 2022-03-21, Service Arch 2022-04-04, Service Arch 2022-04-18, Service Arch 2022-05-02, Service Arch 2022-05-16
Participants:
Story Points: 4

 Description   

Calling errnoWithDescription() without any arguments causes it to refer to the errno global. However, unless this is done immediately after returning from a function call that might set errno, the value of errno might get trampled by other code which calls errno-setting functions. This can cause the wrong error string to be reported by the failure message, including situations where an error is reported, but the cause of the error is confusingly "Success".

There are currently 100 call sites of errnoWithDescription() with no args (and 1 in enterprise). 62 of these are inside the streaming operator, which is too late (at least potentially, because the constructor of the object being streamed into might do something that resets errno).

This ticket is to audit all usages of errnoWithDescription() with no args, and clean up those where it's called too late — or come up with some sane way to automatically capture the relevant value of errno before there is any possibility of it being overwritten.

$ git describe
r4.1.12-20-g20309e67435
$ rg 'errnoWithDescription\(\)'
transport/transport_layer_asio.cpp
817:                const auto ewd = errnoWithDescription();
 
logger/rotatable_file_writer.cpp
237:                // TODO(schwerin): Make errnoWithDescription() available in the logger library, and
 
db/server_options_test.cpp
902:    ASSERT_NOT_EQUALS(pid, -1) << "unable to fork: " << ::mongo::errnoWithDescription();
 
dbtests/jsontests.cpp
422:            FAIL(errnoWithDescription());
438:                FAIL(errnoWithDescription());
451:                FAIL(errnoWithDescription());
 
shell/shell_utils_extended.cpp
144:    uasserted(16832, str::stream() << "cd command failed: " << errnoWithDescription());
 
shell/shell_utils_launcher.cpp
134:        const auto ewd = errnoWithDescription();
411:            const auto ewd = errnoWithDescription();
422:            const auto ewd = errnoWithDescription();
428:            const auto ewd = errnoWithDescription();
632:        const auto ewd = errnoWithDescription();
663:        const auto ewd = errnoWithDescription();
711:        const auto ewd = errnoWithDescription();
730:        const auto ewd = errnoWithDescription();
920:            const auto ewd = errnoWithDescription();
964:        const auto ewd = errnoWithDescription();
973:            const auto ewd = errnoWithDescription();
 
shell/dbshell.cpp
720:        std::cout << "couldn't create temp file (" << filename << "): " << errnoWithDescription()
759:        std::cout << "couldn't open temp file on return from editor: " << errnoWithDescription()
770:            std::cout << "failed to read temp file: " << errnoWithDescription() << std::endl;
 
util/errno_util.h
44: * Return the errno (error number) and the errnoWithDescription() string.
49: * Return a string with the given prefix followed by the errnoWithDescription() string.
 
platform/shared_library_windows.cpp
61:                                                                        << errnoWithDescription());
79:                                                   << errnoWithDescription());
 
shell/linenoise.cpp
2783:        const auto ewd = mongo::errnoWithDescription();
2788:        const auto ewd = mongo::errnoWithDescription();
2796:        const auto ewd = mongo::errnoWithDescription();
2804:                const auto ewd = mongo::errnoWithDescription();
2814:        const auto ewd = mongo::errnoWithDescription();
2829:        const auto ewd = mongo::errnoWithDescription();
2849:        const auto ewd = mongo::errnoWithDescription();
2856:        const auto ewd = mongo::errnoWithDescription();
 
db/initialize_server_global_state.cpp
123:            cout << "ERROR: stage 1 fork() failed: " << errnoWithDescription();
159:            cout << "ERROR: stage 2 fork() failed: " << errnoWithDescription();
 
util/file.cpp
211:          << errnoWithDescription();
218:              << errnoWithDescription();
232:    log() << "In File::len(), lseek for '" << _name << "' failed with " << errnoWithDescription();
253:              << errnoWithDescription();
262:              << errnoWithDescription();
284:              << errnoWithDescription() << std::endl;
295:              << errnoWithDescription();
 
util/processinfo_linux.cpp
74:            ss << "couldn't open [" << name << "] " << errnoWithDescription();
604:        log() << "mincore failed: " << errnoWithDescription();
615:        log() << "mincore failed: " << errnoWithDescription();
 
util/processinfo_freebsd.cpp
175:        log() << "mincore failed: " << errnoWithDescription();
185:        log() << "mincore failed: " << errnoWithDescription();
 
util/processinfo_solaris.cpp
72:                str::stream() << "couldn't open \"/proc/self/psinfo\": " << errnoWithDescription(),
89:                str::stream() << "couldn't open \"/proc/self/usage\": " << errnoWithDescription(),
201:        warning() << "lgrp_init failed: " << errnoWithDescription();
210:        warning() << "lgrp_nlgrps failed: " << errnoWithDescription();
226:        log() << "mincore failed: " << errnoWithDescription();
237:        log() << "mincore failed: " << errnoWithDescription();
 
db/startup_warnings_mongod.cpp
204:                warning() << "failed to read from /proc/self/numa_maps: " << errnoWithDescription()
310:        const auto errmsg = errnoWithDescription();
351:        const auto errmsg = errnoWithDescription();
 
util/stacktrace_windows.cpp
88:                    << errnoWithDescription();
 
util/shell_exec.cpp
72:                              << errnoWithDescription(),
78:                              << errnoWithDescription(),
83:                              << errnoWithDescription(),
89:                              << errnoWithDescription(),
94:                str::stream() << "Unable to launch command: " << errnoWithDescription(),
114:                              << errnoWithDescription(),
119:                    str::stream() << "Failed terminating subprocess: " << errnoWithDescription(),
134:                              << errnoWithDescription(),
147:            return {ErrorCodes::OperationFailed, errnoWithDescription()};
160:                str::stream() << "Failed reading from subprocess: " << errnoWithDescription(),
211:                str::stream() << "Unable to launch command: " << errnoWithDescription(),
237:            return {ErrorCodes::OperationFailed, errnoWithDescription()};
 
util/processinfo_openbsd.cpp
194:        log() << "mincore failed: " << errnoWithDescription();
204:        log() << "mincore failed: " << errnoWithDescription();
 
util/password.cpp
62:            std::cerr << "Cannot get terminal attributes " << errnoWithDescription() << std::endl;
69:            std::cerr << "Cannot set terminal attributes " << errnoWithDescription() << std::endl;
80:            std::cerr << "Cannot set terminal attributes " << errnoWithDescription() << std::endl;
 
util/time_support_test.cpp
62:        return Status(ErrorCodes::BadValue, errnoWithDescription());
 
util/processinfo_osx.cpp
226:        log() << "mincore failed: " << errnoWithDescription();
235:        log() << "mincore failed: " << errnoWithDescription();
 
db/storage/storage_engine_metadata.cpp
236:                          << errnoWithDescription(),
276:                              << errnoWithDescription());
287:                                        << errnoWithDescription());
 
util/signal_handlers.cpp
136:        warning() << "eventProcessingThread CreateEvent failed: " << errnoWithDescription();
146:                      << errnoWithDescription();
 
db/storage/remove_saver.cpp
95:                     << " for remove saving: " << redact(errnoWithDescription());
119:                     << " for remove saving: " << redact(errnoWithDescription());
135:                                       << " for remove saving: " << redact(errnoWithDescription());
170:                                   << " for remove saving: " << redact(errnoWithDescription());
 
db/storage/storage_repair_observer.cpp
114:                 << errnoWithDescription();
 
util/errno_util.cpp
120:    const auto suffix = errnoWithDescription();
 
db/storage/storage_file_util.cpp
76:                              << errnoWithDescription()};
 
db/storage/storage_engine_lock_file_posix.cpp
72:                          << errnoWithDescription(),
 
util/net/sock.cpp
148:        error() << "disableNagle failed: " << errnoWithDescription();
152:        error() << "SO_KEEPALIVE failed: " << errnoWithDescription();
644:                  << " remote host " << remoteString() << ")" << causedBy(errnoWithDescription());
670:                      << causedBy(errnoWithDescription());
 
util/net/ssl_manager_openssl.cpp
1209:                str::stream() << "error opening system CA store: " << errnoWithDescription()};
1726:                error() << "The SSL BIO reported an I/O error " << errnoWithDescription();
 
util/net/socket_utils.cpp
74:            log() << "ERROR: wsastartup failed " << errnoWithDescription();
151:                error() << "can't get " << optname << ": " << errnoWithDescription();
157:                    error() << "can't set " << optname << ": " << errnoWithDescription();
201:        log() << "can't get this server's hostname " << errnoWithDescription();

Acceptance criteria: Remove the errnoWithDescription overload that takes no arguments and fix the affected call sites.



 Comments   
Comment by Jason Chan [ 08/May/23 ]

Requesting backport of only the library additions to errno_util.h/cpp and the accompanying unit tests to help with backports of other features that make use of these helpers.

Comment by Githook User [ 28/Apr/22 ]

Author:

{'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}

Message: SERVER-41353 replace errnoWithDescription with an API based on std::error_code
Branch: master
https://github.com/mongodb/mongo/commit/5b17ba836945f2b7cbcc77a643311f41fe7b7b6e

Comment by Billy Donahue [ 25/Mar/22 ]

https://github.com/10gen/mongo/pull/4036

Comment by Billy Donahue [ 20/Oct/20 ]

We have a problem with errnoWithDescription's overload set and usage, as described here.

When looking into fixing it, I found we have another problem, that even if we capture the errors at the right time,
we aren't interpreting them in the right way.

Splitting this ticket out into another one dedicated to the <system_error> work.

We should be using <system_error> to propagate error categories and lift errno and GetLastError and other separate and distinct kinds of int-based error codes into C++ with type safety. Then something like errnoWithDescription (which would be renamed) can be done safely.

Split the <system_error> work into ticket SERVER-50253.

Comment by Billy Donahue [ 11/Jul/20 ]

After replacing all the errnoWithDescription with a mongo::Errno handler, I discovered there's sloppy conflation of errno with GetLastError codes. Win32 has both an errno int (signed 32-bit) and a GetLastError DWORD (unsigned 32-bit). They aren't in the same number space, and a DWORD can't be portably stored into a signed int without special care (which we aren't taking). I correctly get compiler errors from the narrowing conversions of DWORD to int after textually replacing errnoWithDescription(e) expressions to Errno{e}.desc() expressions. Braced-init doesn't tolerate narrowing!

Logically, a function called errnoWithDescription should only be looking at errno codes, but we have it so that it does a GetLastError on Win32, and there is no helper function analyze the `errno` codes (as set by _read, etc) on Win32. We need to keep GetLastError and errno regimes separated to do this right. We'll have clearer and more correct error handling.

Comment by Billy Donahue [ 10/Jul/20 ]

CR https://mongodbcr.appspot.com/637870001/

Comment by Billy Donahue [ 10/Jul/20 ]

I looked at more cases. There are a lot of bugs of this type in the code. Some cases are using errnoWithDescription with no argument, but other bug sites are using it with a stale argument. We can sweep the code once, but how do we keep the barn door closed? I'm going to make a {mongo::Errno} class, and get rid of all the errnoWith... nonmember functions. I think that will help us create an idiomatic push so that more call sites will handle errno (and Win32's GetLastError) safely going forward. Unless we ban and wrap all the libc errno-manipulating functions, I don't think we can enforce correct errno discipline, but we can get closer than we are now.

 

Generated at Thu Feb 08 04:57:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.