[SERVER-47332] Add debug only logging macro to facilitate printf style debugging Created: 03/Apr/20  Updated: 29/Oct/23  Resolved: 24/Apr/20

Status: Closed
Project: Core Server
Component/s: Logging
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Improvement Priority: Major - P3
Reporter: William Schultz (Inactive) Assignee: Henrik Edin
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-46364 Duplicate id checking should occur du... Closed
is related to SERVER-71745 Move unstructured log lint to clang-tidy Closed
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2020-05-04
Participants:

 Description   

When debugging or investigating a part of the system it is helpful to quickly add log messages for "printf" style debugging. The new LOGV2 macros make this much more unwieldy, since they require a log message id, and variable values can only be included in a message by adding attributes which is verbose. It would be nice to have a dedicated macro for printf debugging that provides the ability to print a string without providing a message id or any named attributes, since those are features are mostly relevant for production log messages and machine readability. They should not be necessary for local debugging. We could add a macro like logv2d() that behaves similarly to the old logging system. It would automatically use a default message id (since we have relaxed id checks in SERVER-46364) and provide an ability to use the stream operator to construct log messages e.g.

logv2d() << "my log message, v1:" << 5 << ", v2: " << 6;

Preferably this macro would be lowercase, to make it even more ergonomic when writing out printf style debugging messages quickly. We could also consider statically asserting that this macro is only allowed in debug builds, to make sure people don't use this macro to log messages in production.



 Comments   
Comment by Githook User [ 02/May/20 ]

Author:

{'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}

Message: SERVER-47332 Add way to do unstructured logging for local development

(cherry picked from commit 60ed56e7246f862c5874f6c346d1b1ec6bc948a8)
Branch: v4.4
https://github.com/mongodb/mongo/commit/6495997f2ddc6a29d89bf5302035963d6eac0c4e

Comment by Githook User [ 24/Apr/20 ]

Author:

{'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}

Message: SERVER-47332 Add way to do unstructured logging for local development
Branch: master
https://github.com/mongodb/mongo/commit/60ed56e7246f862c5874f6c346d1b1ec6bc948a8

Comment by Billy Donahue [ 22/Apr/20 ]

Yes it seems good as a function. If it's a function, it can be given a real nonzero log id, and all the logd calls use that. This is similar to what we do with s2geometry log statements and other places where the logv2 system is being "hooked into" and all the statements have no real individual identity.

If it produces a valid log statement, we could let calls to it be checked into master branch. Even such ad-hoc instrumentation can be a valuable debug artifact, safely hiding behind severity levels or ifdefs until it's needed. This could be a consideration for later though.

Comment by Henrik Edin [ 22/Apr/20 ]

Update: As we will need a new API for this and it won't be committed to master I don't feel that it is particularly important that we follow the regular Structured Logging rules.

Unless anyone objects I will:

Make logd a function instead of a macro so it can be in all lower-case.

Log the formatted string as the message string and not have any attributes. It will result in the JSON output being slightly shorter with less metadata.

The final output will be as in your original example william.schultz:

{"t":{"$date":"2020-04-07T21:19:49.470-04:00"},"s":"I", "c":"REPL",    "id":0,      "ctx":"OplogApplier-0","msg":"here's value 1: 12, and value 2: hello"}

Comment by William Schultz (Inactive) [ 22/Apr/20 ]

Awesome! Looks great to me.

Comment by Henrik Edin [ 22/Apr/20 ]

Like regular LOGV2 macros but without ""_attr=.

So exactly like your example but with a different macro name

int val1 = 12;
std::string val2 = "hello";
BSONObj obj = ...;
struct UserType
{
    UserType() = default;
    BSONObj toBSON() const;
};
LOGD("here's value 1: {}, and value 2: {}, 3: {}, 4: {}", val1, val2, obj, UserType());

Comment by William Schultz (Inactive) [ 22/Apr/20 ]

Sounds good to me. My only question is about how you write attributes. Can you give a few examples of log messages with attributes included using this macro? I'm unclear what the ... represents in LOGD(MESSAGE, ...).

Comment by Henrik Edin [ 22/Apr/20 ]

I took over this ticket from Eric, here's my implementation plan for everyone to provide feedback for before I start the actual implementation:

Macro signature:

#define LOGD(MESSAGE, ...)

It will use a dummy ID and the default severity and component. I will not implement a LOGD_OPTIONS for further customizations in logging.

We don't need to have V2 suffixed, when V1 is properly cleaned up we should get rid of all v2 in the codebase.

I will add a lint check that prevents uses of this macro from being checked in. It is my understanding that this is just for quick and dirty prototyping.

It will log everything as a single attribute string, the backend will not be using fmt::format directly as we want the same rules on how to make types loggable to apply to this macro as the regular LOGV2 macros.

Comment by Eric Milkie [ 09/Apr/20 ]

As per Billy's suggestions, I'm going to try something along the lines of the proposal here.

Comment by Bruce Lucas (Inactive) [ 09/Apr/20 ]

Since we're moving away from format strings in msg in general could I suggest that the msg could just be something like "DBG", with "dbg"_attr holding the formatted debug message. This has the additional benefit that it might be easier to spot "DBG" visually and easier to grep.

Comment by Billy Donahue [ 09/Apr/20 ]

I will just say that the 'msg' must not be dynamically generated. We have a rule about that.
I was trying to be compliant with structured logging's conventions in my suggestion, so I picked format-string "{dbg}", and "dbg"_attr to hold the formatted string.

I think the number 2 in these LOGV2 macros is annoying (and 2 chars longer than necessary). When we have no more LOG statements we should technically be able to cut V2 off all the macro names.

Comment by Daniel Gottlieb (Inactive) [ 08/Apr/20 ]

If there are other people who would find this feature useful it would be good to hear their opinions as well.

I'm down with Will's suggestion. I also share the desires and sobering reality of casing.

Comment by William Schultz (Inactive) [ 08/Apr/20 ]

Ok, thanks for the info. Here's a concrete suggestion for a macro definition that I think is reasonable. If there are other people who would find this feature useful it would be good to hear their opinions as well.

// Definition.
#define LOGV2D(FMTSTR_MESSAGE, ...)                                       \
    ::mongo::logv2::detail::doLog(                                        \
        0, /* use a dummy message id */                                   \
        ::mongo::logv2::LogSeverity::Log(),                               \
        ::mongo::logv2::LogOptions{MongoLogV2DefaultComponent_component}, \
        fmt::format(FMTSTR_MESSAGE, ##__VA_ARGS__))
 
// Example usage.
int val1 = 12;
std::string val2 = "hello";
LOGV2D("here's value 1: {}, and value 2: {}", val1, val2);

This is a macro that I got to compile and run locally, so there may be a simpler way to implement it, but I think the macro interface is reasonable. Using a default log message id seems fine to me since we don't check message id uniqueness at compile time, only during lint. The output from the example log message looks like the following:

{"t":{"$date":"2020-04-07T21:19:49.470-04:00"},"s":"I", "c":"REPL",    "id":0,      "ctx":"OplogApplier-0","msg":"here's value 1: 12, and value 2: hello"}

To really nitpick, I'd still prefer a lowercase macro but I'm ok with giving that up to be consistent in our macro style. If the macro is uppercase, though, then I'd also vote for a macro name that doesn't include a number in it, like LOGD, since I always get tripped up and type LOGV2 as LOGV@ since I'm holding "Shift" for the uppercase characters. Again, this is a really small nitpick but I figured it's worth mentioning. I'd be fine with the macro I proposed above, if we think that is reasonable and satisfies the necessary use cases for the interested parties.

Comment by Billy Donahue [ 07/Apr/20 ]

I'll try to address a few of the requests in here individually and maybe there's something we can do to help, though it may not end up being what you're asking for. Structured logs are brand new, and we are going to need a handful of little improvements and utilities as we go. We had to add DynamicAttributes on the fly, and make Lobster viewer upgrades for reading the noisy JSON lines.

Syntactically, this `logv2d()` isn't like printf, and it's not libfmt/std::format, it's ostreams. You use printf in quotes to say it doesn't matter, but I think it does matter what we go with, since logv2 doesn't use printf nor ostream syntax. The syntax matters, so I'd like to know what you're proposing more concretely. The macro name is just appearing at the beginning of your logging expression because you are using ostream, otherwise it would have to be an enclosing function-like macro.

You want to generate ephemeral log message IDs. Maybe share a tool that can do this? Like you write some magic keyword into your code as you go and maybe a buildscripts/ tool walks your git tree replacing that keyword with sequential log ids. There are many tool designs possible. I think that's a worthy usability problem to solve.

Your request is strictly syntactical and on the authoring side, since you don't mention output format. So if these statements GENERATED structured logs you'd be okay with that, you just don't like the syntax we use to generate them I think?

You have a request to not have to use ""_attr suffix to make attributes.  There are other ways to make an AttributeStorage object. If you think of a good syntax, let's see it! I think whatever you're printing will need to be at least a LITTLE self-describing, like your "v1" and "v2", and these might as well be attribute names.

Maybe something very short can expand into a log message containing a single string, which would take some of the drudgery out of it. Like:

#define LOGV2_FMT(id, fmt, ...) LOGV2(id, "{dbg}", "dbg"_attr = fmt::format(fmt, __VA_ARGS__))

(and you'd still have to come up with your own `id` and write a Python-style format string.)

Example:
    auto v1 = 20;
    auto v2 = Hours{24};
    LOGV2_FMT(NEWID, "{:0}, {:0}, {:1} to go", v1, v2);

Where NEWID is a mark that a script knows to look for and replace.

PS: no lowercase macros please.
PPS: Your example has attribute names v1 and v2 already, so the LOGV2 equivalent would be about the same in effort.

These are just an acknowledgement of the usability problem and quick thoughts on possible approaches to mitigate it. I feel we all "own" this stuff and should feel empowered to contribute usability fixes when something's annoying to use. But let's get to a more concrete proposal.

Generated at Thu Feb 08 05:13:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.