[SERVER-17357] Add flag to enable logging in JSON format Created: 23/Feb/15 Updated: 06/Dec/22 Resolved: 08/Apr/20
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Logging |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.0 |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Travis Redman | Assignee: | DO NOT USE - Backlog - Dev Tools |
| Resolution: | Done | Votes: | 33 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Developer Tools |
| Backwards Compatibility: | Fully Compatible |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
The mongo logs have a lot of varied information and, as a result, lack a consistent structure. This makes writing accurate and efficient parsers difficult because your parser has to understand how the line will be printed in each context. For instance, queries are logged very differently from updates:
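An illustrative query line (a reconstruction in the style of the legacy format, not a verbatim excerpt from the original report):

```
2015-02-23T12:00:00.123-0500 I QUERY    [conn11] query test.users query: { age: { $gt: 30.0 } } planSummary: COLLSCAN ntoreturn:0 ntoskip:0 nscanned:1000 nscannedObjects:1000 keyUpdates:0 numYields:7 nreturned:42 reslen:4620 112ms
```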
vs.
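An illustrative update line (again a reconstruction):

```
2015-02-23T12:00:01.456-0500 I WRITE    [conn11] update test.users query: { _id: 1 } update: { $set: { status: "active" } } nscanned:1 nscannedObjects:1 nMatched:1 nModified:1 keyUpdates:0 numYields:0 0ms
```

Note that the two lines share almost no structure beyond the timestamp/severity/component/context prefix.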
Outside of the realm of op queries, the lines get even more diverse:
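For example (illustrative reconstructions):

```
2015-02-23T12:00:02.789-0500 I REPL     [ReplicationExecutor] Member example.host:27017 is now in state SECONDARY
2015-02-23T12:00:03.001-0500 I NETWORK  [initandlisten] connection accepted from 10.0.0.5:51234 #12 (3 connections now open)
```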
This type of data seems like a good candidate for JSON formatting. It would be helpful to have a config option for having all log lines printed in JSON. This would make parsing straightforward and consistent regardless of the presence or absence of various fields.
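A minimal sketch of what such a JSON-formatted line could look like (the field names and values here are illustrative; they loosely follow the structured format that eventually shipped in 4.4):

```
{"t":{"$date":"2020-04-08T12:00:00.123-04:00"},"s":"I","c":"NETWORK","id":22943,"ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.0.0.5:51234","connectionId":12,"connectionCount":3}}
```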
| Comments |
| Comment by Bryce Jones [ 07/Feb/20 ] |
If changing things from the inside out is so difficult, can you add a grok parser that can at least make sense of the logs? I understand changing everything to JSON is tough. However, if you tried to parse your own logs, you would say: wow, these are really non-standard and vary wildly from log to log. Then you could make some changes to your logs to make them more uniform, and maybe this need goes away as they become simple to parse.
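A minimal sketch of such a pattern, assuming standard logstash grok syntax (the field labels timestamp, severity, component, context, and message are illustrative choices, not an official pattern):

```
%{TIMESTAMP_ISO8601:timestamp} %{WORD:severity} %{NOTSPACE:component}\s+\[%{DATA:context}\] %{GREEDYDATA:message}
```

This captures the fixed prefix of the legacy format, but it leaves everything after the context as one unstructured message field, which is exactly the problem being described.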
| Comment by David Lambert [ 06/Feb/20 ] |
So, this has moved back to "Backlog"? I was really looking forward to it being released in 4.3... Do you have an updated release date?
| Comment by Luc Charbonneau [ 25/Nov/19 ] |
Side note: it would be nice if the JSON logs could be streamed remotely through Kafka or other means.
| Comment by Andrew Morrow (Inactive) [ 11/Apr/19 ] |
Thanks for letting us know that this issue is important to you. My team is actually in the process of planning our upcoming work for the next release, and a project to address this issue is among the items we are considering. As I'm sure you can imagine though, there are many potential projects competing for finite development resources. If we do end up pursuing the project, it will not be a simple change. The issue is that almost all logging in the server codebase is written in the following style:
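```cpp
// Illustrative reconstruction of the stream-based style described here; the
// exact snippet was not preserved in this export, and "port" is a stand-in
// variable.
log() << "waiting for connections on port " << port;
```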
Here, the object returned by the call to log() is effectively a C++ std::ostream, and the << operators are the standard C++ streaming operators; they incrementally build a std::string, which is then logged as a single item. And there are thousands of such calls. In order to log to other formats like JSON, this needs to change, since once we have stringified and concatenated the arguments it is too late to change the format.

Non-exhaustively, and for the purpose of illustration, here are three approaches we might take to change that. We could:

1) Manually re-write all of the logging statements in the codebase to target a wholly new logging framework that is not based on streams.

Each of these approaches has its challenges (option one is probably a non-starter), so a proper design will need to be written, probably with some proof-of-concept work as well to help steer the design choices. It is also important that we ensure that changes to our logging system do not have adverse performance consequences, particularly when dealing with debug logging statements. Such statements are almost always disabled in production, so they need to be close to zero overhead, especially in hot paths through the database.

The various support and engineering teams at MongoDB also spend a lot of time reading and analyzing MongoDB log files. Making this change would not only benefit our customers, but would be invaluable for us as well, so there is genuine interest in making this change purely on the technical merits.

I hope that helps clarify the state of this request. Any specific input or requests you might have regarding the implementation are most welcome. Thanks again for the helpful feedback!
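For a sense of what a non-stream-based call looks like, here is a minimal sketch in the style of the structured logging (logv2) API that eventually shipped in 4.4; the message id and attribute name are illustrative, not taken from the server codebase:

```cpp
// Arguments stay as typed, named attributes until an output format (plain
// text or JSON) is chosen, instead of being stringified up front.
// The id 99999 and the "port" attribute are illustrative.
LOGV2(99999, "Waiting for connections", "port"_attr = port);
```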
| Comment by David Lambert [ 11/Apr/19 ] |
I know my tiny input to this isn't going to make Mongo wake up and do the right thing, but here it is. I too have spent WAY too much time writing a logstash/GROK parser for the logs, and it's ugly and doesn't catch everything. But it does get the major job done. The idea that a document database company can't write logs out in a document form is just crazy, really. I would almost be willing to bet that they are stored internally in JSON. Sadly, unless they find a way to monetize this, they aren't going to do the right thing...
| Comment by Petr Kukrál [ 11/Apr/19 ] |
+1 for this. We are in the age of OpenShift and EFK, where we send the logs of our applications in JSON format. That is far easier and more manageable than creating parsing rules for every specific type of log.
| Comment by Nicholas Whitehead [ 16/Nov/18 ] |
If this issue moves, please add JSON logging to mongos as well.
| Comment by Paul Frischknecht [ 19/Feb/18 ] |
https://github.com/mongodb-js/log attempts to provide a parser, which is itself some motivation to implement this.
| Comment by Anton [X] [ 06/Feb/18 ] |
Any news on this task?
| Comment by Mike Zraly [ 19/Nov/17 ] |
Agreed. I have spent those hours parsing MongoDB logs, and it is a frustrating experience, both because of the time I had to spend dealing with the crazy format (query too long to print? we'll overwrite the middle with a 10-period ellipsis and break parseability!) and the time my parsing code has to spend working around it.

Why parse logs at all, you ask? First, to get detailed collection- and query-level data without incurring the expense of enabling the profiler. I can set the slow query threshold to 0 if I want to look at everything, and that's less of an issue to me than enabling the profiler because log files go to a separate disk, so there is no contention for IOPS. Second, to get data that I can't get from the serverStatus or collStats commands. For example, which collections or queries are seeing lock contention (acquireWaitCount, timeAcquiringMicros)? That data is only provided at the server level, in the serverStatus locks document.

It's also kind of ironic that everything else in MongoDB is so JSON-centric, but not the logs. Besides, think about how much easier it would be for you to implement per-query analytics in Ops Manager or Atlas if you had an easy-to-parse log format that you could extend without breaking parsing code. Isn't that the promise of using a document format like JSON in the database to begin with?

You could then take things one step further and imagine a downstream application that takes the JSON-formatted logs and writes them, or a subset of them, to a capped collection in another MongoDB installation where they can be easily queried. Maybe you could use this to implement something like AWS RDS Performance Insights (https://aws.amazon.com/rds/performance-insights/) without having to enable the profiler. It's all goodness, really.
| Comment by Yauhen Artsiukhou [ 19/Nov/17 ] |
Up! Either implement a strict log format for everything, or make it possible to change the format type, e.g. emit each log entry as JSON.
Here the <component> field has padding, so I must read it with something like \s+ instead of a single space, as in these illustrative lines:
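```
2017-11-19T10:00:00.000+0000 I NETWORK  [initandlisten] waiting for connections on port 27017
2017-11-19T10:00:00.001+0000 I -        [conn1] end connection 127.0.0.1:54321 (2 connections now open)
```

(Reconstructed examples, not verbatim lines: note how the component column, NETWORK versus -, is right-padded to a fixed width, so the separator before the [context] field is a variable number of spaces.)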
Because of these crappy logs, I have my own grok. It is huge, it is ugly, and it doesn't work well.
You would only have to spend a couple of hours to implement JSON, or some other format from which every field can be extracted exactly.
| Comment by Yauhen Artsiukhou [ 15/Jun/17 ] |
The same issue: it is impossible to work with MongoDB logs. One more case is when you get multiple lines and garbage with a long query in the logs, like this illustrative (reconstructed) example:
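```
2017-06-15T09:00:00.123+0000 I COMMAND  [conn42] command app.events command: find { find: "events", filter: { tags: { $in: [ "aaaa", "bbbb", "cccc", .........., "zzzz" ] } } } planSummary: COLLSCAN docsExamined:100000 numYields:781 nreturned:3 reslen:215 protocol:op_query 2317ms
```

(The 10-period ellipsis is where the server truncates an over-long query, leaving the filter unparseable.)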
| Comment by pravin dwiwedi [ 28/Mar/17 ] |
Nowadays JSON is supported by almost all applications/parsers. I want to use the ELK stack for our log monitoring. Could you please let us know if you have anything in the pipeline to provide logs in JSON format?
| Comment by Feng Yu [ 11/Dec/15 ] |
+1 for this feature. The MongoDB log is hard to parse, and logstash grok patterns cannot handle it well.