Core Server / SERVER-46911

Allow logging to pipe

    • Type: Improvement
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: 4.3.4
    • Component/s: None
    • Service Arch
    • Fully Compatible

      It is my understanding that in server 4.4 the only supported log format is JSON. While this format has indisputable benefits when parsed by machines, its readability for human beings is often significantly inferior to the old text format.

      For example, the following log output is produced when a 4.2 server starts:

      2020-03-17T00:19:19.719-0400 I  SHARDING [initandlisten] Marking collection admin.system.roles as collection version: <unsharded>
      2020-03-17T00:19:19.719-0400 I  SHARDING [initandlisten] Marking collection admin.system.version as collection version: <unsharded>
      2020-03-17T00:19:19.719-0400 I  STORAGE  [initandlisten] createCollection: local.startup_log with generated UUID: 7206169f-38d1-4217-a27a-0341d594fbbe and options: { capped: true, size: 10485760 }
      2020-03-17T00:19:19.875-0400 I  INDEX    [initandlisten] index build: done building index _id_ on ns local.startup_log
      2020-03-17T00:19:19.876-0400 I  SHARDING [initandlisten] Marking collection local.startup_log as collection version: <unsharded>
      2020-03-17T00:19:19.876-0400 I  FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/mnt/zram/mongodb/4.2-rs/ruby-driver-rs/rs1/db/diagnostic.data'
      2020-03-17T00:19:19.876-0400 I  STORAGE  [initandlisten] createCollection: local.replset.oplogTruncateAfterPoint with generated UUID: a21115fb-ec78-4889-9d24-d84e9815ca43 and options: {}
      2020-03-17T00:19:20.042-0400 I  INDEX    [initandlisten] index build: done building index _id_ on ns local.replset.oplogTruncateAfterPoint
      

      And the following is the same output from a 4.4 server:

      {"t":{"$date":"2020-03-15T20:08:31.317-0400"},"s":"I", "c":"INDEX",   "id":20345,"ctx":"initandlisten","msg":"index build: done building index {indexName} on ns {nss}","attr":{"indexName":"_id_","nss":"local.startup_log"}}
      {"t":{"$date":"2020-03-15T20:08:31.326-0400"},"s":"I", "c":"FTDC",    "id":20625,"ctx":"initandlisten","msg":"Initializing full-time diagnostic data capture with directory '{path_generic_string}'","attr":{"path_generic_string":"/mnt/zram/mongodb/4.4-rs/ruby-driver-rs/rs1/db/diagnostic.data"}}
      {"t":{"$date":"2020-03-15T20:08:31.327-0400"},"s":"I", "c":"STORAGE", "id":20320,"ctx":"initandlisten","msg":"createCollection: {nss} with {generatedUUID_generated_provided} UUID: {optionsWithUUID_uuid_get} and options: {options}","attr":{"nss":"local.replset.oplogTruncateAfterPoint","generatedUUID_generated_provided":"generated","optionsWithUUID_uuid_get":{"uuid":{"$uuid":"ad7a88e7-8ff1-42bf-bc9a-75c7526e6cfe"}},"options":{}}}
      {"t":{"$date":"2020-03-15T20:08:31.499-0400"},"s":"I", "c":"INDEX",   "id":20345,"ctx":"initandlisten","msg":"index build: done building index {indexName} on ns {nss}","attr":{"indexName":"_id_","nss":"local.replset.oplogTruncateAfterPoint"}}
      {"t":{"$date":"2020-03-15T20:08:31.500-0400"},"s":"I", "c":"STORAGE", "id":20320,"ctx":"initandlisten","msg":"createCollection: {nss} with {generatedUUID_generated_provided} UUID: {optionsWithUUID_uuid_get} and options: {options}","attr":{"nss":"local.replset.minvalid","generatedUUID_generated_provided":"generated","optionsWithUUID_uuid_get":{"uuid":{"$uuid":"eef6f8db-fcea-49c8-a4b5-c6128d6118e9"}},"options":{}}}
      {"t":{"$date":"2020-03-15T20:08:31.708-0400"},"s":"I", "c":"INDEX",   "id":20345,"ctx":"initandlisten","msg":"index build: done building index {indexName} on ns {nss}","attr":{"indexName":"_id_","nss":"local.replset.minvalid"}}
      {"t":{"$date":"2020-03-15T20:08:31.741-0400"},"s":"I", "c":"STORAGE", "id":20320,"ctx":"initandlisten","msg":"createCollection: {nss} with {generatedUUID_generated_provided} UUID: {optionsWithUUID_uuid_get} and options: {options}","attr":{"nss":"local.replset.election","generatedUUID_generated_provided":"generated","optionsWithUUID_uuid_get":{"uuid":{"$uuid":"fa4d0d0f-7a79-4494-975c-be59735e92f7"}},"options":{}}}
      {"t":{"$date":"2020-03-15T20:08:31.911-0400"},"s":"I", "c":"INDEX",   "id":20345,"ctx":"initandlisten","msg":"index build: done building index {indexName} on ns {nss}","attr":{"indexName":"_id_","nss":"local.replset.election"}}
      {"t":{"$date":"2020-03-15T20:08:31.911-0400"},"s":"I", "c":"REPL",    "id":21311,"ctx":"initandlisten","msg":"Did not find local initialized voted for document at startup."}
      {"t":{"$date":"2020-03-15T20:08:31.911-0400"},"s":"I", "c":"REPL",    "id":21312,"ctx":"initandlisten","msg":"Did not find local Rollback ID document at startup. Creating one."}
      {"t":{"$date":"2020-03-15T20:08:31.911-0400"},"s":"I", "c":"STORAGE", "id":20320,"ctx":"initandlisten","msg":"createCollection: {nss} with {generatedUUID_generated_provided} UUID: {optionsWithUUID_uuid_get} and options: {options}","attr":{"nss":"local.system.rollback.id","generatedUUID_generated_provided":"generated","optionsWithUUID_uuid_get":{"uuid":{"$uuid":"47dbd567-1290-46d5-84c2-e56785c023ae"}},"options":{}}}
      {"t":{"$date":"2020-03-15T20:08:32.049-0400"},"s":"I", "c":"INDEX",   "id":20345,"ctx":"initandlisten","msg":"index build: done building index {indexName} on ns {nss}","attr":{"indexName":"_id_","nss":"local.system.rollback.id"}}
      

      A quick glance at the 4.2 log is sufficient to determine what is happening (index builds). The 4.4 log is a wall of text from which I cannot quickly assess what is going on at all; in other words, I need to carefully read each log entry to figure out what it refers to.

      In my opinion the loss of readability is due to:

      1. The log entries have become much longer. This increases the amount of text that must be processed, while the effective payload remains the same.
      2. Because the entries are longer, the important information frequently falls outside the first 100 or so columns that a user likely has in view. The user either has to constantly scroll right or read the logs with lines wrapped; wrapping loses even the vertical alignment (timestamp, severity, facility, etc.) that the text format provided and makes the logs harder still to read.
      3. The information is no longer presented contiguously, because the formatting string is separated from the data. For example, "index build: done building index _id_ on ns local.startup_log" became "msg":"index build: done building index {indexName} on ns {nss}","attr":{"indexName":"_id_","nss":"local.startup_log"}. This means the reader must constantly jump back and forth within an entry as they read it.

      As an engineer who develops drivers for the server, as well as a system administrator who reads server logs to troubleshoot the server, I find that the loss of real-time logging in text format has a major negative effect on my productivity.

      There are various tools available for converting JSON logs to text (e.g. l2, https://github.com/markbenvenuto/mrlog). Ideally, I would like to send the logs to a program that would 1) write the JSON logs unmodified to one file, and 2) convert the JSON logs to text and write the result to another file. While I am unaware of a program that does specifically this, it seems like a straightforward problem to solve. The remaining challenge is getting the log to such a program.

      The server currently offers three options for log destination:

      • stdout
      • syslog
      • file

      Stdout would be the most suitable destination, since I could trivially pipe the logs to the log splitter program. Unfortunately, logging to stdout is not allowed when --fork is used, which is the subject of this ticket. It is possible to launch the server without --fork and implement the daemonization externally, but given that there appears to be an easy-to-implement alternative in the server that allows file/pipe output while forking, I believe the server should allow logging to stdout when --fork is used.

      Syslog logging is suitable for production installations where there is only a single mongod/mongos process that can be globally configured. It is not generally suitable for development: I can have over 20 mongod processes running across various deployments, including scratch ones, and configuring syslog for each of them is untenable.

      The file destination requires the log splitter to know when the server rotates the log, so that the splitter can reopen the current log file. I am not aware of a way for the server to notify the splitter of this event directly, although a workaround could be devised by watching for changes in the log directory. The splitter would also need to rename its own output files correctly when the server rotates logs, which would be error-prone.

      Expected behavior:

      As an engineer developing drivers for the server and as a system administrator, I would like mongod/mongos etc. to be able to log to a pipe, for example by logging to standard output which I will redirect to a pipe, even when the server forks.

      Implementation:

      I believe it will be sufficient to prohibit logging to stdout, when --fork is used, only when stdout is a tty. If stdout is not a tty, it has in my understanding been redirected to either a file or a pipe, so the logs will be consumed by something and will not be lost; prohibiting those cases does not seem necessary.
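The proposed rule fits in a few lines. A sketch in Python (the function name is hypothetical; in the server the check would live in the C++ option validation, and the tty test is the standard POSIX isatty()):

```python
import os

def stdout_logging_allowed(fork_requested, stdout_fd):
    """Hypothetical validation rule: with --fork, permit logging to
    stdout only when stdout is not a terminal, i.e. it has been
    redirected to a file or a pipe and the output will be consumed."""
    if not fork_requested:
        return True
    return not os.isatty(stdout_fd)
```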

            Assignee: backlog-server-servicearch [DO NOT USE] Backlog - Service Architecture
            Reporter: oleg.pudeyev@mongodb.com Oleg Pudeyev (Inactive)
            Votes: 1
            Watchers: 5