Core Server / SERVER-28693

Use getrusage to collect major page faults in serverStatus.extra_info on Linux

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5.10
    • Component/s: None
    • Backwards Compatibility:
      Fully Compatible
    • Sprint:
      Platforms 2017-07-10
    • Linked BF Score:
      0

      Description

      To return the "page faults" field of serverStatus.extra_info, the server reads the /proc/self/stat file and parses it for just a single field. It's cheaper to simply call getrusage instead.

      This should eliminate the stalls seen while collecting "extra_info".
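
A minimal sketch of the two approaches, for comparison. The helper names majorFaultsFromRusage and majorFaultsFromProcStat are hypothetical and are not taken from the MongoDB source. The sketch assumes Linux, where getrusage fills ru_majflt and where majflt is field 12 of /proc/self/stat according to proc(5).

```cpp
#include <sys/resource.h>  // getrusage, struct rusage

#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: major page faults of the calling process, obtained
// with a single getrusage call. Returns -1 if the call fails.
long majorFaultsFromRusage() {
    struct rusage usage {};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_majflt;
}

// Hypothetical helper: the same counter read from /proc/self/stat.
// Field 2 (comm) is wrapped in parentheses and may contain spaces, so
// parsing starts after the last ')'. Returns -1 if anything goes wrong.
long majorFaultsFromProcStat() {
    std::ifstream in("/proc/self/stat");
    std::string line;
    if (!std::getline(in, line)) {
        return -1;
    }
    const std::string::size_type pos = line.rfind(')');
    if (pos == std::string::npos) {
        return -1;
    }
    std::istringstream rest(line.substr(pos + 1));
    std::string token;
    // Fields 3 (state) through 12 (majflt) follow the comm field.
    for (int field = 3; field <= 12; ++field) {
        if (!(rest >> token)) {
            return -1;
        }
    }
    try {
        return std::stol(token);
    } catch (...) {
        return -1;
    }
}
```

The getrusage version makes one call and involves no file I/O or string parsing, which is the saving the description refers to.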

      Example stall:

      [js_test:fsm_all_sharded_replication_with_balancer] 2017-02-13T19:17:48.812+0000 d20010| 2017-02-13T19:17:48.811+0000 I COMMAND  [ftdc] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after dur: 0, after encryptionAtRest: 0, after extra_info: 2546, after globalLock: 2546, after locks: 2546, after network: 2546, after opLatencies: 2546, after opcounters: 2546, after opcountersRepl: 2546, after repl: 2546, after security: 2546, after storageEngine: 2546, after tcmalloc: 2546, after wiredTiger: 2546, at end: 2546 }
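
One way to read the log line above is to treat each "after <section>" value as cumulative elapsed milliseconds, which the flat run of 2546 following extra_info suggests. Under that assumption, the section with the largest increase over its predecessor is the one that stalled. The helper slowestSection below is hypothetical and only illustrates the arithmetic.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: given (section name, cumulative ms) pairs in log
// order, return the section with the largest increase and that increase.
std::pair<std::string, long> slowestSection(
    const std::vector<std::pair<std::string, long>>& cumulative) {
    std::pair<std::string, long> worst("", 0);
    long previous = 0;
    for (const auto& entry : cumulative) {
        const long delta = entry.second - previous;
        if (delta > worst.second) {
            worst = std::make_pair(entry.first, delta);
        }
        previous = entry.second;
    }
    return worst;
}
```

Applied to the timings in the example stall, the whole 2546 ms is attributed to extra_info, since every earlier section reports 0 and every later one reports the same 2546.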
      

      Source
      https://github.com/mongodb/mongo/blob/b3e59f004f541242e1778efcbbca704c9f174890/src/mongo/util/processinfo_linux.cpp#L453-L459
