Linux kernel memory management strategy (readahead) on mmapped MongoDataFiles, controllable from the shell with a madvise command per DB


    • Type: New Feature
    • Resolution: Won't Fix
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: MMAPv1, Storage
    • Team: Storage Execution

      Scenario:

      • 4 GB of memory
      • Single MongoDB instance on Amazon EC2
      • EBS volume with 1,000 provisioned IOPS
      • A big collection (tens of gigabytes) of ~4 KB documents, with only the default _id index on it
      • Completely random queries by _id (uniformly distributed _ids)

      In this scenario, I measured the following:

      • Almost every query causes a page fault
      • A query should cause at most two page faults, given the 4 KB page size and the ~4 KB document size
        BUT
      • The kernel uses readahead on page faults by default, especially when the MADV_SEQUENTIAL advice was given.
        This means far more reads from storage than are actually needed. On Amazon EBS, with a single connection and a single mongod instance, I could query documents at ~1.5 MB/sec net while the EBS volume was doing ~40 MB/sec. (A minimal madvise sketch follows below.)
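
      For context, here is a minimal, self-contained sketch (not taken from the attached patch; the file path is hypothetical) of the mechanism involved: mmap(2) a data file and switch its readahead behavior with madvise(2):

      #include <sys/mman.h>
      #include <fcntl.h>
      #include <unistd.h>
      #include <cstdio>

      int main() {
          const char* path = "/data/db/test.0";  // hypothetical data file path
          int fd = open(path, O_RDONLY);
          if (fd < 0) { perror("open"); return 1; }

          off_t len = lseek(fd, 0, SEEK_END);    // map the whole file
          void* p = mmap(nullptr, (size_t)len, PROT_READ, MAP_SHARED, fd, 0);
          if (p == MAP_FAILED) { perror("mmap"); return 1; }

          // MADV_RANDOM: expect random access and disable readahead, so a
          // page fault brings in roughly one page instead of a whole
          // readahead window.
          if (madvise(p, (size_t)len, MADV_RANDOM) != 0) perror("madvise");

          // ... random point reads now touch ~1-2 pages per ~4 KB document ...

          munmap(p, (size_t)len);
          close(fd);
          return 0;
      }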

      I made a small patch that makes it possible to set one of the MADV_SEQUENTIAL, MADV_NORMAL, or MADV_RANDOM flags on a given database's mmapped MongoDataFiles.
      When I set MADV_RANDOM with db.$cmd.findOne({madvise: "RANDOM"}); I was able to query at ~3 MB/sec net while the EBS volume was doing only ~4 MB/sec. That was a big improvement in my case.
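
      The attached patch is not reproduced here; the following is only a hedged sketch of the idea, with hypothetical names (MappedRegion and adviseDatabase are illustrative, not MongoDB internals): translate the command argument into a madvise flag and apply it to every mmapped region backing the database's data files.

      #include <sys/mman.h>
      #include <string>
      #include <vector>

      // One mmapped region backing a data file of the database (hypothetical).
      struct MappedRegion { void* base; size_t length; };

      // Map the command argument ("NORMAL" / "SEQUENTIAL" / "RANDOM") to the
      // corresponding madvise flag and apply it to every region.
      bool adviseDatabase(const std::vector<MappedRegion>& regions,
                          const std::string& mode) {
          int advice;
          if      (mode == "NORMAL")     advice = MADV_NORMAL;
          else if (mode == "SEQUENTIAL") advice = MADV_SEQUENTIAL;
          else if (mode == "RANDOM")     advice = MADV_RANDOM;
          else return false;                      // unknown mode

          for (const MappedRegion& r : regions) {
              if (madvise(r.base, r.length, advice) != 0)
                  return false;                   // surface the failure to the caller
          }
          return true;
      }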

      I thought it was worth sharing this patch; if anybody ever has the same problem, it could help, or be a good starting point.

      I have only tested it on Ubuntu x86_64 for 1-2 hours, so care must be taken.

        1. madvise.patch (6 kB, Pék Dániel)

            Assignee:
            [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            Pék Dániel
            Votes:
            0
            Watchers:
            5
