SERVER-8966

DataFileSync thread and default value of syncdelay option don't work effectively on Linux.

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.4.0-rc0
    • Component/s: Storage
    • Labels:
    • Environment:
      Linux centos6 2.6.32-220.el6.x86_64

    • ALL

      This issue concerns only the behavior of mongod on Linux.

      As you know, mongod's storage engine uses mmap(), and its "DataFileSync" thread calls msync() to write changes back to the physical disk.

      In addition, we can control the interval between msync() calls with the "--syncdelay" option.

      The default value of syncdelay is 60, which means msync() is called once per minute.

      However, on recent Linux kernels, updated mapped pages are written back automatically by the kernel within roughly 30 seconds by default.

      We can confirm these settings with the commands below.

      $ sysctl vm.dirty_expire_centisecs
      vm.dirty_expire_centisecs = 3000
      $ sysctl vm.dirty_writeback_centisecs
      vm.dirty_writeback_centisecs = 500
      

      These values are in hundredths of a second.
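Because the values are in centiseconds, a small helper can make them easier to read. The following is only a minimal sketch: the function name `centisecs_to_secs` is hypothetical, and it assumes the settings are readable under /proc/sys/vm, as on typical Linux systems.

```shell
# Convert a value in centiseconds (hundredths of a second) to seconds.
# centisecs_to_secs is a hypothetical helper name, not part of any tool.
centisecs_to_secs() {
  printf '%d.%02d\n' "$(( $1 / 100 ))" "$(( $1 % 100 ))"
}

# Print both writeback settings in seconds, if /proc exposes them.
for key in dirty_expire_centisecs dirty_writeback_centisecs; do
  f="/proc/sys/vm/$key"
  [ -r "$f" ] || continue
  echo "vm.$key = $(centisecs_to_secs "$(cat "$f")") s"
done
```

With the values shown above, 3000 becomes 30.00 and 500 becomes 5.00.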

      My proposal

      1. Change the default value of syncdelay from 60 to 0.
        A value of 0 means that writeback timing is left to the operating system.
      2. Revise the manual accordingly.

      Supplementary explanations

      Details of mmap() behavior (mapped pages).

      1. Start mongod with syncdelay=0.
      2. When mongod updates part of its mapped memory, the kernel marks the affected page as dirty.
      3. Pages that have been dirty for longer than a certain period (vm.dirty_expire_centisecs = 30 seconds) are considered expired and must be written at the next opportunity.
      4. Next opportunity: the kernel wakes the writeback process at a regular interval (vm.dirty_writeback_centisecs = 5 seconds).
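The steps above can be turned into a rough estimate of when a dirty page reaches disk. This is a simplified model, not the kernel's actual algorithm: it assumes the writeback process wakes at exact multiples of the interval and writes every page that has expired by then. The name `flush_time_cs` is hypothetical, and all arguments are in centiseconds.

```shell
# Simplified model: the writeback thread wakes at t = 0, interval, 2*interval, ...
# and writes any page that has been dirty for at least `expire` centiseconds.
# flush_time_cs is a hypothetical name; all arguments are in centiseconds,
# and interval must be greater than 0.
flush_time_cs() {
  dirtied_at=$1 expire=$2 interval=$3
  eligible=$(( dirtied_at + expire ))
  # Round up to the next wakeup tick.
  echo $(( (eligible + interval - 1) / interval * interval ))
}

# A page dirtied 1.2 s after a wakeup, with the default settings:
flush_time_cs 120 3000 500
```

The call prints 3500, meaning the page is written 33.8 seconds after it was dirtied. In this model the delay always falls between the expire time and the expire time plus one interval; a real system may differ, for example under memory pressure or heavy I/O.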

      How to confirm

      1. Update a collection once per second.
        for i in {0..10000}; do
        mongo 127.0.0.1:27017 <<< "use testdb
        db.testcol.save({key:$i})";
        sleep 1
        done;
        
      2. Check the modification time of the DB file with stat.
         stat data/testdb/testdb.0 -c'%y'
        
        Timing of DB file updates (default kernel parameters)

        sysctl vm.dirty_expire_centisecs=3000
        sysctl vm.dirty_writeback_centisecs=500

        2013-03-13 00:20:57.441599513 -0700
        2013-03-13 00:21:04.838052872 -0700
        2013-03-13 00:21:29.125808067 -0700
        2013-03-13 00:21:34.408625047 -0700
        2013-03-13 00:21:42.857941700 -0700
        2013-03-13 00:21:59.761572768 -0700
        2013-03-13 00:22:00.817927798 -0700
        2013-03-13 00:22:28.286594599 -0700
              :
        
        Timing of DB file updates (kernel parameters modified to write back about every 5 seconds)

        sysctl vm.dirty_expire_centisecs=500
        sysctl vm.dirty_writeback_centisecs=100

        2013-03-13 00:23:09.481541037 -0700
        2013-03-13 00:23:14.760541494 -0700
        2013-03-13 00:23:20.040029026 -0700
        2013-03-13 00:23:24.267557606 -0700
        2013-03-13 00:23:29.551242097 -0700
        2013-03-13 00:23:34.832939305 -0700
        2013-03-13 00:23:35.888281140 -0700
        2013-03-13 00:23:40.109627939 -0700
        2013-03-13 00:23:45.389313622 -0700
        2013-03-13 00:23:50.671000401 -0700
        2013-03-13 00:23:51.726337540 -0700
           :
        
        Timing of DB file updates (kernel parameters modified to write back about every 1 second)

        sysctl vm.dirty_expire_centisecs=100
        sysctl vm.dirty_writeback_centisecs=100

        2013-03-13 00:28:44.298270234 -0700
        2013-03-13 00:28:45.353080480 -0700
        2013-03-13 00:28:46.409875411 -0700
        2013-03-13 00:28:47.467720763 -0700
        2013-03-13 00:28:47.467720763 -0700
        2013-03-13 00:28:48.527613888 -0700
        2013-03-13 00:28:49.583300450 -0700
        2013-03-13 00:28:50.640146646 -0700
        2013-03-13 00:28:51.697976987 -0700
        2013-03-13 00:28:52.754726703 -0700
        2013-03-13 00:28:53.809552595 -0700
          :
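The ticket does not show how the timestamp lists were collected. One possible way is to poll stat and print the modification time whenever it changes. The sketch below is an assumption along those lines: `watch_mtime` is a hypothetical helper, and it relies on the `-c` option of GNU stat, as used in the command above.

```shell
# Print a file's modification time each time it changes.
# Usage: watch_mtime FILE POLLS [SLEEP_SECS]
# watch_mtime is a hypothetical helper; it relies on GNU stat's -c option.
watch_mtime() {
  file=$1 polls=$2 pause=${3:-1}
  last=""
  while [ "$polls" -gt 0 ]; do
    now=$(stat -c '%y' "$file") || return 1
    if [ "$now" != "$last" ]; then
      echo "$now"
      last=$now
    fi
    polls=$(( polls - 1 ))
    sleep "$pause"
  done
}
```

For example, `watch_mtime data/testdb/testdb.0 600` would poll the data file once per second for about ten minutes while the update loop from step 1 runs in another terminal.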
        

            Assignee:
            dan@mongodb.com Daniel Pasette (Inactive)
            Reporter:
            crumbjp Hiroaki
            Votes:
            0
            Watchers:
            9
