Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-14053

implementation of server logRotate command breaks O_APPEND semantics which breaks the logrotate utility's copytruncate

    • Type: Icon: Bug Bug
    • Resolution: Won't Fix
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Logging
    • None
    • ALL

      I am seeing a totally bizarre interaction with mongod and logrotate's copytruncate which I could only reproduce occasionally where the truncated file would be sparse and full of NULL bytes till the point it got rotated. It would only happen on some boxes. After some digging it turns out that running logRotate at any point in time breaks O_APPEND and all writes keep going to the last_offset that was written to instead.

      • logrotate is configured to rotate mongod.log when it hits 1G
      • logrotate runs through a cron job every 15 min
      • since SIGUSR1 doesn't reopen the file I make logrotate do copytruncate instead. That just moves mongod.log to mongod.log1. logrotate does something like:
      while ((cnt = read(fdcurr, buf, sizeof(buf))) > 0)
          write(fdsave, buf, cnt)
      
      ftruncate(currLog, 0)
      
      • Now since I have the logappend option to mongod enabled I'd expect it to do O_APPEND on open. Which it does:
      open("/home2/akshay/logtest/mongod.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 4
      

      All good so far. Now I'd expect the rotated log file to be size 0. Since we opened with O_APPEND the file_offset in the file table entry will be reset to 0 on the next write since that's the new size in the i-node table entry.

      Instead what I get is a sparse file with all null bytes at the beginning. So if logrotate happened to run at 1.2G I have a sparse mongod.log that is 1.2G. Doesn't take up any extra space yet cause ext4 supports sparse files which is all fine till the next iteration.

      Now when logrotate comes around this time and starts copying, the new rotated file (mongod.log.x) is exploded, it's not longer sparse. The original logfile (mongod.log) is now 2.x G and sparse and the cycle continues with each subsequent rotated file (mongod.log.x) getting bigger and bigger until it runs out of diskspace.

      The only way for this to happen is if something keeps writing to the last_offset thereby bypassing the O_APPEND semantics. Turns out it was only happening on boxes where I had previously run the server's logRotate command:

      db.runCommand({logRotate:1})
      

      To reproduce:

      • start a mongod with --logappend
      • let it log for a bit, at least over 4k or so, you can increase verbosity
      • run ftruncate mongod.log and do head -n 2 mongod.log, everything should be fine and you should see the first couple of lines
      • let mongod log for a bit, do db.runCommand( {logRotate:1}

        ) and again let it log for a bit

      • do ftruncate mongod.log another time

      You should now see a sparse file, ls -lhs should show you the size on disk vs the actual size. Now do od -c mongod.log and you should see something like the following.

      All bytes till 0160220 are NULL.

      [akshay@maverick logtest]$ od -c mongod.log
      0000000  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
      *
      0160220   2   0   1   4   -   0   5   -   2   5   T   1   2   :   1   8
      0160240   :   5   9   .   5   6   9   -   0   4   0   0       [   j   o
      0160260   u   r   n   a   l   ]       _   g   r   o   u   p   C   o   m
      

            Assignee:
            Unassigned Unassigned
            Reporter:
            akshay@mongodb.com Akshay Kumar
            Votes:
            2 Vote for this issue
            Watchers:
            12 Start watching this issue

              Created:
              Updated:
              Resolved: