Core Server / SERVER-25986

Failed chunk moves should not leave behind files on disk

    • Type: Improvement
    • Resolution: Won't Do
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Sharding EMEA

      When a chunk move fails, mongod leaves behind files named "preCleanup.$timestamp.bson" inside $DBPATH/moveChunk. Often the move fails for a reason that is not transient, so every retry fails as well until the root cause is fixed. When the balancer is enabled, it chooses to move the same chunk to the same destination over and over, failing each time, and each failure places more preCleanup files on disk that are never reaped. Over a short period of time (say, a day), this can easily use up all of the available inodes on that filesystem.

      We had this happen over the weekend. Once all the inodes are used, mongod exits and fails to restart until inodes are available again. This seems like non-ideal behavior, and I think it would be much better if the preCleanup files were also cleaned up after a failed chunk move instead of being allowed to accumulate on disk.
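
      Until the server cleans these files up itself, an operator-side reaper is one possible stopgap. The following is only a minimal sketch, not anything shipped with MongoDB: the function names, the age threshold, and the recursive walk (in case the files sit in subdirectories of moveChunk) are all assumptions made for illustration. Only the "preCleanup.$timestamp.bson" naming and the $DBPATH/moveChunk location come from the report above.

```python
import os
import re
import time

# File names as quoted in the report: preCleanup.<timestamp>.bson
PRECLEANUP_RE = re.compile(r"^preCleanup\..+\.bson$")


def find_stale_precleanup(move_chunk_dir, max_age_seconds, now=None):
    """Return sorted paths of preCleanup files older than max_age_seconds.

    Hypothetical helper: walks move_chunk_dir recursively and compares
    each matching file's modification time against `now`.
    """
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(move_chunk_dir):
        for name in files:
            if not PRECLEANUP_RE.match(name):
                continue  # leave every other file alone
            path = os.path.join(root, name)
            if now - os.path.getmtime(path) > max_age_seconds:
                stale.append(path)
    return sorted(stale)


def reap(paths, dry_run=True):
    """Delete the given files unless dry_run is set; return the paths handled."""
    for path in paths:
        if not dry_run:
            os.remove(path)
    return list(paths)
```

      A cautious way to use it would be to call find_stale_precleanup("/path/to/dbpath/moveChunk", 24 * 3600) first, review the list with dry_run=True, and only then delete. These files may be the only saved copy of documents touched by the failed move, so whether they are safe to remove is a judgment call for the operator.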

            Assignee: backlog-server-sharding-emea [DO NOT USE] Backlog - Sharding EMEA
            Reporter: dai@foursquare.com Dai Shi
            Votes: 0
            Watchers: 9

              Created:
              Updated:
              Resolved: