Type: Improvement
Resolution: Won't Do
Priority: Major - P3
Affects Version/s: None
Component/s: Sharding
Sharding EMEA
3
When a chunk move fails, mongod leaves behind files named "preCleanup.$timestamp.bson" inside $DBPATH/moveChunk. Often the reason the chunk move fails is not a transient condition, so the move fails again on every attempt until the root cause is fixed. When the balancer is enabled, it chooses to move the same chunk to the same destination over and over, failing each time. Every failure places more preCleanup files on disk, and they never get reaped. Over a short period of time (say, a day), this can easily use up all of the available inodes on that filesystem.
We had this happen over the weekend. Once all the inodes are used, mongod exits and fails to restart until inodes are available again. This seems like non-ideal behavior, and I think it would be much better if the preCleanup files were also cleaned up after a failed chunk move instead of being allowed to accumulate on disk.
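To illustrate the kind of cleanup being asked for, here is a minimal sketch of an external reaper that an operator could run as a stopgap. It is not part of MongoDB. The function names, the age threshold, and the assumption that the files may sit in subdirectories of $DBPATH/moveChunk are all hypothetical. It also assumes that preCleanup files older than the threshold are no longer needed, which you would have to confirm for your own deployment before deleting anything.

```python
import os
import time
from pathlib import Path


def find_stale_precleanup(move_chunk_dir, max_age_seconds, now=None):
    """Return preCleanup.*.bson files older than max_age_seconds, sorted by path."""
    now = time.time() if now is None else now
    stale = []
    # rglob also searches subdirectories (assumption: files may be nested).
    for path in Path(move_chunk_dir).rglob("preCleanup.*.bson"):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            stale.append(path)
    return sorted(stale)


def free_inode_fraction(path):
    """Fraction of inodes still available on the filesystem holding path (Unix only)."""
    st = os.statvfs(path)
    # Some filesystems report 0 total inodes; treat that as "not limited".
    return st.f_favail / st.f_files if st.f_files else 1.0


def reap(paths, dry_run=True):
    """Delete the given files unless dry_run is set; return the paths handled."""
    handled = []
    for p in paths:
        if not dry_run:
            p.unlink()
        handled.append(p)
    return handled
```

A cron job could call `free_inode_fraction` on the dbpath and, when it drops below some chosen level, call `reap(find_stale_precleanup(...), dry_run=False)`. Running with `dry_run=True` first shows what would be deleted without touching the disk.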