[SERVER-3791] compact always creates a new extent Created: 09/Sep/11 Updated: 11/Jul/16 Resolved: 04/May/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 2.0.6, 2.1.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Aaron Staple | Assignee: | Eric Milkie |
| Resolution: | Done | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
It looks like the compact command will create a new extent, at least when there isn't an extent on the free list. For large collections, the new extents will be large and will usually require a new datafile to be allocated every time compact is run. Experimentally, I ran several compact commands back to back and saw a new file get allocated every time or almost every time. This was after hitting the max file size, and it seemed that compact was creating a new extent rather than reusing a free list extent in this case. |
| Comments |
| Comment by auto [ 10/May/12 ] |
|
Author: Eric Milkie (milkie, milkie@10gen.com)
Message: The 'compact' operation sequentially copies all the records in the namespace into a new extent. Since this operation creates no new data, |
| Comment by Eric Milkie [ 19/Dec/11 ] |
|
Resolving this after further discussion. |
| Comment by auto [ 01/Dec/11 ] |
|
Author: Eric Milkie (milkie, milkie@10gen.com)
Message: The 'compact' operation sequentially copies all the records in the namespace into a new extent. Since this operation creates no new data, The compact speed unit test's time comparison seemed backwards to me, so I fixed it and made it a real assert. |
| Comment by Eric Milkie [ 30/Nov/11 ] |
|
Ok, new methodology. I restored the assert for disk space prior to removing all the indexes. Just prior to this, I set the lastExtentSize down to 0, so that the next extent that gets allocated is the initial extent size. I argue this is the correct thing to do: we are going to end up copying all the records in this namespace, so why not start back at the beginning for extent sizing? |
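A minimal sketch of the effect of resetting lastExtentSize to 0 before the copy. The sizing function, the initial size, the growth factor, and the cap are all hypothetical values chosen for illustration and are not MongoDB's real constants; only the idea that a last size of 0 restarts sizing at the initial extent size comes from the comment above.

```python
# Toy model: all constants and the growth rule are assumptions.
INITIAL_EXTENT = 4   # hypothetical initial extent size
GROWTH_FACTOR = 2    # hypothetical growth multiplier
MAX_EXTENT = 64      # hypothetical growth cap


def extents_needed(data_size, last_extent_size):
    """Return the sizes of the extents allocated while copying
    `data_size` units of records, starting from `last_extent_size`."""
    sizes = []
    remaining = data_size
    while remaining > 0:
        if last_extent_size == 0:
            # A last size of 0 restarts sizing at the initial extent size.
            size = INITIAL_EXTENT
        else:
            size = min(last_extent_size * GROWTH_FACTOR, MAX_EXTENT)
        sizes.append(size)
        remaining -= size
        last_extent_size = size
    return sizes


# Without the reset, the first new extent is larger than the old one.
print(extents_needed(28, 16))
# With the reset, sizing starts over from the initial extent size.
print(extents_needed(28, 0))
```

In this toy model the reset trades one oversized extent for several smaller ones that add up to the data actually being copied.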
| Comment by Eric Milkie [ 30/Nov/11 ] |
|
Having second thoughts. This won't work so well if we have an extent with only DeletedRecords inside. I'll need to allocate only before the actual copying of a first record. |
| Comment by Eric Milkie [ 30/Nov/11 ] |
|
My idea seems to work; preallocating an extent with the same size allows you to run compact as much as you want and your datafiles won't grow. After the first compact, running it again will reuse extents in the $freelist. However, I had to remove the safety check for disk space, because that was allocating an extent! This check isn't correct anyway, since it is possible to do a compact without allocating any new extents, as long as you have the right sizes in the $freelist. This scenario is actually quite likely if your data hasn't grown very much since your last compact. |
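The claim that datafiles stop growing can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: a single extent per collection, a free list that is a plain Python list of sizes, and a hypothetical `run_compacts` helper that tracks total allocated space. None of these names come from the server code.

```python
def run_compacts(extent_size, rounds):
    """Simulate repeated compacts that preallocate a destination extent
    of the same size as the source. Returns the total allocated space
    after each round (a stand-in for datafile growth)."""
    free_list = []                 # sizes of extents available for reuse
    allocated_total = extent_size  # the collection starts with one extent
    history = []
    for _ in range(rounds):
        if extent_size in free_list:
            # A same-size extent is on the free list: reuse it.
            free_list.remove(extent_size)
        else:
            # Nothing suitable to reuse: allocate fresh space.
            allocated_total += extent_size
        # After the copy, the source extent goes back to the free list.
        free_list.append(extent_size)
        history.append(allocated_total)
    return history


print(run_compacts(16, 4))
```

Only the first compact allocates new space in this model; every later round finds a same-size extent on the free list, so the total stays flat.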
| Comment by Eric Milkie [ 30/Nov/11 ] |
|
I believe this is happening because of the way the compact code triggers a new extent. It seems to just clear out the free record bucket lists in each extent to be compacted. When the code then begins to copy records, the first alloc() is forced to allocate a new extent. This would be fine, except that the extent-growth-sizer algorithm kicks in. This is apparently why it typically doesn't reuse the extents in the $freelist: we're always making the new extents bigger than the ones we freed (until we hit the growth cap, of course). I think the solution is to manually allocate a new extent that is the same size as the one being compacted. This way, the first alloc() will not trigger the creation of a new growing-size extent. I'm going to experiment with this now. |
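The interaction described above can be sketched with a toy allocator. Everything here is an assumption made for illustration: the constants, the doubling growth rule, and the "first free extent that is large enough" reuse policy are hypothetical, not the server's actual algorithm.

```python
# Toy model: constants and policies below are assumptions.
INITIAL_EXTENT = 4   # hypothetical initial extent size
GROWTH_FACTOR = 2    # hypothetical growth multiplier
MAX_EXTENT = 64      # hypothetical growth cap


def next_extent_size(last_size):
    """Size requested for a new extent, grown from the last one."""
    if last_size == 0:
        return INITIAL_EXTENT
    return min(last_size * GROWTH_FACTOR, MAX_EXTENT)


def allocate(free_list, wanted):
    """Reuse a free extent of at least `wanted`, else create a new one.
    Returns (size, reused)."""
    for i, size in enumerate(free_list):
        if size >= wanted:
            return free_list.pop(i), True
    return wanted, False


def compact_once(extent, free_list):
    """Copy into a grown extent, then free the old one."""
    wanted = next_extent_size(extent)
    new_extent, reused = allocate(free_list, wanted)
    free_list.append(extent)
    return new_extent, reused


def simulate(start_extent, rounds):
    """Return, per compact, whether a free-list extent was reused."""
    extent, free_list, reused_flags = start_extent, [], []
    for _ in range(rounds):
        extent, reused = compact_once(extent, free_list)
        reused_flags.append(reused)
    return reused_flags


print(simulate(16, 4))
```

In this model each compact asks for an extent larger than anything it has freed so far, so the free list is only reused once the requested size stops growing at the cap.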