[SERVER-14260] Reusing freed extents takes a long time Created: 16/Jun/14  Updated: 11/Jun/19  Resolved: 22/May/15

Status: Closed
Project: Core Server
Component/s: Performance, Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: PHO
Assignee: Daniel Pasette (Inactive)
Resolution: Won't Fix
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

When we repeatedly create and drop large collections, the extent freelist gets very long, and the server takes a long time to find a suitable freed extent to reuse while the database is write-locked. The scan can take from 10 seconds to several minutes, depending on the length of the freelist. If I understand it correctly, this is related to SERVER-3022.

warning: newExtent 16365 scanned
warning: slow scan in allocFromFreeList (in write lock)

There are three possible approaches to mitigating this problem:

  1. Let users specify the initial extent size for both collection data and indexes, which could reduce the number of extents that need to be allocated. This used to be partially achievable by passing the size parameter to db.createCollection, but it has not been possible since 2.6.0 (see SERVER-13144). If a collection is expected to grow large (say 100 MiB), it makes no sense to start from the minimum extent size of 4096 bytes.
  2. Merge freed extents that are adjacent to each other. This could shorten the freelist.
  3. Keep a cache of the extent freelist on the heap for faster searching. Random access on mmapped pages is slow because of repeated page faults and disk seeks/reads.
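
Approach 2 could be sketched roughly as follows. This is only an illustration of the coalescing idea, not MongoDB code: the `FreeExtent` record, its field names, and the `coalesce` helper are all hypothetical, and the sketch ignores any upper limit on extent size that a real implementation would have to respect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeExtent:
    file_no: int  # hypothetical: index of the data file holding the extent
    offset: int   # hypothetical: byte offset of the extent within that file
    length: int   # extent length in bytes


def coalesce(free_list):
    """Merge free extents that are physically adjacent within the same file."""
    merged = []
    # Sorting by (file, offset) puts physical neighbours next to each other.
    for ext in sorted(free_list, key=lambda e: (e.file_no, e.offset)):
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.file_no == ext.file_no
                and prev.offset + prev.length == ext.offset):
            # The previous extent ends exactly where this one starts: join them.
            merged[-1] = FreeExtent(prev.file_no, prev.offset,
                                    prev.length + ext.length)
        else:
            merged.append(ext)
    return merged


free_list = [
    FreeExtent(0, 8192, 4096),
    FreeExtent(0, 4096, 4096),
    FreeExtent(1, 0, 4096),
    FreeExtent(0, 16384, 4096),
]
compact = coalesce(free_list)
print(len(free_list), "->", len(compact), "free extents")
```

In this toy input only the two extents at offsets 4096 and 8192 of file 0 touch each other, so the list shrinks by one entry while the total free space stays the same.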


 Comments   
Comment by Greg Murphy [ 02/Apr/15 ]

We're also experiencing this issue on 2.6.8. The database which is causing the problem has thousands of small collections created each day and then deleted a few days later. As a result, it has a huge extent free list. dbstats output for that database is:

{
    "collections" : 34345,
    "objects" : 5053502,
    "avgObjSize" : 741.2781583939217,
    "dataSize" : NumberLong("3746050656"),
    "storageSize" : NumberLong("6263025664"),
    "numExtents" : 45764,
    "indexes" : 88726,
    "indexSize" : NumberLong(1558075792),
    "fileSize" : NumberLong("27837595648"),
    "nsSizeMB" : 128,
    "dataFileVersion" : { "major" : 4, "minor" : 5 },
    "extentFreeList" : { "num" : 74073, "totalSize" : 8233771008 },
    "ok" : 1
}
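
To put those numbers in proportion, here is a small back-of-the-envelope calculation over the values reported above. It assumes that numExtents counts the extents currently in use by collections and that extentFreeList.totalSize is in bytes; the variable names are just for illustration.

```python
# Values copied from the dbstats output above.
db_stats = {
    "numExtents": 45764,
    "fileSize": 27837595648,
    "extentFreeList": {"num": 74073, "totalSize": 8233771008},
}

free = db_stats["extentFreeList"]

# Average size of a freed extent, in bytes.
avg_free_extent_bytes = free["totalSize"] / free["num"]
# Share of the allocated data files that sits on the free list.
free_share_of_files = free["totalSize"] / db_stats["fileSize"]
# Freed extents per extent that is still in use (assumed meaning of numExtents).
free_to_live_ratio = free["num"] / db_stats["numExtents"]

print(f"average freed extent: {avg_free_extent_bytes / 1024:.1f} KiB")
print(f"free list share of file size: {free_share_of_files:.1%}")
print(f"freed extents per in-use extent: {free_to_live_ratio:.2f}")
```

Under those assumptions the free list holds roughly 1.6 times as many extents as are in use, with an average size of about 109 KiB, and accounts for close to 30% of the data files.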

Some example log entries from when the server hangs during extent free list searches are:

warning: slow scan in allocFromFreeList (in write lock)
warning: newExtent 12462 scanned

warning: slow scan in allocFromFreeList (in write lock)
warning: newExtent 21047 scanned

warning: slow scan in allocFromFreeList (in write lock)
warning: newExtent 74075 scanned

The server can be unresponsive for anything up to about 5 minutes during these scans.

Comment by PHO [ 18/Jun/14 ]

Thank you for letting me know about SERVER-14082; it is informative in itself. However, the problem I described above concerns extent freelists, not the record freelists that reside in each extent.

The last time I saw the problem, I was using MongoDB 2.4.1. It has occurred on various versions, though, and from what I've seen in the codebase I think it still persists in the git HEAD: MMAP1DatabaseCatalogEntry::createCollection allocates an extent 4096 bytes long, freed extents are never merged, and MmapV1ExtentManager::_allocFromFreeList walks through mmapped pages. If you still want me to try 2.6.2, I will happily do so.
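
For comparison, an on-heap index of the kind suggested in approach 3 might look like the sketch below. It is not a description of what MmapV1ExtentManager::_allocFromFreeList does: the `FreeExtentIndex` class, its methods, and the best-fit policy are assumptions made for illustration. The point is only that a size-ordered structure kept in memory can find a candidate with a binary search instead of walking a linked list through mmapped pages.

```python
import bisect


class FreeExtentIndex:
    """Hypothetical in-memory index of free extents, ordered by length."""

    def __init__(self):
        self._entries = []  # sorted list of (length, extent_id) tuples

    def add(self, extent_id, length):
        # Keep the list sorted so that lookups can use binary search.
        bisect.insort(self._entries, (length, extent_id))

    def take_best_fit(self, min_length):
        """Remove and return (extent_id, length) for the smallest extent
        whose length is at least min_length, or None if there is none."""
        # (min_length,) sorts before every (min_length, extent_id) tuple,
        # so this finds the first entry with length >= min_length.
        i = bisect.bisect_left(self._entries, (min_length,))
        if i == len(self._entries):
            return None
        length, extent_id = self._entries.pop(i)
        return extent_id, length


index = FreeExtentIndex()
index.add(1, 4096)
index.add(2, 131072)
index.add(3, 8192)

hit = index.take_best_fit(5000)        # smallest extent of at least 5000 bytes
miss = index.take_best_fit(1_000_000)  # nothing on the list is large enough
print(hit, miss)
```

A sorted Python list still pays a linear cost for insertion and removal, so a real implementation would more likely use a balanced tree or size buckets; the list keeps the sketch short.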

Comment by Ramon Fernandez Marina [ 17/Jun/14 ]

What version of MongoDB are you using? If you're using 2.6.0 or 2.6.1 you may be running into SERVER-14082, which was fixed in 2.6.2. Is it possible for you to try out 2.6.2 and let us know if the problem persists?
