[SERVER-35987] Compact cannot work well when there is 60% free space in the collection Created: 06/Jul/18  Updated: 27/Oct/23  Resolved: 24/Nov/19

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Zhang Youdong Assignee: Dmitry Agranat
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Participants:

 Description   

I ran the `compact` command on a collection with the stats shown below.
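
For reference, the invocation from the mongo shell is shown here in its basic form (the options the server actually recorded appear in the COMMAND log line further down):

SECONDARY> db.runCommand({ compact: "somecoll" })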

SECONDARY> db.somecoll.stats()
{
  "ns" : "dbdoc.somecoll",
  "count" : 221435,
  "size" : 29477552228,
  "avgObjSize" : 133120,
  "storageSize" : 107902332928,
  "capped" : false,
  "wiredTiger" : {
    "metadata" : {
      "formatVersion" : 1
    },
    "creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
    "type" : "file",
    "uri" : "statistics:table:dbdoc/collection-78--3418042999270868539",
    "LSM" : {
      "bloom filter false positives" : 0,
      "bloom filter hits" : 0,
      "bloom filter misses" : 0,
      "bloom filter pages evicted from cache" : 0,
      "bloom filter pages read into cache" : 0,
      "bloom filters in the LSM tree" : 0,
      "chunks in the LSM tree" : 0,
      "highest merge generation in the LSM tree" : 0,
      "queries that could have benefited from a Bloom filter that did not exist" : 0,
      "sleep for LSM checkpoint throttle" : 0,
      "sleep for LSM merge throttle" : 0,
      "total size of bloom filters" : 0
    },
    "block-manager" : {
      "allocations requiring file extension" : 0,
      "blocks allocated" : 0,
      "blocks freed" : 0,
      "checkpoint size" : 29916192768,
      "file allocation unit size" : 4096,
      "file bytes available for reuse" : 8498855936,
      "file magic number" : 120897,
      "file major version number" : 1,
      "file size in bytes" : 107902332928,
      "minor version number" : 0
    },
    "btree" : {
      "btree checkpoint generation" : 0,
      "column-store fixed-size leaf pages" : 0,
      "column-store internal pages" : 0,
      "column-store variable-size RLE encoded values" : 0,
      "column-store variable-size deleted values" : 0,
      "column-store variable-size leaf pages" : 0,
      "fixed-record size" : 0,
      "maximum internal page key size" : 368,
      "maximum internal page size" : 4096,
      "maximum leaf page key size" : 2867,
      "maximum leaf page size" : 32768,
      "maximum leaf page value size" : 67108864,
      "maximum tree depth" : 0,
      "number of key/value pairs" : 0,
      "overflow pages" : 0,
      "pages rewritten by compaction" : 0,
      "row-store internal pages" : 0,
      "row-store leaf pages" : 0
    },
    "cache" : {
      "bytes currently in the cache" : 213732,
      "bytes read into cache" : 176748,
      "bytes written from cache" : 0,
      "checkpoint blocked page eviction" : 0,
      "data source pages selected for eviction unable to be evicted" : 0,
      "hazard pointer blocked page eviction" : 0,
      "in-memory page passed criteria to be split" : 0,
      "in-memory page splits" : 0,
      "internal pages evicted" : 0,
      "internal pages split during eviction" : 0,
      "leaf pages split during eviction" : 0,
      "modified pages evicted" : 0,
      "overflow pages read into cache" : 0,
      "overflow values cached in memory" : 0,
      "page split during eviction deepened the tree" : 0,
      "page written requiring lookaside records" : 0,
      "pages read into cache" : 4,
      "pages read into cache requiring lookaside entries" : 0,
      "pages requested from the cache" : 3,
      "pages written from cache" : 0,
      "pages written requiring in-memory restoration" : 0,
      "tracked dirty bytes in the cache" : 0,
      "unmodified pages evicted" : 0
    },
    "cache_walk" : {
      "Average difference between current eviction generation when the page was last considered" : 0,
      "Average on-disk page image size seen" : 0,
      "Clean pages currently in cache" : 0,
      "Current eviction generation" : 0,
      "Dirty pages currently in cache" : 0,
      "Entries in the root page" : 0,
      "Internal pages currently in cache" : 0,
      "Leaf pages currently in cache" : 0,
      "Maximum difference between current eviction generation when the page was last considered" : 0,
      "Maximum page size seen" : 0,
      "Minimum on-disk page image size seen" : 0,
      "On-disk page image sizes smaller than a single allocation unit" : 0,
      "Pages created in memory and never written" : 0,
      "Pages currently queued for eviction" : 0,
      "Pages that could not be queued for eviction" : 0,
      "Refs skipped during cache traversal" : 0,
      "Size of the root page" : 0,
      "Total number of pages currently in cache" : 0
    },
    "compression" : {
      "compressed pages read" : 0,
      "compressed pages written" : 0,
      "page written failed to compress" : 0,
      "page written was too small to compress" : 0,
      "raw compression call failed, additional data available" : 0,
      "raw compression call failed, no additional data available" : 0,
      "raw compression call succeeded" : 0
    },
    "cursor" : {
      "bulk-loaded cursor-insert calls" : 0,
      "create calls" : 1,
      "cursor-insert key and value bytes inserted" : 0,
      "cursor-remove key bytes removed" : 0,
      "cursor-update value bytes updated" : 0,
      "insert calls" : 0,
      "next calls" : 0,
      "prev calls" : 1,
      "remove calls" : 0,
      "reset calls" : 1,
      "restarted searches" : 0,
      "search calls" : 0,
      "search near calls" : 0,
      "truncate calls" : 0,
      "update calls" : 0
    },
    "reconciliation" : {
      "dictionary matches" : 0,
      "fast-path pages deleted" : 0,
      "internal page key bytes discarded using suffix compression" : 0,
      "internal page multi-block writes" : 0,
      "internal-page overflow keys" : 0,
      "leaf page key bytes discarded using prefix compression" : 0,
      "leaf page multi-block writes" : 0,
      "leaf-page overflow keys" : 0,
      "maximum blocks required for a page" : 0,
      "overflow values written" : 0,
      "page checksum matches" : 0,
      "page reconciliation calls" : 0,
      "page reconciliation calls for eviction" : 0,
      "pages deleted" : 0
    },
    "session" : {
      "object compaction" : 0,
      "open cursor count" : 1
    },
    "transaction" : {
      "update conflicts" : 0
    }
  }
}
 

It's very strange that compaction cannot proceed because there is less than 10% (10.7GB) free space in the first 90% of the physical file. Only one internal checkpoint exists in WT, so why can't compaction work when the collection has at least 60% free space? ("size" : 29477552228, "storageSize" : 107902332928)
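
Working through the numbers (a quick mongo shell sanity check; the field names are taken verbatim from the block-manager section of the stats above):

SECONDARY> var bm = db.somecoll.stats().wiredTiger["block-manager"]
SECONDARY> bm["file bytes available for reuse"] / bm["file size in bytes"]
// 8498855936 / 107902332928 ≈ 0.079, i.e. ~7.9% reusable in WiredTiger's view,
// matching the "7% space available" log line below
SECONDARY> 0.10 * bm["file size in bytes"]
// 10790233292.8, the ~10290MB the log says is required in the first 90% of the file
SECONDARY> 7666900992 / bm["file size in bytes"]
// ≈ 0.071, i.e. only ~7.1% is actually free in the first 90%, so the pass is skipped

So although size vs. storageSize suggests most of the file is logically free, WiredTiger's block manager only sees about 8.5GB of blocks it can actually reuse, and slightly less than that in the first 90% of the file.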

Another node in this replica set reached the 10% threshold and compacted successfully; the physical space after compaction is 37GB (far less than 107GB).

 

2018-07-06T13:01:53.316+0800 I REPL [ReplicationExecutor] going into maintenance mode with 0 other maintenance mode tasks in progress
2018-07-06T13:01:53.316+0800 I REPL [ReplicationExecutor] transition to RECOVERING
2018-07-06T13:01:53.316+0800 I COMMAND [conn4] compact dbdoc.somecoll begin, options: paddingMode: NONE validateDocuments: 1
2018-07-06T13:01:53.367+0800 I ACCESS [conn12] Successfully authenticated as principal __cloud_internal on admin
2018-07-06T13:01:53.375+0800 I STORAGE [conn4] WiredTiger [1530853313:375677][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: ============ testing for compaction
2018-07-06T13:01:53.375+0800 I STORAGE [conn4] WiredTiger [1530853313:375742][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: file size 102903MB (107902332928) with 7% space available 8105MB (8498855936)
2018-07-06T13:01:53.467+0800 I NETWORK [conn12] end connection 127.0.0.1:44591 (6 connections now open)
2018-07-06T13:01:53.467+0800 I NETWORK [conn11] end connection 127.0.0.1:44582 (6 connections now open)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764520][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 0%: 3MB, (3489792B, 0%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764587][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 10%: 2204MB, (2311172096B, 27%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764603][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 20%: 1580MB, (1657495552B, 19%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764615][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 30%: 1539MB, (1614770176B, 18%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764627][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 40%: 692MB, (726429696B, 8%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764665][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 50%: 824MB, (864960512B, 10%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764678][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 60%: 444MB, (465973248B, 5%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764690][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 70%: 12MB, (12582912B, 0%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764702][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 80%: 9MB, (10027008B, 0%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764714][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: 90%: 793MB, (831954944B, 9%)
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764775][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: dbdoc/collection-78--3418042999270868539.wt: 7302MB (7656873984) available space in the first 80% of the file
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764796][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: dbdoc/collection-78--3418042999270868539.wt: 7311MB (7666900992) available space in the first 90% of the file
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764808][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: dbdoc/collection-78--3418042999270868539.wt: require 10% or 10290MB (10790233292) in the first 90% of the file to perform compaction , compaction skipped
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764825][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: ============ ending compaction pass
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764837][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: pages reviewed: 0
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764850][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: pages skipped: 0
2018-07-06T13:01:53.764+0800 I STORAGE [conn4] WiredTiger [1530853313:764861][21031:0x7effc9e7c700], file:dbdoc/collection-78--3418042999270868539.wt, WT_SESSION.compact: pages written: 0



 Comments   
Comment by Zhang Youdong [ 31/Dec/19 ]

I have figured it out: there was a backup cursor open in WiredTiger, so the space could not be reclaimed.
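(For anyone else hitting this: on MongoDB 4.2+ the $currentOp aggregation stage can list idle cursors, and an open backup cursor shows up with $backupCursor in its originating command. A minimal sketch, assuming a 4.2+ shell; the exact output shape may vary by version:)

SECONDARY> db.getSiblingDB("admin").aggregate([
...   { $currentOp: { allUsers: true, idleCursors: true } },  // include idle cursors (4.2+)
...   { $match: { type: "idleCursor" } }                      // then look for $backupCursor in originatingCommand
... ])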

Comment by Dmitry Agranat [ 24/Nov/19 ]

Hi,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Dima

Comment by Dmitry Agranat [ 06/Nov/19 ]

Hi zyd_com,

The behavior you've mentioned is by design, as described here, and we have found that more aggressive compaction is less useful.

Another node in this replica set reached the 10% threshold and compacted successfully; the physical space after compaction is 37GB (far less than 107GB).

This "another node" might have had different storage statistics at the time the compact command was executed.

If this is still an issue for you, please provide an archived copy of the $dbpath/diagnostic.data directory (the contents are described here) from both nodes in question and upload it to this support uploader location. In addition, please provide the output of db.collection.stats() from both collections prior to executing the compact command.

Thanks,
Dima
