[SERVER-39656] storageSize on secondary is much bigger than on primary Created: 19/Feb/19  Updated: 16/May/19  Resolved: 16/May/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.0.6
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Patrik Susko Assignee: Danny Hatcher (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File 3.6.10-primary.json     File 3.6.10-secondary.json     File 4.0.6-2ndRun-primary.json     File 4.0.6-2ndRun-secondary.json     File 4.0.6-primary.json     File 4.0.6-secondary.json     File 4.1.8-primary.json     File 4.1.8-secondary.json     Zip Archive configs.zip     File testing.tar.gz    
Operating System: ALL
Steps To Reproduce:
  1. mongorestore one bigger collection into a replica set
  2. compare storageSize on primary with storageSize on secondary
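The repro steps above can be sketched as shell commands (a sketch only: the replica-set name and port 27021 come from the attached config, while the secondary port 27022 and the `test.testdata` namespace are placeholder assumptions):

```shell
# 1. Restore a large collection into the replica set through the primary
mongorestore --host "sizeTesting/127.0.0.1:27021" --dir ./dump

# 2. Compare storageSize on the primary vs. the secondary
mongo --port 27021 --quiet --eval 'db.getSiblingDB("test").testdata.stats().storageSize'
mongo --port 27022 --quiet --eval 'db.getSiblingDB("test").testdata.stats().storageSize'
```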
Sprint: Storage NYC 2019-05-06
Participants:

 Description   

Hi,

I restored one collection with mongorestore into a primary-secondary-arbiter (PSA) replica set. The disk space used (storageSize) on the secondary is unusually large.

  • I would expect similar storageSizes between nodes.
  • The 4.0.6 storageSize is extreme.

 

Data:

version   storageSize PRIMARY   storageSize SECONDARY
3.6.10    3784097792            5039255552
4.0.6     3758616576            9119735808
4.1.8     3813023744            5171081216

Index (_id only):

version   totalIndexSize PRIMARY   totalIndexSize SECONDARY
3.6.10    505344000                501075968
4.0.6     501493760                2110971904
4.1.8     561000448                555438080
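To put the tables in perspective, the secondary-to-primary ratios can be computed directly from the reported figures (a quick sketch; the numbers are copied from the tables above):

```python
# Secondary/primary size ratios computed from the figures in the tables above.
sizes = {
    # version: (storage primary, storage secondary, index primary, index secondary)
    "3.6.10": (3784097792, 5039255552, 505344000, 501075968),
    "4.0.6":  (3758616576, 9119735808, 501493760, 2110971904),
    "4.1.8":  (3813023744, 5171081216, 561000448, 555438080),
}

for version, (sp, ss, ip, i2) in sizes.items():
    print(f"{version}: storage x{ss / sp:.2f}, index x{i2 / ip:.2f}")
# 4.0.6 stands out: the secondary holds ~2.4x the primary's storage and
# ~4.2x its index size; the other versions are roughly 1.3x / 1.0x.
```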

 

I also tested with block_compressor=none on 4.0.6 and got similar results (much bigger files on secondary).

Initial sync is apparently not affected: adding a new secondary node after the collection is already imported creates similarly sized files on the newly added node.

Running .reIndex() on the secondary brings totalIndexSize down to approximately the same size as the index on the primary; however, the data storageSize remains, as expected, unaffected.
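For reference, that recovery step looks like this in the mongo shell, run while connected directly to the secondary (a sketch: the `39656.testdata` namespace is taken from the attached stats, and reIndex() has been removed in newer server versions):

```javascript
// Connect directly to the secondary (not via the replica-set URI) and run:
var coll = db.getSiblingDB("39656").testdata;
coll.reIndex();               // rebuilds all indexes, including _id
coll.stats().totalIndexSize;  // should now be close to the primary's
```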



 Comments   
Comment by Danny Hatcher (Inactive) [ 16/May/19 ]

I've done some further testing with 1GB of WT cache allocated to the process (and read concern majority disabled). Setting replBatchLimitOperations to 100 reduced the Secondary to about 70% of the size of a Secondary without that setting. Increasing eviction_dirty_target to 10% had a negligible impact on storage size, but that is likely because the small cache size renders the setting effectively meaningless under the test load.

As Geert pointed out in his last comment, this effect is expected due to the page-splitting that occurs. As this issue does not manifest through initial syncs, the recommendation for people who run into this problem after a large data-load is to perform an initial sync of any Secondaries that need to recover storage space. It is possible that future work on the server will improve this situation but we do not plan on addressing this ticket specifically.

Comment by Geert Bosch [ 26/Apr/19 ]

My theory is that the combination of a relatively small memory size and dirty ratio, together with a largish buffer size for oplog application and the requirement to keep documents in cache until majority-confirmed, leads to memory pressure on the secondary and to aggressive eviction. Setting replWriterThreadCount to 1 removes the out-of-order application that normally occurs on secondaries, allowing more write-combining and less page splitting. I think reducing replBatchLimitOperations to maybe 100 or so might help and result in better throughput on the secondary.
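For reference, the two tunables discussed here are mongod server parameters and can be set at startup in the config file, e.g. (a sketch using the experimental values from this thread, not a recommendation; verify both parameters exist on your server version, and note replWriterThreadCount is startup-only):

```yaml
setParameter:
    replWriterThreadCount: 1
    replBatchLimitOperations: 100
```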

Comment by deyukong [ 25/Apr/19 ]

Tuning the secondary's replWriterThreadCount to 1 makes the primary and secondary sizes match, but I don't know why.

I modified WiredTiger's code, changing it from eager eviction to lazy eviction; that is, a page is not evicted until its dirty ratio reaches some threshold. With that change, the primary and secondary sizes match.

So I believe the key point is how long dirty data resides in the cache: the longer a dirty page resides there, the less disk space is used.

 

But many situations can cause this. For example: if the primary and secondary run on machines with the same cores/IOPS, the primary is certainly busier than the secondary, so the secondary evicts more frequently and ends up with more fragmented pages and larger disk usage.

Comment by Geert Bosch [ 23/Apr/19 ]

In some tests we did to investigate this ticket, we found that the secondary needed to evict 10x more modified pages than the primary, and there were many more page splits as a result. Enabling read-concern majority generally results in documents staying in cache longer, which allows for more consolidation. So the observed behavior is not entirely unexpected.

Comment by Danny Hatcher (Inactive) [ 22/Feb/19 ]

Size is about 25% larger on Secondaries for both 3.6.10 and 4.1.8 but 4.0.6 is significantly worse.

Comment by Danny Hatcher (Inactive) [ 22/Feb/19 ]

With Read Concern Majority disabled on 4.0.6:

Primary

{
	"ns" : "39656.testdata",
	"size" : 11210869987,
	"count" : 46253064,
	"avgObjSize" : 242,
	"storageSize" : 3740823552,
	"capped" : false,
	"wiredTiger" : {
		"metadata" : {
			"formatVersion" : 1
		},
		"creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),assert=(commit_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=false),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_image_max=0,memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
		"type" : "file",
		"uri" : "statistics:table:collection-11--1554700681604833037",
		"LSM" : {
			"bloom filter false positives" : 0,
			"bloom filter hits" : 0,
			"bloom filter misses" : 0,
			"bloom filter pages evicted from cache" : 0,
			"bloom filter pages read into cache" : 0,
			"bloom filters in the LSM tree" : 0,
			"chunks in the LSM tree" : 0,
			"highest merge generation in the LSM tree" : 0,
			"queries that could have benefited from a Bloom filter that did not exist" : 0,
			"sleep for LSM checkpoint throttle" : 0,
			"sleep for LSM merge throttle" : 0,
			"total size of bloom filters" : 0
		},
		"block-manager" : {
			"allocations requiring file extension" : 423197,
			"blocks allocated" : 423729,
			"blocks freed" : 667,
			"checkpoint size" : 3740721152,
			"file allocation unit size" : 4096,
			"file bytes available for reuse" : 86016,
			"file magic number" : 120897,
			"file major version number" : 1,
			"file size in bytes" : 3740823552,
			"minor version number" : 0
		},
		"btree" : {
			"btree checkpoint generation" : 22,
			"column-store fixed-size leaf pages" : 0,
			"column-store internal pages" : 0,
			"column-store variable-size RLE encoded values" : 0,
			"column-store variable-size deleted values" : 0,
			"column-store variable-size leaf pages" : 0,
			"fixed-record size" : 0,
			"maximum internal page key size" : 368,
			"maximum internal page size" : 4096,
			"maximum leaf page key size" : 2867,
			"maximum leaf page size" : 32768,
			"maximum leaf page value size" : 67108864,
			"maximum tree depth" : 4,
			"number of key/value pairs" : 0,
			"overflow pages" : 0,
			"pages rewritten by compaction" : 0,
			"row-store internal pages" : 0,
			"row-store leaf pages" : 0
		},
		"cache" : {
			"bytes currently in the cache" : 306675981,
			"bytes dirty in the cache cumulative" : 48251342,
			"bytes read into cache" : 0,
			"bytes written from cache" : 11640745367,
			"checkpoint blocked page eviction" : 71,
			"data source pages selected for eviction unable to be evicted" : 3042,
			"eviction walk passes of a file" : 11141,
			"eviction walk target pages histogram - 0-9" : 2735,
			"eviction walk target pages histogram - 10-31" : 776,
			"eviction walk target pages histogram - 128 and higher" : 0,
			"eviction walk target pages histogram - 32-63" : 843,
			"eviction walk target pages histogram - 64-128" : 6787,
			"eviction walks abandoned" : 494,
			"eviction walks gave up because they restarted their walk twice" : 2,
			"eviction walks gave up because they saw too many pages and found no candidates" : 1732,
			"eviction walks gave up because they saw too many pages and found too few candidates" : 310,
			"eviction walks reached end of tree" : 2184,
			"eviction walks started from root of tree" : 2602,
			"eviction walks started from saved location in tree" : 8539,
			"hazard pointer blocked page eviction" : 162,
			"in-memory page passed criteria to be split" : 9241,
			"in-memory page splits" : 1556,
			"internal pages evicted" : 2605,
			"internal pages split during eviction" : 41,
			"leaf pages split during eviction" : 1961,
			"modified pages evicted" : 20387,
			"overflow pages read into cache" : 0,
			"page split during eviction deepened the tree" : 1,
			"page written requiring cache overflow records" : 0,
			"pages read into cache" : 0,
			"pages read into cache after truncate" : 1,
			"pages read into cache after truncate in prepare state" : 0,
			"pages read into cache requiring cache overflow entries" : 0,
			"pages requested from the cache" : 90939987,
			"pages seen by eviction walk" : 2127016,
			"pages written from cache" : 423617,
			"pages written requiring in-memory restoration" : 1360,
			"tracked dirty bytes in the cache" : 0,
			"unmodified pages evicted" : 406283
		},
		"cache_walk" : {
			"Average difference between current eviction generation when the page was last considered" : 0,
			"Average on-disk page image size seen" : 0,
			"Average time in cache for pages that have been visited by the eviction server" : 0,
			"Average time in cache for pages that have not been visited by the eviction server" : 0,
			"Clean pages currently in cache" : 0,
			"Current eviction generation" : 0,
			"Dirty pages currently in cache" : 0,
			"Entries in the root page" : 0,
			"Internal pages currently in cache" : 0,
			"Leaf pages currently in cache" : 0,
			"Maximum difference between current eviction generation when the page was last considered" : 0,
			"Maximum page size seen" : 0,
			"Minimum on-disk page image size seen" : 0,
			"Number of pages never visited by eviction server" : 0,
			"On-disk page image sizes smaller than a single allocation unit" : 0,
			"Pages created in memory and never written" : 0,
			"Pages currently queued for eviction" : 0,
			"Pages that could not be queued for eviction" : 0,
			"Refs skipped during cache traversal" : 0,
			"Size of the root page" : 0,
			"Total number of pages currently in cache" : 0
		},
		"compression" : {
			"compressed pages read" : 0,
			"compressed pages written" : 415206,
			"page written failed to compress" : 0,
			"page written was too small to compress" : 8457
		},
		"cursor" : {
			"bulk-loaded cursor-insert calls" : 0,
			"close calls that result in cache" : 0,
			"create calls" : 8,
			"cursor operation restarted" : 373259,
			"cursor-insert key and value bytes inserted" : 11347540863,
			"cursor-remove key bytes removed" : 0,
			"cursor-update value bytes updated" : 0,
			"cursors reused from cache" : 46226,
			"insert calls" : 45925838,
			"modify calls" : 0,
			"next calls" : 0,
			"open cursor count" : 0,
			"prev calls" : 1,
			"remove calls" : 0,
			"reserve calls" : 0,
			"reset calls" : 785977,
			"search calls" : 0,
			"search near calls" : 0,
			"truncate calls" : 0,
			"update calls" : 0
		},
		"reconciliation" : {
			"dictionary matches" : 0,
			"fast-path pages deleted" : 0,
			"internal page key bytes discarded using suffix compression" : 811628,
			"internal page multi-block writes" : 60,
			"internal-page overflow keys" : 0,
			"leaf page key bytes discarded using prefix compression" : 0,
			"leaf page multi-block writes" : 1961,
			"leaf-page overflow keys" : 0,
			"maximum blocks required for a page" : 1,
			"overflow values written" : 0,
			"page checksum matches" : 1116,
			"page reconciliation calls" : 22591,
			"page reconciliation calls for eviction" : 18130,
			"pages deleted" : 644
		},
		"session" : {
			"object compaction" : 0
		},
		"transaction" : {
			"update conflicts" : 0
		}
	},
	"nindexes" : 1,
	"totalIndexSize" : 498761728,
	"indexSizes" : {
		"_id_" : 498761728
	},
	"ok" : 1,
	"operationTime" : Timestamp(1550847463, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1550847463, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

Secondary

{
	"ns" : "39656.testdata",
	"size" : 11210869987,
	"count" : 46253064,
	"avgObjSize" : 242,
	"storageSize" : 9542180864,
	"capped" : false,
	"wiredTiger" : {
		"metadata" : {
			"formatVersion" : 1
		},
		"creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),assert=(commit_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_image_max=0,memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
		"type" : "file",
		"uri" : "statistics:table:collection-21-6943134053790933930",
		"LSM" : {
			"bloom filter false positives" : 0,
			"bloom filter hits" : 0,
			"bloom filter misses" : 0,
			"bloom filter pages evicted from cache" : 0,
			"bloom filter pages read into cache" : 0,
			"bloom filters in the LSM tree" : 0,
			"chunks in the LSM tree" : 0,
			"highest merge generation in the LSM tree" : 0,
			"queries that could have benefited from a Bloom filter that did not exist" : 0,
			"sleep for LSM checkpoint throttle" : 0,
			"sleep for LSM merge throttle" : 0,
			"total size of bloom filters" : 0
		},
		"block-manager" : {
			"allocations requiring file extension" : 2126495,
			"blocks allocated" : 2131843,
			"blocks freed" : 2110,
			"checkpoint size" : 9542139904,
			"file allocation unit size" : 4096,
			"file bytes available for reuse" : 24576,
			"file magic number" : 120897,
			"file major version number" : 1,
			"file size in bytes" : 9542180864,
			"minor version number" : 0
		},
		"btree" : {
			"btree checkpoint generation" : 20,
			"column-store fixed-size leaf pages" : 0,
			"column-store internal pages" : 0,
			"column-store variable-size RLE encoded values" : 0,
			"column-store variable-size deleted values" : 0,
			"column-store variable-size leaf pages" : 0,
			"fixed-record size" : 0,
			"maximum internal page key size" : 368,
			"maximum internal page size" : 4096,
			"maximum leaf page key size" : 2867,
			"maximum leaf page size" : 32768,
			"maximum leaf page value size" : 67108864,
			"maximum tree depth" : 5,
			"number of key/value pairs" : 0,
			"overflow pages" : 0,
			"pages rewritten by compaction" : 0,
			"row-store internal pages" : 0,
			"row-store leaf pages" : 0
		},
		"cache" : {
			"bytes currently in the cache" : 518009274,
			"bytes dirty in the cache cumulative" : 2079197269,
			"bytes read into cache" : 0,
			"bytes written from cache" : 11778604164,
			"checkpoint blocked page eviction" : 2475,
			"data source pages selected for eviction unable to be evicted" : 11896,
			"eviction walk passes of a file" : 102304,
			"eviction walk target pages histogram - 0-9" : 4251,
			"eviction walk target pages histogram - 10-31" : 7069,
			"eviction walk target pages histogram - 128 and higher" : 0,
			"eviction walk target pages histogram - 32-63" : 11417,
			"eviction walk target pages histogram - 64-128" : 79567,
			"eviction walks abandoned" : 2,
			"eviction walks gave up because they restarted their walk twice" : 565,
			"eviction walks gave up because they saw too many pages and found no candidates" : 4129,
			"eviction walks gave up because they saw too many pages and found too few candidates" : 2311,
			"eviction walks reached end of tree" : 6791,
			"eviction walks started from root of tree" : 7018,
			"eviction walks started from saved location in tree" : 95286,
			"hazard pointer blocked page eviction" : 1180,
			"in-memory page passed criteria to be split" : 15710,
			"in-memory page splits" : 1747,
			"internal pages evicted" : 3572,
			"internal pages split during eviction" : 170,
			"leaf pages split during eviction" : 591916,
			"modified pages evicted" : 2084614,
			"overflow pages read into cache" : 0,
			"page split during eviction deepened the tree" : 2,
			"page written requiring cache overflow records" : 1199339,
			"pages read into cache" : 0,
			"pages read into cache after truncate" : 1,
			"pages read into cache after truncate in prepare state" : 0,
			"pages read into cache requiring cache overflow entries" : 0,
			"pages requested from the cache" : 104633691,
			"pages seen by eviction walk" : 50313831,
			"pages written from cache" : 2130686,
			"pages written requiring in-memory restoration" : 1825113,
			"tracked dirty bytes in the cache" : 0,
			"unmodified pages evicted" : 43774
		},
		"cache_walk" : {
			"Average difference between current eviction generation when the page was last considered" : 0,
			"Average on-disk page image size seen" : 0,
			"Average time in cache for pages that have been visited by the eviction server" : 0,
			"Average time in cache for pages that have not been visited by the eviction server" : 0,
			"Clean pages currently in cache" : 0,
			"Current eviction generation" : 0,
			"Dirty pages currently in cache" : 0,
			"Entries in the root page" : 0,
			"Internal pages currently in cache" : 0,
			"Leaf pages currently in cache" : 0,
			"Maximum difference between current eviction generation when the page was last considered" : 0,
			"Maximum page size seen" : 0,
			"Minimum on-disk page image size seen" : 0,
			"Number of pages never visited by eviction server" : 0,
			"On-disk page image sizes smaller than a single allocation unit" : 0,
			"Pages created in memory and never written" : 0,
			"Pages currently queued for eviction" : 0,
			"Pages that could not be queued for eviction" : 0,
			"Refs skipped during cache traversal" : 0,
			"Size of the root page" : 0,
			"Total number of pages currently in cache" : 0
		},
		"compression" : {
			"compressed pages read" : 0,
			"compressed pages written" : 1383374,
			"page written failed to compress" : 0,
			"page written was too small to compress" : 748194
		},
		"cursor" : {
			"bulk-loaded cursor-insert calls" : 0,
			"close calls that result in cache" : 0,
			"create calls" : 20,
			"cursor operation restarted" : 810782,
			"cursor-insert key and value bytes inserted" : 10795956416,
			"cursor-remove key bytes removed" : 0,
			"cursor-update value bytes updated" : 0,
			"cursors reused from cache" : 593176,
			"insert calls" : 43597430,
			"modify calls" : 0,
			"next calls" : 0,
			"open cursor count" : 0,
			"prev calls" : 1,
			"remove calls" : 0,
			"reserve calls" : 0,
			"reset calls" : 2286879,
			"search calls" : 0,
			"search near calls" : 0,
			"truncate calls" : 0,
			"update calls" : 0
		},
		"reconciliation" : {
			"dictionary matches" : 0,
			"fast-path pages deleted" : 0,
			"internal page key bytes discarded using suffix compression" : 1281607,
			"internal page multi-block writes" : 210,
			"internal-page overflow keys" : 0,
			"leaf page key bytes discarded using prefix compression" : 0,
			"leaf page multi-block writes" : 591849,
			"leaf-page overflow keys" : 0,
			"maximum blocks required for a page" : 1,
			"overflow values written" : 0,
			"page checksum matches" : 582,
			"page reconciliation calls" : 3322102,
			"page reconciliation calls for eviction" : 3245839,
			"pages deleted" : 1099
		},
		"session" : {
			"object compaction" : 0
		},
		"transaction" : {
			"update conflicts" : 0
		}
	},
	"nindexes" : 1,
	"totalIndexSize" : 3670056960,
	"indexSizes" : {
		"_id_" : 3670056960
	},
	"ok" : 1,
	"operationTime" : Timestamp(1550847533, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1550847533, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

Comment by Danny Hatcher (Inactive) [ 22/Feb/19 ]

Hello Patrik,

That was a good find, and my own testing agrees: it appears that disabling Read Concern Majority on 4.0.6 is the trigger for the larger storage on the Secondary. I'm going to forward this ticket to the appropriate engineers to determine whether this is working as designed or is a bug.

Thanks,

Danny

Comment by Patrik Susko [ 21/Feb/19 ]

Hello Danny,

I found something; thanks for the tip to check the config.
Removing enableMajorityReadConcern: false from the config fixed the extreme index and storage sizes in 4.0.6.

One question: now that we get the same storageSize across versions, is the difference between primary and secondary (3.8 GB vs 5 GB) expected? I would like to get equal storage sizes on each node. The difference adds up, especially with multiple collections.

 

I compared the wiredTiger.creationString values; they are identical for each version.
Only 4.1.8 appears to have log=(enabled=false) as the default; the other versions have log=(enabled=true).

The configs are identical except for path and port (configs.zip).
Removing cacheSizeGB had no effect.
Removing enableMajorityReadConcern: false --> affects storageSize on 4.0.6

storage:
    dbPath: "D:\\mongodb\\mongodb-win32-x86_64-2008plus-ssl-4.0.6\\1\\db"
    wiredTiger:
        engineConfig:
            cacheSizeGB: 5
systemLog:
    destination: file
    path: "D:\\mongodb\\mongodb-win32-x86_64-2008plus-ssl-4.0.6\\1\\log\\mongod.log"
    logAppend: true
    logRotate: reopen
    timeStampFormat: iso8601-utc
replication:
    replSetName: "sizeTesting"
    enableMajorityReadConcern: false
net:
    bindIp: 127.0.0.1
    port: 27021
security:
    keyFile: "D:\\mongodb\\mongodb-win32-x86_64-2008plus-ssl-4.0.6\\key.dat"
    authorization: "enabled"

Thanks,
Patrik

Comment by Danny Hatcher (Inactive) [ 21/Feb/19 ]

Hello Patrik,

I notice that your Primary has a setting of log=(enabled=false) in the WiredTiger engine configuration while your Secondary has log=(enabled=true). I believe the default should be false. Have you intentionally configured that value to true? It is possible that the extra disk flushes being performed on the secondary are causing more space to be used. Please let me know the full configuration you are using for both nodes.

Thanks,

Danny

Comment by Patrik Susko [ 20/Feb/19 ]

Hello Danny,

The dataset I used for testing is uploaded (private information was already removed prior to testing).

The replicasets were specifically created for this test, and were only running a few minutes.
However, I've rerun the 4.0.6 import after restarting all nodes:

version   storageSize PRIMARY   storageSize SECONDARY
4.0.6     3821580288            9576587264

version   totalIndexSize PRIMARY   totalIndexSize SECONDARY
4.0.6     502243328                2467266560

4.0.6-2ndRun-primary.json
4.0.6-2ndRun-secondary.json

It would be interesting to see if you get ~3.8 GB or ~5 GB on both nodes with my dataset.

Thanks,

Patrik

Comment by Danny Hatcher (Inactive) [ 20/Feb/19 ]

Hello Patrik,

I've tried reproducing this issue by restoring a database with ~10GB of data into a 4.0.6 cluster but the storageSize is relatively equal across the Primary and Secondary. For the 4.0.6 test that you ran specifically, were both nodes restarted right before running the restore? Many of the statistics you collected are cumulative so will not be truly indicative of the issue if they had been running for a while before the test. If the servers had been running before the test, would it be possible to restart both then try the restore again?

Are you able to upload 5GB of the dataset you are using to our Secure Upload Portal? Only MongoDB employees will be able to access the files and they will automatically delete after a period of time. If there is any private information in the dataset, are you able to reproduce using a different dataset?

Thank you,

Danny

Generated at Thu Feb 08 04:52:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.