[SERVER-59729] Investigate to find the best default values for FCBIS parameters Created: 01/Sep/21  Updated: 29/Oct/23  Resolved: 04/Dec/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.2.0

Type: Task Priority: Major - P3
Reporter: Moustafa Maher Assignee: Matthew Russotto
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Replication 2021-11-15, Replication 2021-11-29, Replication 2021-12-13
Participants:

 Description   

Server parameters under investigation:
fileBasedInitialSyncMaxLagSec
fileBasedInitialSyncMaxCyclesWithoutProgress



 Comments   
Comment by Githook User [ 03/Dec/21 ]

Author: Matthew Russotto <matthew.russotto@mongodb.com> (mtrussotto)

Message: SERVER-59729 Investigate to find the best default values for FCBIS parameters

We have a preliminary result of 69.3 GB in 4814 seconds (14.3 MB/sec) for the initialsync logkeeper
workload. We also measured a maximum insert rate on the order of 600 KB/sec in our performance
testing.

These will vary considerably with hardware and load, but they seem like reasonable numbers to start
with. If we assume a large database (1000 GB), we will transfer it in about 70000 seconds, in which
time we could have inserted another 42 GB of data. We'll transfer that in about 3000 seconds,
inserting another 1.8 GB of data in the meantime, which will transfer in about 125 seconds (we're
transferring only oplog journal files, so no write amplification). So 3 cycles should bring us
within a few minutes even under heavy load. We can set fileBasedInitialSyncMaxLagSec to 300
(5 minutes) and fileBasedInitialSyncMaxCyclesWithoutProgress to 3.

Setting fileBasedInitialSyncMaxLagSec much smaller likely doesn't help, because we cannot apply
oplog entries while downloading in FCBIS (once we write to the data store through WT we cannot
add log files); any time we spend in oplog application will have to be made up after the node begins
normal operation.
Branch: master
https://github.com/mongodb/mongo/commit/df86cb8567b38302ac1ec616566523c05182888c

Comment by Matthew Russotto [ 02/Dec/21 ]

We have a preliminary result of 69.3 GB in 4814 seconds (14.3 MB/sec) for the initialsync logkeeper workload. We also measured a maximum insert rate on the order of 600 KB/sec in our performance testing, which is not far from the numbers in a very recent doc: https://www.mongodb.com/developer/how-to/mongodb-network-compression/

These will vary considerably with hardware and load, but they seem like reasonable numbers to start with. If we assume a large database (1000 GB), we will transfer it in about 70000 seconds, in which time we could have inserted another 42 GB of data. We'll transfer that in about 3000 seconds, inserting another 1.8 GB of data in the meantime, which will transfer in about 125 seconds (we're transferring only oplog journal files, so no write amplification). So 3 cycles should bring us within a few minutes even under heavy load. We can set fileBasedInitialSyncMaxLagSec to 300 (5 minutes) and fileBasedInitialSyncMaxCyclesWithoutProgress to 3.

Setting fileBasedInitialSyncMaxLagSec much smaller likely doesn't help, because we cannot apply oplog entries while downloading in FCBIS (once we write to the data store through WT we cannot add log files); any time we spend in oplog application will have to be made up after the node begins normal operation.
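
The catch-up arithmetic above can be reproduced with a short Python sketch. It is only an illustration of the estimate, not the server's logic: the function name simulate_catch_up and its arguments are hypothetical, units are decimal (1 GB = 1000 MB), and the lag after a cycle is approximated by how long that cycle's transfer took. Unrounded results differ slightly from the rounded figures quoted in the comment.

```python
def simulate_catch_up(db_gb=1000.0, transfer_mb_s=14.3, insert_kb_s=600.0,
                      max_lag_sec=300.0):
    """Estimate FCBIS transfer cycles as (data_gb, transfer_seconds) tuples.

    Hypothetical model: each cycle copies whatever was written while the
    previous cycle was running, and stops once a cycle takes no longer
    than max_lag_sec.
    """
    insert_mb_s = insert_kb_s / 1000.0
    if insert_mb_s >= transfer_mb_s:
        # Writes arrive at least as fast as we copy, so we never catch up.
        raise ValueError("insert rate must be lower than transfer rate")

    cycles = []
    pending_mb = db_gb * 1000.0  # first cycle copies the whole database
    while True:
        seconds = pending_mb / transfer_mb_s
        cycles.append((pending_mb / 1000.0, seconds))
        if seconds <= max_lag_sec:
            return cycles
        # Data inserted on the sync source while this cycle was copying.
        pending_mb = seconds * insert_mb_s


if __name__ == "__main__":
    for number, (gb, seconds) in enumerate(simulate_catch_up(), start=1):
        print(f"cycle {number}: {gb:.1f} GB in {seconds:.0f} s")
```

With the measured rates, each cycle shrinks the outstanding data by a factor of roughly 0.6 / 14.3 (about 4%), which is why three cycles are enough to get under a 300-second lag in this model.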
