[SERVER-53667] High rate of journal flushes on secondary in 4.4 Created: 08/Jan/21  Updated: 19/Dec/22  Resolved: 13/Jun/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Matthew Russotto
Resolution: Duplicate Votes: 1
Labels: perf-escapes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2021-03-03 at 2.03.12 PM.png     PNG File Screen Shot 2021-03-03 at 2.03.34 PM.png     PNG File Screen Shot 2021-03-03 at 2.35.57 PM.png     PNG File Screen Shot 2021-03-03 at 2.36.01 PM.png     PNG File Screenshot 2021-04-20 at 10.14.07-1.png     PNG File Screenshot 2021-04-20 at 10.14.07.png     PNG File primary-asis.png     PNG File secondary_as_is.png    
Issue Links:
Depends
depends on SERVER-54938 Only flush journal once per batch on ... Closed
depends on SERVER-54939 Investigate secondary batching behavi... Closed
Documented
is documented by DOCS-14755 DOCS-14755 Investigate changes in SER... Backlog
Duplicate
duplicates SERVER-54939 Investigate secondary batching behavi... Closed
Related
related to SERVER-59776 50% regression in single multi-update Closed
is related to SERVER-25249 High I/O load in secondary nodes with... Closed
is related to SERVER-54938 Only flush journal once per batch on ... Closed
is related to SERVER-65723 Add tunable parameter to improve batc... Closed
Sprint: Execution Team 2021-01-25, Repl 2021-03-08, Repl 2021-03-22, Repl 2021-04-05, Repl 2022-03-21, Repl 2022-04-04, Repl 2022-04-18, Repl 2022-05-02, Repl 2022-05-16, Repl 2022-05-30, Repl 2022-06-13, Repl 2022-06-27
Participants:
Case:

 Description   

A simple workload with a single thread doing updates as fast as possible sees the following:

  • 4.2
    • primary
      • updates: 1200/s
      • log flushes: 300/s
    • secondary
      • batches: 280/s
      • log flushes: 300/s
  • 4.4
    • primary
      • updates: 1250/s
      • log flushes: 10/s; much improved in 4.4 due to PM-1274
    • seccondary
      • batches: 1200/s
      • log flushes: 2400/s; much worse in 4.4 than 4.2, and much higher than primary

This creates a much higher i/o requirement on the secondary in 4.4, and much higher on the secondary than the primary.

Two possible issues:

  • Do we need to flush more often on the secondary than the primary?
  • Do we need to do 2 flushes per batch?


 Comments   
Comment by Matthew Russotto [ 13/Jun/22 ]

The issue of doing two journal flushes per batch was fixed in SERVER-54938 (backported to 4.4.5). The reduced batch size on 4.4 is due to changes designed to improve read concern majority latency. We have added a tunable parameter in SERVER-65723 to allow increased batching at the cost of higher read concern majority latency.

Comment by Lingzhi Deng [ 08/Mar/21 ]

SERVER-54938 answered and fixed the second issue in the description (i.e. Do we need to do 2 flushes per batch?) and will be backported to 4.4. We still need more investigation in SERVER-54939 to answer the first question. Once we have an answer to that and if we have a fix, we will definitely fix that in 4.4+.

Comment by Eric Lafontaine [ 08/Mar/21 ]

Hi All,

Just wondering if this will be fixed in Mongo 4.x or only in Mongo 5.x as per the bug resolution ticket.  The difference is huge on our side if the fix is backward incompatible.

Regards,

Eric Lafontaine

Comment by Bruce Lucas (Inactive) [ 05/Mar/21 ]

jason.chan, it's not completely clear to me whether the increased number of batches is the real issue or whether changing the number of batches is the solution, so I'd like to keep this ticket open until we've investigated that question.

Comment by Jason Chan [ 04/Mar/21 ]

Filed SERVER-54938 to remove the call to triggerJournalFlush and SERVER-54939 to investigate the increased number of batches between 4.2 and 4.4.

Comment by Bruce Lucas (Inactive) [ 03/Mar/21 ]

Here's the simple workload that was used to generate the numbers above.

Setup:

function insert() {
    db.c.drop()
    for (var i = 0; i < 1000*1000; i+=1000) {
        many = []
        for (var j = 0; j < 1000; j++)
            many.push({_id: (i+j), x: 0})
        db.c.insertMany(many)
    }
}

Workload:

function update() {
    while (true) {
        for (var i = 0; i < 1000*1000; i++) {
            db.c.update({_id: 1}, {$inc: {x: 1}}})
        }
    }
}

 

Comment by Dianna Hohensee (Inactive) [ 27/Jan/21 ]

I'm passing this back to replication to determine how to reduce (or parallelize) journal flush calls.

Comment by Bruce Lucas (Inactive) [ 08/Jan/21 ]

I assigned this to replication based on evin.roesle's suggestion, but PM-1274 was done by the execution team, so not sure who should look into this.

Comment by Bruce Lucas (Inactive) [ 08/Jan/21 ]

I suspect this is a side effect of PM-1274: in 4.2 we journal before replicating, so batch rate and therefore flush rate on secondary is determined by flush rate on primary. But in 4.4 we replicate before journaling, so the batch rate is much higher, often equal to operation rate.

Generated at Thu Feb 08 05:31:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.