[SERVER-20185] Scaling issue at high connection count with journal enabled under WiredTiger Created: 28/Aug/15  Updated: 11/Jan/16  Resolved: 16/Sep/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.6, 3.1.7
Fix Version/s: 3.1.8

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Susan LoVerso
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File journal-scaling.png     PNG File new-journal.png    
Issue Links:
Depends
depends on WT-2031 Buffer log records in memory to impro... Closed
Related
related to SERVER-20409 Negative scaling with more than 10K c... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   
  • 6 cores, 64 GB memory (everything fits in cache)
  • the test below issues 10 inserts/s per connection and ramps up to 10k connections, for a total expected throughput of 100k inserts/s (a quick sanity check of this arithmetic follows the list)
  • measured max throughput at small connection counts was 300k/s without journal and 200k/s with journal, so this test, with a maximum expected throughput of only 100k/s, does not tax the total capacity of the system but rather probes the effect of high connection counts at relatively low op rates per connection
  • 3.0.6 build used is actually 3.0.6 + fixes for SERVER-20091
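
A quick sanity check of the target load, using only the numbers above (the variable names here are illustrative, not part of the repro):

var connections = 10000                // connections at the top of the ramp
var insertsPerSecPerConn = 10          // one insert roughly every 100ms per connection
var expectedInsertsPerSec = connections * insertsPerSecPerConn
print(expectedInsertsPerSec)           // 100000/s, well below the measured ceilings of
                                       // ~300k/s (no journal) and ~200k/s (with journal)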

  • expected scaling is achieved without journal (green in the attached journal-scaling.png)
  • under 3.0.6 with journal enabled only 25% of expected throughput is reached; this is consistent from run to run (red)
  • in 3.1, 50-75% of expected throughput is reached, but there is striking run-to-run variability (yellow, blue, purple)

Repro code:

function conns() {
    return db.serverStatus().connections.current
}
 
function ops() {
    return db.serverStatus().opcounters.insert
}
 
function repro(threads_insert) {
 
    // run forever
    seconds = 10000
 
    // starting stats
    last_conns = curr_conns = start_conns = conns()
    last_time = new Date()
    last_ops = ops()
 
    // loop starting new connections
    while (curr_conns < start_conns+threads_insert) {
 
        // start 10 more insert threads with a random delay around 100ms (10 inserts/second/thread)
        res = benchStart({
            ops: [{
                op: "insert",
                ns: "test.c",
                doc: {},
                delay: NumberInt(100 + Math.random()*10-5)
            }],
            seconds: seconds,
            parallel: 10
        })
 
        // 10 new connections every 100ms
        sleep(100)
 
        // print op rate vs connections
        curr_conns = conns()
        if (curr_conns-last_conns >= 100) {
            curr_time = new Date()
            curr_ops = ops()
            ops_per_sec = Math.round((curr_ops - last_ops) / ((curr_time - last_time) / 1000.0))
            avg_conns = (last_conns+curr_conns) / 2
            print('' + avg_conns + '\t' + ops_per_sec)
            last_time = curr_time
            last_ops = curr_ops
            last_conns = curr_conns
        }
    }
 
    // run forever
    sleep(seconds*1000)
}
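
For reference, one way to drive the repro (the mongod flags are standard, but the dbpath, port, and script filename are assumptions, not from the original report): start mongod with WiredTiger, with and without the journal, then load the functions above in a mongo shell and ramp up to 10k connections.

// start mongod with WiredTiger, journal enabled (the default), e.g.:
//   mongod --storageEngine wiredTiger --dbpath /data/db --port 27017
// or, for the no-journal baseline:
//   mongod --storageEngine wiredTiger --nojournal --dbpath /data/db --port 27017
// then, from a mongo shell connected to that mongod:
load("repro.js")    // the functions above, saved as repro.js (hypothetical filename)
repro(10000)        // ramp to 10k insert connections at ~10 inserts/s each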



 Comments   
Comment by Michael Cahill (Inactive) [ 16/Sep/15 ]

boylook, this is one of the substantial changes in WiredTiger for MongoDB 3.2. There are no plans to backport it to 3.0: there are many code changes involved, and they would likely destabilize the 3.0 branch.

Comment by hongyu.bi [ 16/Sep/15 ]

Hi, will 3.0.x get a backport? We're using 3.0.6 in production.
Thanks

Comment by Daniel Pasette (Inactive) [ 16/Sep/15 ]

Closing as fixed. Any additional issues should be raised as separate issues.

Comment by Bruce Lucas (Inactive) [ 15/Sep/15 ]

I ran against a build I did this morning from MongoDB master, so not the WT develop branch, and not with --enable-diagnostic. I didn't encounter any functional problems, but this was just a performance test.

Comment by Susan LoVerso [ 15/Sep/15 ]

Thanks Bruce! What did you run with? The drop last week had a few bugs in it that have since been fixed. Are you running against our develop branch? Also, are you running with --enable-diagnostic? I have some "force the less common code path" logic for unbuffered writes in there in diagnostic mode.

Comment by Bruce Lucas (Inactive) [ 15/Sep/15 ]

New journal algorithm shows huge improvement, with nearly perfect scaling up to 10k connections. Thanks sue.loverso!

Above 10k connections there is some negative scaling, but performance with journal remains close to performance without journal, so from my perspective it appears the new journal algorithm fixes this issue. I'll spin off a separate ticket to investigate the negative scaling not related to the journal.

Comment by Michael Cahill (Inactive) [ 11/Sep/15 ]

sue.loverso, can you please try this workload with MongoDB 3.1.8 (once your changes for WT-2031 and later are merged)?

Comment by Daniel Pasette (Inactive) [ 28/Aug/15 ]

Parking with Bruce until WT-2031 is resolved and merged.
