Core Server / SERVER-19954

Extended stall during checkpoint with large number of dirty tables under WiredTiger

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.0.5
    • Fix Version/s: 3.2.0-rc3
    • Component/s: WiredTiger
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL

      Description

      A test that creates 50k dirty tables causes a stall of several minutes during the checkpoint.

      Shell script that calls the functions in test.js:

      # create a bunch of dirty tables and keep them dirty
      # first iteration through creates the tables
      # subsequent iterations keep them dirty
      # issue does not repro with a smaller number of iterations 
      # maybe tables need to become dirty after checkpoint starts?
      for i in $(seq 20); do
          (
              # insert on 20 threads in parallel into a total of 50k collections
              for t in $(seq 20); do
                  mongo --eval "load('test.js'); insert($t, 2500)" &
              done
              wait
          )
      done
       
      # now try to insert some data to make stall evident
      (
          for t in $(seq 20); do
              mongo --eval "load('test.js'); load()" &
          done
          wait
      )
      

      test.js:

      function insert(t, n) {
          // insert one document into each of n collections for this thread;
          // the first pass creates the collections, later passes re-dirty them
          for (var i = 0; i < n; i++) {
              var c = db['c' + t + '.' + i];
              c.insert({});
              if (i % 100 == 0)
                  print(t, i);
          }
      }
       
      function load() {
          // bulk-insert 1M documents into a single collection, 10k per batch;
          // a checkpoint stall shows up as a long pause between printed counts
          var count = 1000000;
          var every = 10000;
          for (var i = 0; i < count; ) {
              var bulk = db.c.initializeUnorderedBulkOp();
              for (var j = 0; j < every; j++, i++)
                  bulk.insert({});
              bulk.execute();
              print(i);
          }
      }
      

      Points A through D below refer to the attached throughput graph (image not reproduced here).

      • From A to B is the first iteration through the loop; it is slower than subsequent iterations because it is creating the 50k collections.
      • From B to C is the remaining 19 iterations of inserting one document into each of the 50k collections.
      • At C we begin trying to insert a bunch of data, but the inserts stall for several minutes, until D.
      • At D the checkpoint ends and inserts become unblocked.
      • Based on "cursor prev" calls, I think the single-document inserts into the 50k collections are also stalling during checkpoints. I suspect those stalls are shorter because, while the tables are still being created, they can only be dirtied as fast as they are created, so each checkpoint sees fewer dirty tables; from B to C the tables already exist and are dirtied much faster, so the checkpoint where the lengthy stall occurs sees many more dirty tables.

      This is the active thread during the stall:

      Thread 31 (Thread 0x7f5949f32700 (LWP 5161)):
      #0  __strcmp_ssse3 () at ../sysdeps/x86_64/multiarch/../strcmp.S:210
      #1  0x000000000134759a in __wt_meta_track_find_handle (session=session@entry=0x279d8c0, name=0xa8b9c30 "file:collection-6000-3939942762316306140.wt",
          checkpoint=checkpoint@entry=0x5e630aa0 "WiredTigerCheckpoint.1") at src/third_party/wiredtiger/src/meta/meta_track.c:245
      #2  0x000000000137994d in __wt_session_lock_checkpoint (session=session@entry=0x279d8c0, checkpoint=0x5e630aa0 "WiredTigerCheckpoint.1") at src/third_party/wiredtiger/src/session/session_dhandle.c:454
      #3  0x0000000001382885 in __checkpoint_worker (is_checkpoint=1, cfg=<optimized out>, session=<optimized out>) at src/third_party/wiredtiger/src/txn/txn_ckpt.c:924
      #4  __wt_checkpoint (session=<optimized out>, cfg=<optimized out>) at src/third_party/wiredtiger/src/txn/txn_ckpt.c:1127
      #5  0x0000000001381bf7 in __checkpoint_apply (session=0x279d8c0, cfg=0x7f5949f31a30, op=0x13823c0 <__wt_checkpoint>) at src/third_party/wiredtiger/src/txn/txn_ckpt.c:184
      #6  0x0000000001383785 in __wt_txn_checkpoint (session=session@entry=0x279d8c0, cfg=cfg@entry=0x7f5949f31a30) at src/third_party/wiredtiger/src/txn/txn_ckpt.c:501
      #7  0x0000000001376966 in __session_checkpoint (wt_session=0x279d8c0, config=<optimized out>) at src/third_party/wiredtiger/src/session/session_api.c:919
      #8  0x000000000130e38a in __ckpt_server (arg=0x279d8c0) at src/third_party/wiredtiger/src/conn/conn_ckpt.c:95
      #9  0x00007f594e1fb182 in start_thread (arg=0x7f5949f32700) at pthread_create.c:312
      #10 0x00007f594d2fc47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      

          Activity

          michael.cahill Michael Cahill added a comment -

          For the record, there is a change in WT-147 that should eliminate the need for the linear scan through the list of dirty handles during a checkpoint that is indicated as the main issue here. Once WT-147 is merged, we will retest to make sure this workload performs well and backport the necessary part of the change to the 3.0 branch.

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill <michael.cahill@mongodb.com> (michaelcahill)

          Message: SERVER-19954 Don't scan tracked handles during checkpoints.

          The change in WT-147 to allow a session to lock the same handle exclusive multiple times makes this unnecessary.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/215156b8545a42b8ae878019f1b830debe6f5fa0

          xgen-internal-githook Githook User added a comment -

          Author: Alex Gorrod <alexander.gorrod@mongodb.com> (agorrod)

          Message: Merge pull request #2247 from wiredtiger/SERVER-19954

          SERVER-19954 Don't scan tracked handles during checkpoints.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/46b4ad57ecbb935af2c1ae6fd546497dc620e64e

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill <michael.cahill@mongodb.com> (michaelcahill)

          Message: WT-2207 Track whenever a session has a handle exclusive.

          Previously, this was only tracked for newly-opened handles, but the change for
          SERVER-19954 relies on always detecting when a session has a handle exclusive,
          regardless of whether it was the first to open the handle.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/54557dd24139fa0f903d83735855ffea35377867

          michael.cahill Michael Cahill added a comment -

          Bruce Lucas, Martin Bligh, do you have any suggestions on how to re-test this?

          Once there are more than 10K tables, I now see all the time being spent opening and closing sessions and cursors, because the caching layer bounds how many we keep open since 3.0.6. I can code around that, but then I'm not testing how the system actually behaves.

          The original issue Bruce Lucas identified here (excessive time in __wt_meta_track_find_handle) has been fixed, maybe we can open a separate 3.3 ticket for "improve performance with WT with more than 10K active tables"?

          pasette Dan Pasette added a comment -

          Spun off a new ticket for follow up work: SERVER-21629

