[SERVER-20961] Large amounts of create and drop collections can cause listDatabases to be slow under WiredTiger Created: 16/Oct/15  Updated: 28/Dec/15  Resolved: 20/Nov/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.6, 3.0.7, 3.2.0-rc0
Fix Version/s: 3.0.9, 3.2.0-rc4

Type: Bug Priority: Major - P3
Reporter: David Hows Assignee: David Hows
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: threads (HTML File), threads2 (HTML File)
Issue Links:
Depends
depends on WT-2174 Enhance size statistics to not wait f... Closed
Related
is related to SERVER-20716 Slow listDatabases with WT Closed
is related to SERVER-20877 Under cache-full conditions serverSta... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Steps To Reproduce:

require "mongo"
 
client1 = Mongo::Client.new('mongodb://127.0.0.1:27017/')
client2 = Mongo::Client.new('mongodb://127.0.0.1:27017/')
client3 = Mongo::Client.new('mongodb://127.0.0.1:27017/')
client4 = Mongo::Client.new('mongodb://127.0.0.1:27017/')
 
Mongo::Logger.logger.level = Logger::WARN
 
for x in 0..100
    thr = Thread.new {
#        sleep 1
        before = Time.now
        client1.use("admin").command(:listDatabases => 1)
        after= Time.now
        p "took: " + (after-before).to_s
    }
        thr1 = Thread.new {
        for i in 0..100
                name = "abcd" + ((x*100)+i).to_s;
                client2.use(name)[:foo].indexes.create_one({ :x => 1})
                client2.use(name)[:foo].insert_one({ :x => 1})
                client2.use(name)[:baz].indexes.create_one({ :x => 1})
                client2.use(name)[:bar].insert_one({ :x => 1})
        end
        }
        thr2 = Thread.new {
        for i in 0..100
                name = "abcd" + ((x*100)+i).to_s;
                client3.use(name)[:bar].indexes.create_one({ :x => 1})
                client3.use(name)[:bar].insert_one({ :x => 1})
                client3.use(name)[:qux].indexes.create_one({ :x => 1})
                client3.use(name)[:qux].insert_one({ :x => 1})
        end
        }
        thr3 = Thread.new {
        for i in 0..100
                name = "abcd" + ((x*100)+i).to_s;
                client4.use(name)[:alpha].indexes.create_one({ :x => 1})
                client4.use(name)[:alpha].insert_one({ :x => 1})
                client4.use(name)[:beta].indexes.create_one({ :x => 1})
                client4.use(name)[:beta].insert_one({ :x => 1})
        end
        }
        thr1.join
        thr2.join
        thr3.join
        thr4 = Thread.new {
        for i in 0..100
                name = "abcd" + ((x*100)+i).to_s;
                client2.use(name)[:foo].drop
                client3.use(name)[:baz].drop
                client4.use(name)[:beta].drop
        end
        }
        thr4.join
    sleep 20
    thr.join()
end

Participants:

 Description   

Under a workload that creates a high volume of collections (and indexes) across a number of databases and then drops those collections, the listDatabases command runs significantly longer on a WiredTiger instance.

Running the reproduction script gives the following execution times for listDatabases, per lap, in seconds:

"took: 1.705329872"
"took: 1.328905619"
"took: 1.55636039"
"took: 1.221779232"
"took: 1.772605573"
"took: 1.689325783"
"took: 1.15700596"
"took: 1.470458173"
"took: 1.255596249"
"took: 1.455441504"
"took: 8.487761074"
"took: 13.784377947"
"took: 15.079964722"
"took: 13.892499541"
"took: 15.179368285"
"took: 14.683478172"
"took: 14.776398389"
"took: 14.481528086"
"took: 14.688469802"
"took: 13.738872847"
"took: 14.298077263"



 Comments   
Comment by Daniel Pasette (Inactive) [ 28/Dec/15 ]

Backport to 3.0 complete

commit 053f1442ac1c0d50dcad23c6528df43a75b2fcb6
Author: Dan Pasette <dan@mongodb.com>
Date: Wed Dec 23 18:26:28 2015 -0500

Import wiredtiger-wiredtiger-mongodb-3.0.7-22-ga1ddc5e.tar.gz from wiredtiger branch mongodb-3.0

Comment by David Hows [ 19/Nov/15 ]

I've re-tested on master with the mentioned patches pulled in.

Results are the same as before: nominally good response times, with one or two occasions where response times spike due to contention on the Database lock. I believe we can mark this as closed for 3.2 and work on the backport to 3.0.

Script results:

"took: 0.014780024"
"took: 0.12277441"
"took: 0.119202708"
"took: 0.063026424"
"took: 0.066182342"
"took: 0.12123077"
"took: 0.110646549"
"took: 6.031858265"
"took: 0.130959283"
"took: 0.138787366"
"took: 0.106623675"
"took: 0.196402118"
"took: 0.182217924"
"took: 0.117814905"
"took: 0.275733352"
"took: 4.424616586"
"took: 0.362121995"
"took: 0.210682331"
"took: 0.204168197"
"took: 0.169111846"

Long Commands

2015-11-19T13:04:14.756+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:39162 locks:{ Global: { acquireCount: { r: 1404 } }, Database: { acquireCount: { r: 702 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 5986833 } } } protocol:op_query 6025ms
2015-11-19T13:12:39.484+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:84965 locks:{ Global: { acquireCount: { r: 3004 } }, Database: { acquireCount: { r: 1502 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 4331430 } } } protocol:op_query 4413ms

Comment by David Hows [ 18/Nov/15 ]

Looks much better.

Ran until the script exceeded my current ulimit setting (20,000 open files) due to the number of databases created. Only a few sporadic long listDatabases calls:

"took: 0.014449721"
"took: 0.084979322"
"took: 0.064145598"
"took: 0.059052183"
"took: 0.079239813"
"took: 0.047060186"
"took: 0.098715098"
"took: 0.092227032"
"took: 1.478325573"
"took: 0.144919313"
"took: 0.113473037"
"took: 0.179162595"
"took: 0.115907827"
"took: 0.101242269"
"took: 0.143895919"
"took: 7.261129736"
"took: 0.204391548"
"took: 0.199370495"
"took: 0.140885415"
"took: 0.139796296"
"took: 0.195629151"
"took: 0.324222308"

Looks like the long ones are due to a high timeAcquiringMicros on the Database lock; in the second entry below, the ~7.16 s spent acquiring it (timeAcquiringMicros: { r: 7162083 }) accounts for nearly all of the 7248ms total.

2015-11-18T15:57:21.901+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:44762 locks:{ Global: { acquireCount: { r: 1604 } }, Database: { acquireCount: { r: 802 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 1424824 } } } protocol:op_query 1470ms
2015-11-18T16:04:38.178+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:84965 locks:{ Global: { acquireCount: { r: 3004 } }, Database: { acquireCount: { r: 1502 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 7162083 } } } protocol:op_query 7248ms

Tried to compare with MMAPv1 on the same version, but got a crash in MMAPv1 at a low number of collections:

2015-11-18T16:20:52.867+1100 I INDEX    [conn11] allocating new ns file /data/data/db/abcd361.ns, filling with zeroes...
2015-11-18T16:20:52.870+1100 I STORAGE  [conn11] In File::write(), ::pwrite for '/data/data/db/abcd361.ns' tried to write 1048576 bytes but only wrote 815104 bytes, failing with errno:2 No such file or directory
2015-11-18T16:20:52.870+1100 I -        [conn11] Assertion: 18826:failure writing file /data/data/db/abcd361.ns

Comment by Alexander Gorrod [ 16/Oct/15 ]

In the three traces above, all threads want the table list lock, taken via the WT_WITH_TABLE_LOCK call. The thread that is creating a collection holds the lock; the other two threads are waiting for it. The operation that wants to create an index has no choice but to wait for the table list lock.

The operation that retrieves the on-disk sizes of the files that make up a table could potentially avoid taking the table list lock. WiredTiger recently added a "size only" mode for statistics cursors that avoids needing the schema lock when retrieving statistics. We could update the size-only cursor implementation to also avoid taking the table lock for tables with a simple structure.
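
For illustration only, here is a minimal sketch of a size-only statistics lookup at the WiredTiger API level. This is not the MongoDB code path: the home directory, the example table name, and the choice of WT_STAT_DSRC_BLOCK_SIZE as the statistic key are assumptions, while the "statistics:table:..." URI, the "statistics=(size)" configuration, and the set_key/search pattern mirror what is visible in mongo::WiredTigerUtil::getStatisticsValue in the traces below.

#include <inttypes.h>
#include <stdio.h>
#include <wiredtiger.h>

int
main(void)
{
	WT_CONNECTION *conn;
	WT_SESSION *session;
	WT_CURSOR *cursor;
	const char *desc, *pvalue;
	int64_t value;	/* some WiredTiger releases type statistic values as uint64_t */

	/* Hypothetical home directory and table name, for illustration only. */
	if (wiredtiger_open("WT_HOME", NULL, "create,statistics=(fast)", &conn) != 0)
		return (1);
	conn->open_session(conn, NULL, NULL, &session);
	session->create(session, "table:collection-example", "key_format=q,value_format=u");

	/*
	 * "statistics=(size)" requests the size-only statistics mode discussed
	 * above, rather than gathering the full statistics set.
	 */
	if (session->open_cursor(session, "statistics:table:collection-example",
	    NULL, "statistics=(size)", &cursor) != 0)
		return (1);

	/*
	 * Position the cursor on a single statistic instead of iterating them
	 * all: set_key with the statistic's key, then search. This is the
	 * pattern getStatisticsValue follows (statisticsKey=2009 in the trace).
	 */
	cursor->set_key(cursor, WT_STAT_DSRC_BLOCK_SIZE);
	if (cursor->search(cursor) == 0 &&
	    cursor->get_value(cursor, &desc, &pvalue, &value) == 0)
		printf("%s: %" PRId64 "\n", desc, value);

	cursor->close(cursor);
	conn->close(conn, NULL);
	return (0);
}

Per WT-2174, the intent is that a cursor opened in this mode can return on-disk sizes without collecting full statistics, and, with the change proposed above, without waiting on the schema or table list locks held by concurrent create/drop operations.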

Comment by Alexander Gorrod [ 16/Oct/15 ]

Thanks for uploading the stack traces, david.hows. From the first set of traces I can see that one thread is creating an index:

#2  0x0000000001bbae6f in __wt_spin_lock (session=0x3cfd900, t=<optimized out>) at src/third_party/wiredtiger/src/include/mutex.i:159
#3  __wt_session_create (session=0x3cfd900, uri=0x1ce447d8 "table:index-10731-8723464202028817165", config=0xdc9a5b8 "type=file,internal_page_max=16k,leaf_page_max=16k,checksum=on,prefix_compression=true,block_compressor=,,,,key_format=u,value_format=u,app_metadata=(formatVersion=6,infoObj={ \"v\" : 1, \"key\" : { \"x\" : "...) at src/third_party/wiredtiger/src/session/session_api.c:395
#4  0x0000000001bbb0de in __session_create (wt_session=0x3cfd900, uri=<optimized out>, config=0xdc9a5b8 "type=file,internal_page_max=16k,leaf_page_max=16k,checksum=on,prefix_compression=true,block_compressor=,,,,key_format=u,value_format=u,app_metadata=(formatVersion=6,infoObj={ \"v\" : 1, \"key\" : { \"x\" : "...) at src/third_party/wiredtiger/src/session/session_api.c:442
#5  0x00000000011e6f23 in mongo::WiredTigerIndex::Create (txn=<optimized out>, uri="table:index-10731-8723464202028817165", config="type=file,internal_page_max=16k,leaf_page_max=16k,checksum=on,prefix_compression=true,block_compressor=,,,,key_format=u,value_format=u,app_metadata=(formatVersion=6,infoObj={ \"v\" : 1, \"key\" : { \"x\" : "...) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:210
#6  0x00000000011f32da in mongo::WiredTigerKVEngine::createSortedDataInterface (this=<optimized out>, opCtx=0x22c0cee0, ident=..., desc=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:420
#7  0x0000000001126b15 in mongo::KVCollectionCatalogEntry::prepareForIndexBuild (this=0x1ce44f00, txn=0x22c0cee0, spec=0x2c7fdb80) at src/mongo/db/storage/kv/kv_collection_catalog_entry.cpp:133
#8  0x0000000000ad5eef in mongo::IndexCatalog::IndexBuildBlock::init (this=0x2b88b660) at src/mongo/db/catalog/index_catalog.cpp:360
#9  0x0000000000adcb83 in mongo::MultiIndexBlock::init (this=this@entry=0x7f945b8820a0, indexSpecs=std::vector of length 1, capacity 1 = {...}) at src/mongo/db/catalog/index_create.cpp:179
#10 0x0000000000b2616a in mongo::CmdCreateIndex::run (this=<optimized out>, txn=0x22c0cee0, dbname="abcd1251", cmdObj=..., options=<optimized out>, errmsg=..., result=...) at src/mongo/db/commands/create_indexes.cpp:224
#11 0x0000000000bfd220 in mongo::Command::run (this=this@entry=0x2106ba0 <mongo::cmdCreateIndex>, txn=txn@entry=0x22c0cee0, request=..., replyBuilder=replyBuilder@entry=0x7f945b882a50) at src/mongo/db/dbcommands.cpp:1388

Another thread is creating a collection:

#0  0x00007f946500706d in fdatasync () from /usr/lib/libc.so.6
#1  0x0000000001b8bfc2 in __wt_handle_sync (fd=14828) at src/third_party/wiredtiger/src/os_posix/os_fsync.c:44
#2  0x0000000001b8c3fa in __wt_fsync (session=session@entry=0x3cfdc00, fh=0x28424a20) at src/third_party/wiredtiger/src/os_posix/os_fsync.c:129
#3  0x0000000001b06195 in __wt_block_manager_create (session=session@entry=0x3cfdc00, filename=filename@entry=0x2aef1bd5 "collection-10730-8723464202028817165.wt", allocsize=4096) at src/third_party/wiredtiger/src/block/block_open.c:99
#4  0x0000000001ba7967 in __create_file (config=<optimized out>, exclusive=false, uri=0x2aef1bd0 "file:collection-10730-8723464202028817165.wt", session=0x3cfdc00) at src/third_party/wiredtiger/src/schema/schema_create.c:104
#5  __wt_schema_create (session=session@entry=0x3cfdc00, uri=<optimized out>, uri@entry=0x2aef1bd0 "file:collection-10730-8723464202028817165.wt", config=<optimized out>) at src/third_party/wiredtiger/src/schema/schema_create.c:691
#6  0x0000000001ba8595 in __create_colgroup (session=session@entry=0x3cfdc00, name=<optimized out>, exclusive=exclusive@entry=false, config=config@entry=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/schema/schema_create.c:266
#7  0x0000000001ba7f67 in __create_table (config=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)", exclusive=false, name=0x2c9ceb38 "table:collection-10730-8723464202028817165", session=0x3cfdc00) at src/third_party/wiredtiger/src/schema/schema_create.c:616
#8  __wt_schema_create (session=session@entry=0x3cfdc00, uri=<optimized out>, uri@entry=0x2c9ceb38 "table:collection-10730-8723464202028817165", config=config@entry=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/schema/schema_create.c:697
#9  0x0000000001bbaeab in __wt_session_create (session=0x3cfdc00, uri=0x2c9ceb38 "table:collection-10730-8723464202028817165", config=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/session/session_api.c:395
#10 0x0000000001bbb0de in __session_create (wt_session=0x3cfdc00, uri=<optimized out>, config=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/session/session_api.c:442
#11 0x00000000011f2c24 in mongo::WiredTigerKVEngine::createRecordStore (this=<optimized out>, opCtx=<optimized out>, ns=..., ident=..., options=...) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:363
#12 0x0000000001129003 in mongo::KVDatabaseCatalogEntry::createCollection (this=0xa3acfa0, txn=0x24b3a530, ns=..., options=..., allocateDefaultSpace=true) at src/mongo/db/storage/kv/kv_database_catalog_entry.cpp:213

Another thread is attempting to get size statistics for the listDatabases command:

#2  0x0000000001baa02b in __wt_spin_lock (session=0x3cfd300, t=<optimized out>) at src/third_party/wiredtiger/src/include/mutex.i:159
#3  __schema_add_table (tablep=<synthetic pointer>, ok_incomplete=false, namelen=30, name=0x2b8880c9 "index-23236-808695384733862891", session=0x3cfd300) at src/third_party/wiredtiger/src/schema/schema_list.c:26
#4  __wt_schema_get_table (session=session@entry=0x3cfd300, name=name@entry=0x2b8880c9 "index-23236-808695384733862891", namelen=30, ok_incomplete=ok_incomplete@entry=false, tablep=tablep@entry=0x7f945b47ddf8) at src/third_party/wiredtiger/src/schema/schema_list.c:98
#5  0x0000000001bb60e3 in __wt_curstat_table_init (session=session@entry=0x3cfd300, uri=uri@entry=0x2b8880c3 "table:index-23236-808695384733862891", cfg=0x22e01ca0, cst=0x2b870000) at src/third_party/wiredtiger/src/schema/schema_stat.c:72
#6  0x0000000001b6733e in __wt_curstat_init (session=session@entry=0x3cfd300, uri=<optimized out>, uri@entry=0x2b8880b8 "statistics:table:index-23236-808695384733862891", cfg=<optimized out>, cst=<optimized out>) at src/third_party/wiredtiger/src/cursor/cur_stat.c:459
#7  0x0000000001b67c2e in __wt_curstat_open (session=session@entry=0x3cfd300, uri=uri@entry=0x2b8880b8 "statistics:table:index-23236-808695384733862891", cfg=cfg@entry=0x7f945b47df70, cursorp=cursorp@entry=0x7f945b47df68) at src/third_party/wiredtiger/src/cursor/cur_stat.c:584
#8  0x0000000001bba035 in __wt_open_cursor (session=session@entry=0x3cfd300, uri=<optimized out>, uri@entry=0x2b8880b8 "statistics:table:index-23236-808695384733862891", owner=owner@entry=0x0, cfg=cfg@entry=0x7f945b47df70, cursorp=cursorp@entry=0x7f945b47df68) at src/third_party/wiredtiger/src/session/session_api.c:296
#9  0x0000000001bba5c7 in __session_open_cursor (wt_session=0x3cfd300, uri=0x2b8880b8 "statistics:table:index-23236-808695384733862891", to_dup=0x0, config=<optimized out>, cursorp=0x7f945b47e010) at src/third_party/wiredtiger/src/session/session_api.c:359
#10 0x0000000001213fa6 in mongo::WiredTigerUtil::getStatisticsValue (session=session@entry=0x3cfd300, uri="statistics:table:index-23236-808695384733862891", config="statistics=(size)", statisticsKey=statisticsKey@entry=2009) at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:267
#11 0x0000000001214c5e in mongo::WiredTigerUtil::getStatisticsValueAs<long> (maximumResultType=9223372036854775807, statisticsKey=2009, config="statistics=(size)", uri="statistics:table:index-23236-808695384733862891", session=0x3cfd300) at src/mongo/db/storage/wiredtiger/wiredtiger_util.h:277
#12 mongo::WiredTigerUtil::getStatisticsValueAs<long> (statisticsKey=2009, config="statistics=(size)", uri="statistics:table:index-23236-808695384733862891", session=0x3cfd300) at src/mongo/db/storage/wiredtiger/wiredtiger_util.h:267
#13 mongo::WiredTigerUtil::getIdentSize (s=0x3cfd300, uri="table:index-23236-808695384733862891") at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:299
#14 0x00000000011f1eb4 in mongo::WiredTigerKVEngine::getIdentSize (this=0x3b48000, opCtx=<optimized out>, ident=...) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:259
#15 0x00000000011276d7 in mongo::KVDatabaseCatalogEntry::sizeOnDisk (this=0xc921c70, opCtx=0x2bb27cd0) at src/mongo/db/storage/kv/kv_database_catalog_entry.cpp:151
#16 0x0000000000b675e3 in mongo::CmdListDatabases::run (this=<optimized out>, txn=<optimized out>, dbname=..., jsobj=..., errmsg=..., result=...) at src/mongo/db/commands/list_databases.cpp:107
#17 0x0000000000bfd220 in mongo::Command::run (this=this@entry=0x21076a0 <mongo::cmdListDatabases>, txn=txn@entry=0x2bb27cd0, request=..., replyBuilder=replyBuilder@entry=0x7f945b47f0f0) at src/mongo/db/dbcommands.cpp:1388

Comment by David Hows [ 16/Oct/15 ]

I've attached two thread traces from 3.2, showing all threads while listDatabases was running.

Comment by David Hows [ 16/Oct/15 ]

I've run this against 3.2.0-rc0 as well; it looks like it is also affected:

"took: 4.637525754"
"took: 4.183376828"
"took: 8.257991639"
"took: 2.959129782"
"took: 4.421300037"
