Core Server / SERVER-20961

Large amounts of create and drop collections can cause listDatabases to be slow under WiredTiger

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.0.6, 3.0.7, 3.2.0-rc0
    • Fix Version/s: 3.0.9, 3.2.0-rc4
    • Component/s: WiredTiger
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Completed:
    • Steps To Reproduce:

      require "mongo"

      client1 = Mongo::Client.new('mongodb://127.0.0.1:27017/')
      client2 = Mongo::Client.new('mongodb://127.0.0.1:27017/')
      client3 = Mongo::Client.new('mongodb://127.0.0.1:27017/')
      client4 = Mongo::Client.new('mongodb://127.0.0.1:27017/')

      Mongo::Logger.logger.level = Logger::WARN

      for x in 0..100
        # Time listDatabases while the other threads create and drop collections.
        thr = Thread.new {
          before = Time.now
          client1.use("admin").command(:listDatabases => 1)
          after = Time.now
          p "took: " + (after - before).to_s
        }
        thr1 = Thread.new {
          for i in 0..100
            name = "abcd" + ((x * 100) + i).to_s
            client2.use(name)[:foo].indexes.create_one({ :x => 1 })
            client2.use(name)[:foo].insert_one({ :x => 1 })
            client2.use(name)[:baz].indexes.create_one({ :x => 1 })
            client2.use(name)[:bar].insert_one({ :x => 1 })
          end
        }
        thr2 = Thread.new {
          for i in 0..100
            name = "abcd" + ((x * 100) + i).to_s
            client3.use(name)[:bar].indexes.create_one({ :x => 1 })
            client3.use(name)[:bar].insert_one({ :x => 1 })
            client3.use(name)[:qux].indexes.create_one({ :x => 1 })
            client3.use(name)[:qux].insert_one({ :x => 1 })
          end
        }
        thr3 = Thread.new {
          for i in 0..100
            name = "abcd" + ((x * 100) + i).to_s
            client4.use(name)[:alpha].indexes.create_one({ :x => 1 })
            client4.use(name)[:alpha].insert_one({ :x => 1 })
            client4.use(name)[:beta].indexes.create_one({ :x => 1 })
            client4.use(name)[:beta].insert_one({ :x => 1 })
          end
        }
        thr1.join
        thr2.join
        thr3.join
        # Drop a subset of the collections after the create threads finish.
        thr4 = Thread.new {
          for i in 0..100
            name = "abcd" + ((x * 100) + i).to_s
            client2.use(name)[:foo].drop
            client3.use(name)[:baz].drop
            client4.use(name)[:beta].drop
          end
        }
        thr4.join
        sleep 20
        thr.join
      end
      


      Description

      Under a workload that creates a high volume of collections (and indexes) across many databases and then drops those collections, the listDatabases command runs much longer on a WiredTiger instance.

      Running the reproduction script shows the following listDatabases execution times per iteration, in seconds:

      "took: 1.705329872"
      "took: 1.328905619"
      "took: 1.55636039"
      "took: 1.221779232"
      "took: 1.772605573"
      "took: 1.689325783"
      "took: 1.15700596"
      "took: 1.470458173"
      "took: 1.255596249"
      "took: 1.455441504"
      "took: 8.487761074"
      "took: 13.784377947"
      "took: 15.079964722"
      "took: 13.892499541"
      "took: 15.179368285"
      "took: 14.683478172"
      "took: 14.776398389"
      "took: 14.481528086"
      "took: 14.688469802"
      "took: 13.738872847"
      "took: 14.298077263"
      

      1. threads
        33 kB
        David Hows
      2. threads2
        33 kB
        David Hows

        Issue Links

          Activity

          David Hows added a comment - edited

          I've run this against 3.2.0-rc0 as well; it looks like it is also affected:

          "took: 4.637525754"
          "took: 4.183376828"
          "took: 8.257991639"
          "took: 2.959129782"
          "took: 4.421300037"
          

          David Hows added a comment -

          I've attached two thread traces from 3.2, showing all threads when running listDatabases

          Alexander Gorrod added a comment -

          Thanks for uploading the stack traces, David Hows. From the first set of traces I can see that one thread is creating an index:

          #2  0x0000000001bbae6f in __wt_spin_lock (session=0x3cfd900, t=<optimized out>) at src/third_party/wiredtiger/src/include/mutex.i:159
          #3  __wt_session_create (session=0x3cfd900, uri=0x1ce447d8 "table:index-10731-8723464202028817165", config=0xdc9a5b8 "type=file,internal_page_max=16k,leaf_page_max=16k,checksum=on,prefix_compression=true,block_compressor=,,,,key_format=u,value_format=u,app_metadata=(formatVersion=6,infoObj={ \"v\" : 1, \"key\" : { \"x\" : "...) at src/third_party/wiredtiger/src/session/session_api.c:395
          #4  0x0000000001bbb0de in __session_create (wt_session=0x3cfd900, uri=<optimized out>, config=0xdc9a5b8 "type=file,internal_page_max=16k,leaf_page_max=16k,checksum=on,prefix_compression=true,block_compressor=,,,,key_format=u,value_format=u,app_metadata=(formatVersion=6,infoObj={ \"v\" : 1, \"key\" : { \"x\" : "...) at src/third_party/wiredtiger/src/session/session_api.c:442
          #5  0x00000000011e6f23 in mongo::WiredTigerIndex::Create (txn=<optimized out>, uri="table:index-10731-8723464202028817165", config="type=file,internal_page_max=16k,leaf_page_max=16k,checksum=on,prefix_compression=true,block_compressor=,,,,key_format=u,value_format=u,app_metadata=(formatVersion=6,infoObj={ \"v\" : 1, \"key\" : { \"x\" : "...) at src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp:210
          #6  0x00000000011f32da in mongo::WiredTigerKVEngine::createSortedDataInterface (this=<optimized out>, opCtx=0x22c0cee0, ident=..., desc=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:420
          #7  0x0000000001126b15 in mongo::KVCollectionCatalogEntry::prepareForIndexBuild (this=0x1ce44f00, txn=0x22c0cee0, spec=0x2c7fdb80) at src/mongo/db/storage/kv/kv_collection_catalog_entry.cpp:133
          #8  0x0000000000ad5eef in mongo::IndexCatalog::IndexBuildBlock::init (this=0x2b88b660) at src/mongo/db/catalog/index_catalog.cpp:360
          #9  0x0000000000adcb83 in mongo::MultiIndexBlock::init (this=this@entry=0x7f945b8820a0, indexSpecs=std::vector of length 1, capacity 1 = {...}) at src/mongo/db/catalog/index_create.cpp:179
          #10 0x0000000000b2616a in mongo::CmdCreateIndex::run (this=<optimized out>, txn=0x22c0cee0, dbname="abcd1251", cmdObj=..., options=<optimized out>, errmsg=..., result=...) at src/mongo/db/commands/create_indexes.cpp:224
          #11 0x0000000000bfd220 in mongo::Command::run (this=this@entry=0x2106ba0 <mongo::cmdCreateIndex>, txn=txn@entry=0x22c0cee0, request=..., replyBuilder=replyBuilder@entry=0x7f945b882a50) at src/mongo/db/dbcommands.cpp:1388
          

          Another thread is creating a collection:

          #0  0x00007f946500706d in fdatasync () from /usr/lib/libc.so.6
          #1  0x0000000001b8bfc2 in __wt_handle_sync (fd=14828) at src/third_party/wiredtiger/src/os_posix/os_fsync.c:44
          #2  0x0000000001b8c3fa in __wt_fsync (session=session@entry=0x3cfdc00, fh=0x28424a20) at src/third_party/wiredtiger/src/os_posix/os_fsync.c:129
          #3  0x0000000001b06195 in __wt_block_manager_create (session=session@entry=0x3cfdc00, filename=filename@entry=0x2aef1bd5 "collection-10730-8723464202028817165.wt", allocsize=4096) at src/third_party/wiredtiger/src/block/block_open.c:99
          #4  0x0000000001ba7967 in __create_file (config=<optimized out>, exclusive=false, uri=0x2aef1bd0 "file:collection-10730-8723464202028817165.wt", session=0x3cfdc00) at src/third_party/wiredtiger/src/schema/schema_create.c:104
          #5  __wt_schema_create (session=session@entry=0x3cfdc00, uri=<optimized out>, uri@entry=0x2aef1bd0 "file:collection-10730-8723464202028817165.wt", config=<optimized out>) at src/third_party/wiredtiger/src/schema/schema_create.c:691
          #6  0x0000000001ba8595 in __create_colgroup (session=session@entry=0x3cfdc00, name=<optimized out>, exclusive=exclusive@entry=false, config=config@entry=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/schema/schema_create.c:266
          #7  0x0000000001ba7f67 in __create_table (config=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)", exclusive=false, name=0x2c9ceb38 "table:collection-10730-8723464202028817165", session=0x3cfdc00) at src/third_party/wiredtiger/src/schema/schema_create.c:616
          #8  __wt_schema_create (session=session@entry=0x3cfdc00, uri=<optimized out>, uri@entry=0x2c9ceb38 "table:collection-10730-8723464202028817165", config=config@entry=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/schema/schema_create.c:697
          #9  0x0000000001bbaeab in __wt_session_create (session=0x3cfdc00, uri=0x2c9ceb38 "table:collection-10730-8723464202028817165", config=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/session/session_api.c:395
          #10 0x0000000001bbb0de in __session_create (wt_session=0x3cfdc00, uri=<optimized out>, config=0x28aaa418 "type=file,memory_page_max=10m,split_pct=90,leaf_value_max=64MB,checksum=on,block_compressor=snappy,,key_format=q,value_format=u,app_metadata=(formatVersion=1)") at src/third_party/wiredtiger/src/session/session_api.c:442
          #11 0x00000000011f2c24 in mongo::WiredTigerKVEngine::createRecordStore (this=<optimized out>, opCtx=<optimized out>, ns=..., ident=..., options=...) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:363
          #12 0x0000000001129003 in mongo::KVDatabaseCatalogEntry::createCollection (this=0xa3acfa0, txn=0x24b3a530, ns=..., options=..., allocateDefaultSpace=true) at src/mongo/db/storage/kv/kv_database_catalog_entry.cpp:213
          

          Another thread is attempting to get size statistics for the listDatabases command:

          #2  0x0000000001baa02b in __wt_spin_lock (session=0x3cfd300, t=<optimized out>) at src/third_party/wiredtiger/src/include/mutex.i:159
          #3  __schema_add_table (tablep=<synthetic pointer>, ok_incomplete=false, namelen=30, name=0x2b8880c9 "index-23236-808695384733862891", session=0x3cfd300) at src/third_party/wiredtiger/src/schema/schema_list.c:26
          #4  __wt_schema_get_table (session=session@entry=0x3cfd300, name=name@entry=0x2b8880c9 "index-23236-808695384733862891", namelen=30, ok_incomplete=ok_incomplete@entry=false, tablep=tablep@entry=0x7f945b47ddf8) at src/third_party/wiredtiger/src/schema/schema_list.c:98
          #5  0x0000000001bb60e3 in __wt_curstat_table_init (session=session@entry=0x3cfd300, uri=uri@entry=0x2b8880c3 "table:index-23236-808695384733862891", cfg=0x22e01ca0, cst=0x2b870000) at src/third_party/wiredtiger/src/schema/schema_stat.c:72
          #6  0x0000000001b6733e in __wt_curstat_init (session=session@entry=0x3cfd300, uri=<optimized out>, uri@entry=0x2b8880b8 "statistics:table:index-23236-808695384733862891", cfg=<optimized out>, cst=<optimized out>) at src/third_party/wiredtiger/src/cursor/cur_stat.c:459
          #7  0x0000000001b67c2e in __wt_curstat_open (session=session@entry=0x3cfd300, uri=uri@entry=0x2b8880b8 "statistics:table:index-23236-808695384733862891", cfg=cfg@entry=0x7f945b47df70, cursorp=cursorp@entry=0x7f945b47df68) at src/third_party/wiredtiger/src/cursor/cur_stat.c:584
          #8  0x0000000001bba035 in __wt_open_cursor (session=session@entry=0x3cfd300, uri=<optimized out>, uri@entry=0x2b8880b8 "statistics:table:index-23236-808695384733862891", owner=owner@entry=0x0, cfg=cfg@entry=0x7f945b47df70, cursorp=cursorp@entry=0x7f945b47df68) at src/third_party/wiredtiger/src/session/session_api.c:296
          #9  0x0000000001bba5c7 in __session_open_cursor (wt_session=0x3cfd300, uri=0x2b8880b8 "statistics:table:index-23236-808695384733862891", to_dup=0x0, config=<optimized out>, cursorp=0x7f945b47e010) at src/third_party/wiredtiger/src/session/session_api.c:359
          #10 0x0000000001213fa6 in mongo::WiredTigerUtil::getStatisticsValue (session=session@entry=0x3cfd300, uri="statistics:table:index-23236-808695384733862891", config="statistics=(size)", statisticsKey=statisticsKey@entry=2009) at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:267
          #11 0x0000000001214c5e in mongo::WiredTigerUtil::getStatisticsValueAs<long> (maximumResultType=9223372036854775807, statisticsKey=2009, config="statistics=(size)", uri="statistics:table:index-23236-808695384733862891", session=0x3cfd300) at src/mongo/db/storage/wiredtiger/wiredtiger_util.h:277
          #12 mongo::WiredTigerUtil::getStatisticsValueAs<long> (statisticsKey=2009, config="statistics=(size)", uri="statistics:table:index-23236-808695384733862891", session=0x3cfd300) at src/mongo/db/storage/wiredtiger/wiredtiger_util.h:267
          #13 mongo::WiredTigerUtil::getIdentSize (s=0x3cfd300, uri="table:index-23236-808695384733862891") at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:299
          #14 0x00000000011f1eb4 in mongo::WiredTigerKVEngine::getIdentSize (this=0x3b48000, opCtx=<optimized out>, ident=...) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:259
          #15 0x00000000011276d7 in mongo::KVDatabaseCatalogEntry::sizeOnDisk (this=0xc921c70, opCtx=0x2bb27cd0) at src/mongo/db/storage/kv/kv_database_catalog_entry.cpp:151
          #16 0x0000000000b675e3 in mongo::CmdListDatabases::run (this=<optimized out>, txn=<optimized out>, dbname=..., jsobj=..., errmsg=..., result=...) at src/mongo/db/commands/list_databases.cpp:107
          #17 0x0000000000bfd220 in mongo::Command::run (this=this@entry=0x21076a0 <mongo::cmdListDatabases>, txn=txn@entry=0x2bb27cd0, request=..., replyBuilder=replyBuilder@entry=0x7f945b47f0f0) at src/mongo/db/dbcommands.cpp:1388
          

          Alexander Gorrod added a comment -

          In the above three traces, all threads want the table list lock, taken via the WT_WITH_TABLE_LOCK call. The thread creating a collection holds the lock; the other two threads are waiting for it. The operation creating an index has no choice but to wait for the table list lock.

          The operation that wants to retrieve the sizes of the files that constitute a table on disk could potentially avoid getting the table list lock. WiredTiger recently made a change to support a "size only" mode for cursors that avoids needing the schema lock when retrieving statistics. We could update the implementation of size only cursors to avoid taking the table lock as well for tables with a simple structure.

          David Hows added a comment -

          Looks much better.

          Ran until the script overran my current ulimit setting (20000 files) due to the number of databases. Only a few sporadic long listDatabases runs remain:

          "took: 0.014449721"
          "took: 0.084979322"
          "took: 0.064145598"
          "took: 0.059052183"
          "took: 0.079239813"
          "took: 0.047060186"
          "took: 0.098715098"
          "took: 0.092227032"
          "took: 1.478325573"
          "took: 0.144919313"
          "took: 0.113473037"
          "took: 0.179162595"
          "took: 0.115907827"
          "took: 0.101242269"
          "took: 0.143895919"
          "took: 7.261129736"
          "took: 0.204391548"
          "took: 0.199370495"
          "took: 0.140885415"
          "took: 0.139796296"
          "took: 0.195629151"
          "took: 0.324222308"
          

          Looks like the long ones are due to a high timeAcquiringMicros on the database lock:

          2015-11-18T15:57:21.901+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:44762 locks:{ Global: { acquireCount: { r: 1604 } }, Database: { acquireCount: { r: 802 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 1424824 } } } protocol:op_query 1470ms
          2015-11-18T16:04:38.178+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:84965 locks:{ Global: { acquireCount: { r: 3004 } }, Database: { acquireCount: { r: 1502 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 7162083 } } } protocol:op_query 7248ms
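          These waits can be pulled straight out of the log. A minimal Ruby sketch, assuming the 3.2-era slow-command log format shown above (the `acquire_wait_seconds` helper name is hypothetical):

          ```ruby
          # Hypothetical helper: extract the read-lock acquisition wait, in seconds,
          # from a mongod slow-command log line of the format shown above.
          def acquire_wait_seconds(log_line)
            micros = log_line[/timeAcquiringMicros: \{ r: (\d+)/, 1]
            micros && micros.to_i / 1_000_000.0
          end

          line = 'locks:{ Global: { acquireCount: { r: 3004 } }, Database: ' \
                 '{ acquireCount: { r: 1502 }, acquireWaitCount: { r: 1 }, ' \
                 'timeAcquiringMicros: { r: 7162083 } } } protocol:op_query 7248ms'
          puts acquire_wait_seconds(line)  # => 7.162083
          ```

          The 7.16 s wait extracted here lines up with the ~7.26 s lap in the script output above.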
          

          Tried to compare with MMAP of the same version, but got a crash in MMAP at low numbers of collections:

          2015-11-18T16:20:52.867+1100 I INDEX    [conn11] allocating new ns file /data/data/db/abcd361.ns, filling with zeroes...
          2015-11-18T16:20:52.870+1100 I STORAGE  [conn11] In File::write(), ::pwrite for '/data/data/db/abcd361.ns' tried to write 1048576 bytes but only wrote 815104 bytes, failing with errno:2 No such file or directory
          2015-11-18T16:20:52.870+1100 I -        [conn11] Assertion: 18826:failure writing file /data/data/db/abcd361.ns
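          On the ulimit overrun: the repro leaves behind every abcd* database it creates, which is what drives the open-file count up. A sketch for finding them so they can be dropped (the `repro_databases` helper is hypothetical; "abcd" is the prefix the script uses):

          ```ruby
          # Hypothetical helper: pick out the databases created by the repro script
          # (it names them "abcd<N>") from a list of database names.
          def repro_databases(names, prefix = "abcd")
            names.select { |n| n.start_with?(prefix) }
          end

          names = ["admin", "local", "abcd0", "abcd1251"]
          p repro_databases(names)  # => ["abcd0", "abcd1251"]
          ```

          Against a live server this would pair with the same driver API as the repro, e.g. `client.database_names` to list names and `client.use(name).database.drop` to drop each one.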
          

          David Hows added a comment -

          I've re-tested on Master with the mentioned patches having been pulled in.

          Results are the same as before: nominally good response times, with one or two spikes caused by contention for the Database lock. I believe we can mark this as closed for 3.2 and work on the backport to 3.0.

          Script results:

          "took: 0.014780024"
          "took: 0.12277441"
          "took: 0.119202708"
          "took: 0.063026424"
          "took: 0.066182342"
          "took: 0.12123077"
          "took: 0.110646549"
          "took: 6.031858265"
          "took: 0.130959283"
          "took: 0.138787366"
          "took: 0.106623675"
          "took: 0.196402118"
          "took: 0.182217924"
          "took: 0.117814905"
          "took: 0.275733352"
          "took: 4.424616586"
          "took: 0.362121995"
          "took: 0.210682331"
          "took: 0.204168197"
          "took: 0.169111846"
          

          Long Commands

          2015-11-19T13:04:14.756+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:39162 locks:{ Global: { acquireCount: { r: 1404 } }, Database: { acquireCount: { r: 702 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 5986833 } } } protocol:op_query 6025ms
          2015-11-19T13:12:39.484+1100 I COMMAND  [conn9] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:84965 locks:{ Global: { acquireCount: { r: 3004 } }, Database: { acquireCount: { r: 1502 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 4331430 } } } protocol:op_query 4413ms
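
          The slow-command log lines above show that nearly the entire command duration was spent in timeAcquiringMicros on the Database lock. As a minimal sketch (this helper is hypothetical, not part of the repro script above), the lock-wait fraction can be pulled out of such a line like this:

          ```ruby
          # Given a slow-command log line like the ones above, extract the total
          # command duration (the trailing "NNNNms") and the time spent waiting on
          # the Database lock (timeAcquiringMicros), and return the wait fraction.
          def lock_wait_fraction(log_line)
            total_ms = log_line[/ (\d+)ms\z/, 1].to_i
            wait_us  = log_line[/timeAcquiringMicros: \{ r: (\d+) \}/, 1].to_i
            wait_us / 1000.0 / total_ms
          end

          line = 'command admin.$cmd command: listDatabases { listDatabases: 1 } ' \
                 'timeAcquiringMicros: { r: 5986833 } } } protocol:op_query 6025ms'
          puts format('%.0f%% of the command was lock wait', lock_wait_fraction(line) * 100)
          # prints: 99% of the command was lock wait
          ```

          For the two long commands in this run, that works out to roughly 99% of the elapsed time, which is consistent with lock contention rather than the listDatabases work itself being the cause of the delays.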
          

          Dan Pasette added a comment -

          Backport to 3.0 complete

          commit 053f1442ac1c0d50dcad23c6528df43a75b2fcb6
          Author: Dan Pasette <dan@mongodb.com>
          Date: Wed Dec 23 18:26:28 2015 -0500

          Import wiredtiger-wiredtiger-mongodb-3.0.7-22-ga1ddc5e.tar.gz from wiredtiger branch mongodb-3.0


            People

            • Votes: 2
            • Watchers: 22