[SERVER-39681] The `storage_mobile_record_store_test` takes long time to run in debug mode Created: 20/Feb/19  Updated: 27/Oct/23  Resolved: 02/Apr/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Backlog - Storage Execution Team
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Storage Execution
Operating System: ALL
Participants:

 Description   

The storage index/record store unit-tests seem to take quite a bit of time, but `storage_mobile_record_store_test` is especially slow, sometimes taking almost 10 minutes on debug builds.

Building the server with this command line:

python buildscripts/scons.py --dbg=on --opt=on --ssl --link-model=dynamic --allocator=system CC=clang CXX=clang++ --icecream MONGO_VERSION=0.0.0 MONGO_GIT_HASH=unknown VARIANT_DIR=ninja build.ninja

Produces these run times, which are the highest of all the unit-tests that execute:

[executor:cpp_unit_test:job9] 2019-02-20T03:14:58.952+0000 storage_wiredtiger_recovery_unit_test ran in 50.17 seconds: no failures detected.
[executor:cpp_unit_test:job11] 2019-02-20T03:14:59.651+0000 storage_wiredtiger_kv_engine_test ran in 84.67 seconds: no failures detected.
[executor:cpp_unit_test:job3] 2019-02-20T03:16:12.909+0000 storage_wiredtiger_record_store_test ran in 138.57 seconds: no failures detected.
[executor:cpp_unit_test:job15] 2019-02-20T03:16:16.903+0000 storage_wiredtiger_prefixed_record_store_test ran in 129.53 seconds: no failures detected.
[executor:cpp_unit_test:job13] 2019-02-20T03:18:19.838+0000 storage_wiredtiger_index_test ran in 236.46 seconds: no failures detected.
[executor:cpp_unit_test:job2] 2019-02-20T03:18:21.346+0000 storage_wiredtiger_prefixed_index_test ran in 236.71 seconds: no failures detected.
[executor:cpp_unit_test:job18] 2019-02-20T03:22:37.859+0000 storage_mobile_record_store_test ran in 492.81 seconds: no failures detected.



 Comments   
Comment by Kaloian Manassiev [ 22/Feb/19 ]

I am able to reproduce it on my Windows box. Majority of the time is spent in this loop, which populates the record store.

The call stack where majority of the time is spent looks like this:

>	storage_mobile_record_store_test.exe!winSync(sqlite3_file * id=0x000002136872add8, int flags=2) Line 43466	C
 	storage_mobile_record_store_test.exe!sqlite3OsSync(sqlite3_file * id=0x000002136872add8, int flags=2) Line 22139	C
 	storage_mobile_record_store_test.exe!syncJournal(Pager * pPager=0x000002136872aba0, int newHdr=0) Line 54695	C
 	storage_mobile_record_store_test.exe!sqlite3PagerCommitPhaseOne(Pager * pPager=0x000002136872aba0, const char * zMaster=0x0000000000000000, int noSync=0) Line 56867	C
 	storage_mobile_record_store_test.exe!sqlite3BtreeCommitPhaseOne(Btree * p=0x000002136871aa40, const char * zMaster=0x0000000000000000) Line 66865	C
 	storage_mobile_record_store_test.exe!vdbeCommit(sqlite3 * db=0x0000021368740a60, Vdbe * p=0x000002136dc95b40) Line 78274	C
 	storage_mobile_record_store_test.exe!sqlite3VdbeHalt(Vdbe * p=0x000002136dc95b40) Line 78675	C
 	storage_mobile_record_store_test.exe!sqlite3VdbeExec(Vdbe * p=0x000002136dc95b40) Line 86154	C
 	storage_mobile_record_store_test.exe!sqlite3Step(Vdbe * p=0x000002136dc95b40) Line 81340	C
 	storage_mobile_record_store_test.exe!sqlite3_step(sqlite3_stmt * pStmt=0x000002136dc95b40) Line 81404	C
 	storage_mobile_record_store_test.exe!sqlite3_exec(sqlite3 * db=0x0000021368740a60, const char * zSql=0x000000d243af2128, int(*)(void *, int, char * *, char * *) xCallback=0x0000000000000000, void * pArg=0x0000000000000000, char * * pzErrMsg=0x000000d243af1dd8) Line 117983	C
 	storage_mobile_record_store_test.exe!mongo::SqliteStatement::execQuery(mongo::MobileSession * session=0x00000213687287c0, const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & query={...}) Line 149	C++
 	storage_mobile_record_store_test.exe!mongo::MobileRecoveryUnit::_txnClose(bool commit=true) Line 213	C++
 	storage_mobile_record_store_test.exe!mongo::MobileRecoveryUnit::_commit() Line 69	C++
 	storage_mobile_record_store_test.exe!mongo::MobileRecoveryUnit::commitUnitOfWork() Line 118	C++
 	storage_mobile_record_store_test.exe!mongo::WriteUnitOfWork::commit() Line 110	C++
 	storage_mobile_record_store_test.exe!mongo::`anonymous namespace'::UnitTest__RecordStoreTestHarness__GetRandomIteratorNonEmpty::_doTest() Line 92	C++
 	storage_mobile_record_store_test.exe!mongo::unittest::Test::run() Line 218	C++

It looks like that winSync calls osFlushFileBuffers and I am wondering whether this is contributing to the long run time?

Comment by Eric Milkie [ 21/Feb/19 ]

I still can't repro the slowness; it also looks like the Evergreen builders do not take this long to run this test. If you could, an output of a profiler would be helpful to determine where your particular machine is having trouble.

Comment by Kaloian Manassiev [ 20/Feb/19 ]

I updated the description with the command line that I used to build the test.

Comment by Eric Milkie [ 20/Feb/19 ]

I ran it in debug mode on my machine and it takes 1.265 seconds. Can you help me reproduce your slowness?

Comment by Eric Milkie [ 20/Feb/19 ]

It runs in 0.692 seconds in release mode on my machine, so there must be something about the debug mode checking that is slowing it down.

Generated at Thu Feb 08 04:52:48 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.