[SERVER-13712] Reduce peak disk usage of test suites Created: 24/Apr/14  Updated: 11/Jul/16  Resolved: 08/May/14

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 2.6.2, 2.7.1

Type: Task Priority: Major - P3
Reporter: Matt Kangas Assignee: Matt Kangas
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: dropdb.diff, patch_535f23ed3ff12251b2000002_dbroot_mb.txt, s_pass.diff
Issue Links:
Depends
Related
Backwards Compatibility: Fully Compatible
Backport Completed:
Sprint: Server 2.7.1
Participants:
Linked BF Score: 0

 Description   

We need to ensure that our test suites never consume more than approximately 10 GB of /data on our MCI buildvariants.

Currently, MCI tasks take up to 33 GB of /data space. This prevents us from using ephemeral storage on some EC2 configurations.

Biggest offenders seen so far (see MCI-1449):

Linux 64/Linux 64 debug:

  • max 33G: qa_repo_tests
  • max 21G: slow2
  • max 17G: noPassthroughWithMongod
  • max 16G: aggregation
  • max 14G: sharding
  • max 12G: durability
  • max 12G: sharding

Windows

  • max 33G: qa_repo_tests
  • max 32G: replicasets (Win 32 only)
  • max 31G: sharding (Win 32 only)
  • max 28G: slow2
  • max 27G: replication (Win 32 only)
  • max 26G: tool (Win 32 only)
  • max 26G: auth (Win 32)

The obvious offenders in sharding/replicasets/slow2 are suites that do not clean up test databases periodically. A simple fix may be to drop all databases every N tests, like smoke.py does.
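
As an illustration only, the "clean up every N tests" idea could be sketched as below. This is not smoke.py's actual code: the function name run_with_periodic_cleanup and the run_test/cleanup callbacks are hypothetical stand-ins for whatever the runner uses to execute a test and to drop databases.

```python
from typing import Callable, Iterable, List


def run_with_periodic_cleanup(
    tests: Iterable[str],
    run_test: Callable[[str], None],
    cleanup: Callable[[], None],
    clean_every: int,
) -> List[int]:
    """Run tests in order, calling cleanup() after every `clean_every` tests.

    Returns the 1-based test counts at which cleanup ran.
    All names here are hypothetical; this is a sketch, not smoke.py.
    """
    if clean_every < 1:
        raise ValueError("clean_every must be >= 1")
    cleaned_at = []
    for count, test in enumerate(tests, start=1):
        run_test(test)
        # Bound peak disk usage by cleaning after each batch of N tests.
        if count % clean_every == 0:
            cleanup()
            cleaned_at.append(count)
    return cleaned_at
```

A smaller N lowers peak disk usage at the cost of more time spent cleaning, which is the trade-off a per-suite setting would have to balance.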

Related: MCI-1276, MCI-1449



 Comments   
Comment by Githook User [ 08/May/14 ]

Author: Matt Kangas (kangas) <matt.kangas@mongodb.com>

Message: SERVER-13712 fix fedora8 builders (python 2.5.1)

(cherry picked from commit 630944421eebbbdbde600b6294e28ec992fc3fba)
Branch: v2.6
https://github.com/mongodb/mongo/commit/2ff4d735d3b9cc430c4a3af43afe4264a36bed3f

Comment by Githook User [ 08/May/14 ]

Author: Matt Kangas (kangas) <matt.kangas@mongodb.com>

Message: SERVER-13712 cleanbb should clean entire /data/db dir

Invoke cleanbb via function call, not subprocess.

clean_dbroot() now cleans entire /data/db if --with-cleanbb specified,
including periodic cleanups.

(cherry picked from commit ab47b0b217ab40971a928bbe3d98bd315bbba716)
Branch: v2.6
https://github.com/mongodb/mongo/commit/71534242a3dbf802bfb3728c1d0f74848509cca6
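
For readers unfamiliar with the change, "clean the entire data directory" can be pictured roughly as follows. This is a hedged sketch and not the committed clean_dbroot() implementation: the helper name clean_dir_contents and its behavior (delete every entry, keep the directory itself) are assumptions made for illustration.

```python
import os
import shutil


def clean_dir_contents(root: str) -> int:
    """Delete everything inside `root` but keep `root` itself.

    Hypothetical helper for illustration; returns the number of
    top-level entries removed.
    """
    removed = 0
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if os.path.isdir(path) and not os.path.islink(path):
            # Real subdirectory: remove it recursively.
            shutil.rmtree(path)
        else:
            # Regular file or symlink: unlink it without following it.
            os.remove(path)
        removed += 1
    return removed
```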

Comment by Githook User [ 08/May/14 ]

Author: Matt Kangas (kangas) <matt.kangas@mongodb.com>

Message: SERVER-13712 smoke.py --clean-every=N option

(cherry picked from commit 8cd645a8fa0cce6313c4a15e4f3ec4fd8ac2787e)
Branch: v2.6
https://github.com/mongodb/mongo/commit/b267b2824ea6043cd27c14cfef6ff8af6f93b861

Comment by Githook User [ 08/May/14 ]

Author: Matt Kangas (kangas) <matt.kangas@mongodb.com>

Message: SERVER-13712 fix fedora8 builders (python 2.5.1)
Branch: master
https://github.com/mongodb/mongo/commit/630944421eebbbdbde600b6294e28ec992fc3fba

Comment by Githook User [ 08/May/14 ]

Author: Matt Kangas (kangas) <matt.kangas@mongodb.com>

Message: SERVER-13712 cleanbb should clean entire /data/db dir

Invoke cleanbb via function call, not subprocess.

clean_dbroot() now cleans entire /data/db if --with-cleanbb specified,
including periodic cleanups.
Branch: master
https://github.com/mongodb/mongo/commit/ab47b0b217ab40971a928bbe3d98bd315bbba716

Comment by Githook User [ 08/May/14 ]

Author: Matt Kangas (kangas) <matt.kangas@mongodb.com>

Message: SERVER-13712 smoke.py --clean-every=N option
Branch: master
https://github.com/mongodb/mongo/commit/8cd645a8fa0cce6313c4a15e4f3ec4fd8ac2787e

Comment by Matt Kangas [ 30/Apr/14 ]

See attached file for data from my hacked patch build. Format is:

<growth_in_mb>   <suite>:<buildvariant>

Suites on linux-64 that consumed more than a gigabyte after completion (values in MB):

14363	slow2:linux-64
12312	qa_repo_tests:linux-64
9488	jsCore_small_oplog:linux-64
8586	noPassthroughWithMongod:linux-64
7762	replicasets:linux-64
7036	durability:linux-64
6694	replicasets_auth:linux-64
5277	sharding:linux-64
4649	replication_auth:linux-64
4646	replication:linux-64
4272	parallel_compatibility:linux-64
4270	parallel:linux-64
4155	qa_repo_multiversion_tests:linux-64
3523	multiversion:linux-64
3202	sharding_auth:linux-64
2608	noPassthrough:linux-64
2591	auth:linux-64
1677	disk:linux-64
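
A small sketch of how lines in the format above could be parsed and filtered. It assumes the size and the suite:buildvariant pair are separated by whitespace, and the names Growth, parse_growth_line and over_threshold are hypothetical; this is not the script used to produce the attachment.

```python
from typing import Iterable, List, NamedTuple


class Growth(NamedTuple):
    growth_mb: int
    suite: str
    buildvariant: str


def parse_growth_line(line: str) -> Growth:
    """Parse one '<growth_in_mb> <suite>:<buildvariant>' line."""
    size, target = line.split(None, 1)
    # Split on the last colon so the buildvariant is always the final field.
    suite, buildvariant = target.strip().rsplit(":", 1)
    return Growth(int(size), suite, buildvariant)


def over_threshold(
    lines: Iterable[str], buildvariant: str, min_mb: int = 1024
) -> List[Growth]:
    """Return rows for one buildvariant above min_mb, largest first."""
    rows = [parse_growth_line(line) for line in lines if line.strip()]
    hits = [
        r for r in rows
        if r.buildvariant == buildvariant and r.growth_mb > min_mb
    ]
    return sorted(hits, key=lambda r: r.growth_mb, reverse=True)
```

For example, over_threshold(open(path), "linux-64") would yield the same kind of list as the one above, where path is wherever the attached text file was saved.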

Comment by Randolph Tan [ 25/Apr/14 ]

Can you try dropdb.diff? This was actually the first approach I tried, but dropDatabase deletes the datafiles, so it affected the total runtime of the test. If I remember correctly, this patch adds about 30 seconds.

Comment by Randolph Tan [ 24/Apr/14 ]

mpobrien, can you try applying the patch? It passes on my machine locally and runs in almost the same time as a normal test run.

Comment by Michael O'Brien [ 24/Apr/14 ]

renctan - if you want to test this with a patch, we can also extract disk usage info from your patch's system logs to assess how much of an effect it has had.

Generated at Thu Feb 08 03:32:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.