Details
- Type: Improvement
- Resolution: Done
- Priority: Major - P3
- Component: Storage Execution
- Backwards Compatibility: Fully Compatible
Description
See the attached image. An identical workload is run against two replica sets (test and control), each with two nodes and an arbiter. All hosts are configured with a 1 GB WiredTiger cache. The workload starts by inserting one million documents in batches of 100, each document containing a 900-byte random string. When the inserts complete (indicated by the blue vertical line in the image), the secondary of the test set is killed, which prevents the primary from advancing its commit point or deleting old snapshots; the primary keeps creating new snapshots until it hits the limit of 1000 uncommitted snapshots. After the secondary is killed, the workload switches to updating documents for 20 minutes, in batches of 1000 sequential documents.
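The attached workload.js is not reproduced here, but the batching it describes can be sketched in Python. This is a rough illustration under my own assumptions (the helper names `random_payload`, `insert_batches`, and `update_ranges` are mine, and the actual script may differ in detail):

```python
import random
import string

DOC_COUNT = 1_000_000   # one million documents
INSERT_BATCH = 100      # inserts are batched 100 at a time
UPDATE_BATCH = 1000     # updates touch 1000 sequential documents per batch
PAYLOAD_BYTES = 900     # each document carries a 900-byte random string


def random_payload(nbytes=PAYLOAD_BYTES):
    # Build a random ASCII string of the given length.
    return ''.join(random.choice(string.ascii_letters) for _ in range(nbytes))


def insert_batches(total=DOC_COUNT, batch=INSERT_BATCH):
    # Yield lists of documents sized for one batched insert each.
    for start in range(0, total, batch):
        end = min(start + batch, total)
        yield [{'_id': i, 'payload': random_payload()} for i in range(start, end)]


def update_ranges(total=DOC_COUNT, batch=UPDATE_BATCH):
    # Yield (lo, hi) _id ranges covering sequential documents for one
    # batched update each.
    for start in range(0, total, batch):
        yield (start, min(start + batch, total))
```

With a pymongo collection handle, each yielded list would go to something like `coll.insert_many(batch)`, and each `(lo, hi)` range to an update over `{'_id': {'$gte': lo, '$lt': hi}}`; again, the real workload.js may drive the server differently.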
The test set appears to use an unbounded amount of disk space and suffers extreme pauses. During some, but not all, of these pauses the system appears to be completely idle, with barely any CPU or disk utilization.
To confirm that the problem was not caused by there being 1000 snapshots, I limited the server to keeping 3 snapshots in total by setting the uncommitted snapshot limit to 2 at https://github.com/mongodb/mongo/blob/r3.3.3/src/mongo/db/repl/oplog.cpp#L1100. This did not make a noticeable difference.
Also, moving the testSet.stop() line above beginState('insert') makes the snapshots be taken against an empty collection, so that all inserts happen after the snapshots. Even in this case, disk usage appears to be unbounded.
Repro:
- Download the .js and .py files to a directory that contains a mongod binary
- If needed, install the Python 2 libraries pymongo and matplotlib
- Launch a mongod on the default port (27017) for reporting and IPC
- Run mongo workload.js (this will launch the replica sets, start monitor.py, and run the workload)
- Once the workload starts, run python plot.py (the plot updates as new data is collected)
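The internals of monitor.py are not shown in this ticket. As a minimal sketch of how such a monitor could sample the disk usage of each node's dbpath over time (my own assumptions, not the actual script):

```python
import os
import time


def dir_size_bytes(path):
    # Total size of all regular files under path (e.g. a mongod dbpath).
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # a file may vanish between listing and stat
    return total


def sample_disk_usage(path, interval_s=1.0, samples=None):
    # Yield (timestamp, bytes) pairs; runs forever when samples is None.
    n = 0
    while samples is None or n < samples:
        yield (time.time(), dir_size_bytes(path))
        n += 1
        if samples is None or n < samples:
            time.sleep(interval_s)
```

A series collected this way, one per dbpath, is the kind of data plot.py could graph to show the test set's disk usage growing without bound relative to the control set.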