[SERVER-989] High heap usage with lots of collections Created: 08/Apr/10  Updated: 21/Oct/11  Resolved: 21/Oct/11

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 1.4.0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Michael Schurter
Assignee: Eliot Horowitz (Inactive)
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 9.10 64bit


Participants:

 Description   

I ran into very high RSS memory usage (mstearn thought the heap stat looked quite high) when testing mongod with a large number of collections (80k, each with only the automatic _id index). Dropping some collections (~100) temporarily reduced heap usage, but it rose higher than before once the dropping script stopped.

Steps to reproduce:

Start mongod with:

mongod --dbpath=tmpmongo --port=27018 -v --nssize=2047

Run the following code (or similar in the language of your choice):

import pymongo
conn = pymongo.Connection(port=27018)
db = conn.testdb

for x in range(1000000): # I never let it run past ~88k
    foo = db["col%d" % x].insert({'t':0}, safe=True)
    if x % 100 == 0: print "%6d: %r" % (x, foo) # Optional obviously

# EOF

Essentially the code inserts a single document containing {t:0} into a series of new collections.

I used the following code to remove some collections:

# Approximately 100 collections before where I killed it.
for x in range(87635, 1000000):
    db.drop_collection(db["col%d" % x])
# EOF
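For anyone re-running this on a newer driver, here is a minimal sketch of the same create and drop loops. It is not the script I ran: it assumes a pymongo version where `insert_one` and `MongoClient` replace the older `insert(..., safe=True)` and `Connection`, and the helper names (`collection_names`, `create_collections`, `drop_collections`) are made up for illustration.

```python
def collection_names(start, stop, prefix="col"):
    """Yield the names used by the scripts above: col<start> .. col<stop-1>."""
    for x in range(start, stop):
        yield "%s%d" % (prefix, x)


def create_collections(db, start, stop):
    """Insert one {'t': 0} document into each new collection.

    `db` is assumed to behave like a pymongo Database: db[name] returns a
    collection object with an insert_one() method.
    Returns the number of collections written to.
    """
    created = 0
    for name in collection_names(start, stop):
        db[name].insert_one({"t": 0})
        created += 1
    return created


def drop_collections(db, start, stop):
    """Drop every collection in the given range.

    Dropping a collection that does not exist is assumed to be harmless,
    as it is with pymongo's Database.drop_collection().
    """
    for name in collection_names(start, stop):
        db.drop_collection(name)
```

Assuming a server started as above, usage would look something like `db = pymongo.MongoClient(port=27018).testdb`, then `create_collections(db, 0, 88000)` followed by `drop_collections(db, 87635, 88000)`.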

End stats:
namespaces: 175273
heap_usage_bytes: 1701860208

Stats at peak number of collections:
namespaces: 175509
heap_usage_bytes: 1702426528

Heap directly after removal script ended:
1703393952
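To put those numbers in perspective, here is a rough back-of-the-envelope calculation using only the figures above. It assumes the whole heap is attributable to namespaces, which is almost certainly an overestimate, so treat the per-namespace figure as an upper bound rather than a measurement. The variable names are only for illustration.

```python
# Figures copied from the stats above.
peak_namespaces = 175509
peak_heap_bytes = 1702426528
end_namespaces = 175273
end_heap_bytes = 1701860208
heap_after_removal_bytes = 1703393952

# Upper bound: pretend every heap byte belongs to a namespace.
bytes_per_namespace = peak_heap_bytes / float(peak_namespaces)

# How many namespaces the drop script removed, and how the heap moved.
dropped_namespaces = peak_namespaces - end_namespaces
heap_drop_bytes = peak_heap_bytes - end_heap_bytes
heap_rise_after_removal = heap_after_removal_bytes - peak_heap_bytes

print("bytes per namespace (upper bound): %.0f" % bytes_per_namespace)
print("namespaces dropped: %d" % dropped_namespaces)
print("heap change at end stats: -%d bytes" % heap_drop_bytes)
print("heap right after removal vs peak: +%d bytes" % heap_rise_after_removal)
```

That works out to roughly 9,700 bytes of heap per namespace at peak. The 236 dropped namespaces would correspond to 118 collections if each collection accounts for two namespaces (the collection plus its _id index), which is in line with the ~100 collections mentioned above.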

I did do a few other actions on the server (creating another test db & collection), but hopefully nothing that would drastically affect the statistics.

If this is normal, could the Lots of Collections wiki page simply be updated with some indication as to how much RSS memory namespaces will use?

I'm trying to evaluate using lots of collections vs. a large collection with a heavily used index.

