[SERVER-12984] Real document size in collection Created: 01/Mar/14  Updated: 18/Sep/21  Resolved: 18/Mar/14

Status: Closed
Project: Core Server
Component/s: Diagnostics, Shell, Tools
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Jan Botorek Assignee: Stennie Steneker (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Hello, I am not able to find out the real size of a document stored in a MongoDB collection. From searching, I found that the document size may be obtained in two ways:
Object.bsonsize - a JavaScript method that should return the size in bytes
db.collection.stats() - the output includes an 'avgObjSize' field that gives an aggregated (average) view of the data. It simply represents the average size of a single document.

I have a really simple document with this structure:
{
test: "test",
ids: [id1, id2, id3..... id500000]
}
Each id consists of 10 characters.

By a simple computation, the overall size should be more than 5 MB.
But when I run the 'Object.bsonsize' command, it returns: 499.
The stats command returns 'size' = 10747888.

I am really confused. Is there any way to reliably find out the size of a particular document? Are the steps I have performed so far completely wrong?

Thank you for your support, any help will be appreciated.



 Comments   
Comment by Stennie Steneker (Inactive) [ 18/Sep/21 ]

Hi saaitha@cisco.com,

Per earlier comments on this issue you can use something like Object.bsonsize() in the MongoDB shell or the equivalent in your MongoDB driver.
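
For sizing a document on the client before inserting it, the BSON layout can also be added up by hand. The following is only a rough sketch in plain JavaScript, assuming a document whose fields are strings or arrays of strings (the shape discussed in this issue). The names utf8Length, stringElementSize, stringArraySize, estimateBsonSize and MAX_BSON_SIZE are hypothetical helpers, not part of the shell or any driver. It also does not count the _id field (17 bytes for an ObjectId) that is added on insert when the document has none.

```javascript
// Sketch only: handles top-level string fields and arrays of strings.
// All names below are hypothetical helpers, not a MongoDB API.
const utf8Length = (s) => new TextEncoder().encode(s).length;

// String element: type byte + key as C string + int32 length + bytes + trailing NUL.
function stringElementSize(key, value) {
  return 1 + (utf8Length(key) + 1) + 4 + (utf8Length(value) + 1);
}

// BSON encodes an array as an embedded document keyed "0", "1", "2", ...
function stringArraySize(values) {
  let size = 4 + 1; // int32 length prefix + terminating NUL
  values.forEach((value, index) => {
    size += stringElementSize(String(index), value);
  });
  return size;
}

function estimateBsonSize(doc) {
  let size = 4 + 1; // int32 length prefix + terminating NUL
  for (const [key, value] of Object.entries(doc)) {
    if (typeof value === "string") {
      size += stringElementSize(key, value);
    } else if (Array.isArray(value) && value.every((v) => typeof v === "string")) {
      // Array element: type byte + key as C string + embedded array document.
      size += 1 + (utf8Length(key) + 1) + stringArraySize(value);
    } else {
      throw new TypeError("unsupported field type in this sketch: " + key);
    }
  }
  return size;
}

const MAX_BSON_SIZE = 16 * 1024 * 1024; // 16 MB document size limit

console.log(estimateBsonSize({ test: "test" })); // 20
console.log(estimateBsonSize({ test: "test", ids: ["1111111111"] }) <= MAX_BSON_SIZE); // true
```

Note that the array index keys ("0", "1", ...) are stored with every element, which is why a large array of 10-character strings takes far more than 10 bytes per entry.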

However, if your documents are likely to approach the maximum document size limit you may be using a schema design anti-pattern like Massive Arrays.

I highly recommend reviewing the articles in these two schema design series:

There is also a free online course at MongoDB University: M320: Data Modeling.

Lastly: please note that the SERVER project is for reporting bugs and potential improvements for the MongoDB server.

For general discussion please start a new topic in the MongoDB Developer Community Forums.

Regards,
Stennie

Comment by Santosh Kumar Aitha [ 14/Sep/21 ]

Hi Stennie,

What is the best way to find the document size even before inserting into MongoDB, to prevent a "document too large" error?

Comment by Jan Botorek [ 03/Mar/14 ]

Hello, thank you a lot for your help! Yes, now the result seems good:

> Object.bsonsize(db.test.findOne({test:"test"}))
10492282

It did not occur to me that I was measuring a cursor instead of the actual document object.
Best regards

Comment by Stennie Steneker (Inactive) [ 03/Mar/14 ]

Hi Jan,

In your bsonsize() example you are measuring the size of a find(), which returns a cursor rather than a document.

To find the size of a single document you should instead be using a findOne():

 Object.bsonsize(db.test.findOne( {test:"test"}))

Can you confirm this shows the expected document size?

Thanks,
Stephen

Comment by Jan Botorek [ 03/Mar/14 ]

Hello,
I will try to demonstrate my steps performed:

1) Create document:

{ test:"test", ids:[ "1111111111", "2222222222", "3333333333", ... ] }

There are 990000 randomly generated strings, each representing a virtual ID. Every ID is composed of 10 characters.

2) Insert the document into the database:
db.test.insert(<document created in step 1>)

  • collection IS EMPTY before inserting this document

3.a) db.test.stats():
{
"ns" : "home.test",
"count" : 1,
"size" : 10747888,
"avgObjSize" : 10747888,
"storageSize" : 167882752,
"numExtents" : 2,
"nindexes" : 1,
"lastExtentSize" : 167878656,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 8176,
"indexSizes" : { "_id_" : 8176 },
"ok" : 1
}

  • According to the "size" value, the single document is about 10.7 MB, which I believe may be its real size. Unfortunately, when there are thousands of documents in the collection, this method cannot be used to find the size of a particular document.

3.b) Object.bsonsize(db.test.find({test:"test"}))
returns: 460
I found this command by searching. According to the comments and documentation I was able to find, it should return the size of a single document in bytes. As you can see, that is clearly not the case here: there is a huge difference between these two values.
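
To illustrate the limitation noted in step 3.a, here is a small sketch in plain JavaScript. It assumes that 'avgObjSize' is simply 'size' divided by 'count'. The helper name summarizeSizes and the second document size (112) are made up for illustration; only 10747888 comes from the stats output.

```javascript
// Sketch: stats-style totals expose the sum and the average, not per-document sizes.
// summarizeSizes is a hypothetical helper, not a MongoDB API.
function summarizeSizes(sizesInBytes) {
  const count = sizesInBytes.length;
  const size = sizesInBytes.reduce((sum, s) => sum + s, 0);
  // Assumption: the average is the total size divided by the document count.
  const avgObjSize = count === 0 ? 0 : Math.floor(size / count);
  return { count, size, avgObjSize };
}

// With a single document, the average equals that document's size.
console.log(summarizeSizes([10747888]).avgObjSize); // 10747888

// Add one small (made-up) document and the average describes neither of them.
console.log(summarizeSizes([10747888, 112]).avgObjSize); // 5374000
```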

Is there any other way to find the real size of stored documents?

Thank you for your support

Comment by Daniel Pasette (Inactive) [ 03/Mar/14 ]

Can you describe the exact commands you are running, and also run the collection stats command?
