[SERVER-8485] workingSet estimate is inaccurate, stabilizes at incorrect value Created: 08/Feb/13  Updated: 29/May/13  Resolved: 09/Feb/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Michael O'Brien Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Participants:

 Description   

I'm testing QA-175 for the case where the working set size exceeds available RAM, but I can't come up with results that make sense. I'm not sure whether this is a bug or a flaw in my procedure.
On a Linux machine with these memory characteristics:

             total       used       free     shared    buffers     cached
Mem:        435132     355152      79980          0      12440     185944
-/+ buffers/cache:     156768     278364
Swap:       262140      19200     242940

I loaded the Enron data set. Its size is:

"count" : 501513,
"size" : 1527603312,
"avgObjSize" : 3045.9894598943597,
"storageSize" : 1605427200,

I tried to get the db to access the entire data set, about 1.4 GB, which should exceed the RAM size, so that the entire collection is considered the "working set".
To do this I queried for the whole collection repeatedly (using both true and false for snapshot, which should force the _id index to be paged in):

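function query_data(snapshot){
    var counter = 0;
    var x;
    // snapshot() should force the _id index to be paged in.
    if(snapshot){
        x = db.messages.find().snapshot()
    }else{
        x = db.messages.find()
    }

    // Read every document, printing progress every 10000 documents.
    while(x.hasNext()){
        counter++;
        var y = x.next();
        if(counter % 10000 == 0){
            print(counter)
        }
    }
    print(counter, "documents read.")
}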

Afterwards, when I run workingSet, the result is 264134 pagesInMemory, which works out to 1081892864 bytes, or about 1 GB, well under the actual data size (about 1.4 GB).
I can't get pagesInMemory to exceed that value. I tried other queries as well as .count() and db.runCommand({touch:"messages", data:true, index:true}), but none seem to have an effect.
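
The page-to-byte conversion above can be sketched as a small helper. This assumes 4096-byte pages, which is the page size implied by the arithmetic in this report; pagesToBytes and fractionOfData are hypothetical names used only for illustration.

```javascript
// Assumption: 4096-byte pages, as implied by 264134 pages = 1081892864 bytes.
var PAGE_SIZE_BYTES = 4096;

// Convert a pagesInMemory count into bytes.
function pagesToBytes(pagesInMemory) {
    return pagesInMemory * PAGE_SIZE_BYTES;
}

// Fraction of the collection's data size covered by the reported pages.
function fractionOfData(pagesInMemory, dataSizeBytes) {
    return pagesToBytes(pagesInMemory) / dataSizeBytes;
}

// Values taken from this report: pagesInMemory and the collection "size".
var bytes = pagesToBytes(264134);
var fraction = fractionOfData(264134, 1527603312);
```

In the shell, the two inputs would come from db.serverStatus({workingSet:1}).workingSet.pagesInMemory and db.messages.stats().size.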



 Comments   
Comment by Tsz Ming Wong [ 29/May/13 ]

Michael, where did you get the value "22442"?

Comment by Michael O'Brien [ 09/Feb/13 ]

OK, I get it now - taking time into account, estimate looks accurate.

> db.serverStatus({workingSet:1}).workingSet
{
        "note" : "thisIsAnEstimate",
        "pagesInMemory" : 264138,
        "computationTimeMicros" : 57374,
        "overSeconds" : 15843
}
> 15843 / 22442
0.7059531236075216
> db.messages.stats().size * 0.7059531236075216
1078416329.7395954
> 264138 * 4096
1081909248
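
The check in the transcript above can be wrapped in a helper. This is only a sketch of the arithmetic shown, not the server's estimation algorithm: expectedResidentBytes, relativeGap and referenceSeconds are hypothetical names, the 4096-byte page size is assumed from the transcript, and 22442 is passed in exactly as it appears there without assuming where it comes from.

```javascript
// Assumption: 4096-byte pages, matching "264138 * 4096" in the transcript.
var PAGE_SIZE_BYTES = 4096;

// Scale the data size by the ratio overSeconds / referenceSeconds,
// mirroring "db.messages.stats().size * (15843 / 22442)".
function expectedResidentBytes(dataSizeBytes, overSeconds, referenceSeconds) {
    return dataSizeBytes * (overSeconds / referenceSeconds);
}

// Relative difference between the reported pages (in bytes) and the expectation.
function relativeGap(pagesInMemory, expectedBytes) {
    var reportedBytes = pagesInMemory * PAGE_SIZE_BYTES;
    return Math.abs(reportedBytes - expectedBytes) / expectedBytes;
}

// Values copied from the transcript; 22442 is used as given.
var expected = expectedResidentBytes(1527603312, 15843, 22442);
var gap = relativeGap(264138, expected);
```

With the transcript's numbers, the reported and expected byte counts differ by well under 1%, which is the agreement the comment points to.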

Comment by Eliot Horowitz (Inactive) [ 09/Feb/13 ]

I think this is correct.
If the working set doesn't fit in RAM, the time part of the result will go down.
Can you send the full serverStatus() output?
