[SERVER-4301] Why does Mongo become slower when RAM is about used up Created: 17/Nov/11  Updated: 15/Aug/12  Resolved: 04/Mar/12

Status: Closed
Project: Core Server
Component/s: Internal Client, Performance, Testing Infrastructure
Affects Version/s: 2.0.0
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: wei lu Assignee: Tad Marshall
Resolution: Duplicate Votes: 0
Labels: Windows, insert
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

4 shards on Windows Server 2003; each shard is a single node (no replica sets). C++ driver: aposto/mongodb-cxx-windows-driver (https://github.com/aposto/mongodb-cxx-windows-driver/contributors), compiled with VS2005.


Attachments: Microsoft Word MongoDB Cluster Performance.docx     PDF File MongoDB Cluster Performance.pdf     Text File MongoDBThread.cpp     File MongoTestData.dat     Zip Archive cp.zip     JPEG File empty.JPG
Issue Links:
Duplicate
duplicates SERVER-5194 Windows version of mongod should mana... Closed
Participants:

 Description   

When I continuously insert/select/update a collection, with one thread for each operation, overall performance (op count) drops when one shard's RAM is nearly used up.
The _id is incremented from 0 for inserts, and the select/update operations target random _id values. I know the performance is bound to drop eventually, but I want to understand the reasons.
1. Since MongoDB flushes data from memory to disk once a minute, even when RAM is nearly used up only the dirty data in memory needs to be flushed, right? Call that dirty data the "new data". Since the OS's memory management is used, its LRU algorithm will free up memory for the new data, because the majority of the used memory has not been touched recently and can simply be discarded without losing any information. So no additional swapping should be caused by inserts when RAM is used up, and I would expect insert performance NOT to drop, yet it did. Why? Is it the cost of memory allocation (an internal part of insert), the disk flush, or something else?
2. I can understand that more disk reads may be needed when memory is near its limit: the working set is larger than memory, so some data has to be loaded from disk from time to time. As the ratio of working set size to RAM size grows, the disk read rate has to grow as well, since more frequent swapping is needed. Is that right?
3. On the other hand, MongoDB's non-optimized global lock will block reads and other writes while one write is slow or blocked. So I think that when RAM is almost used up, the extra global lock contention caused by slow inserts will further degrade overall read and write performance, right? By the way, will a read lock block write operations?

I also attach a test document here; any help would be much appreciated. You can jump straight to the "Observe the cluster" section for information directly related to my questions.
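
For reference, the three test threads look roughly like the simplified sketch below. The namespace, payload, host and thread spawning are placeholders, and I assume the standard legacy C++ driver API (DBClientConnection, BSON, QUERY) exposed by the aposto fork; see the attached MongoDBThread.cpp for the real code.

// Simplified sketch of the test workload: one thread inserts with an increasing
// _id, the other two select/update documents with a random existing _id.
#include <cstdlib>
#include <memory>
#include "mongo/client/dbclient.h"

using namespace mongo;

void insertLoop(DBClientConnection& c) {
    for (long long id = 0; ; ++id) {                 // _id increases from 0
        c.insert("test.perf", BSON("_id" << id << "payload" << "...same document every time..."));
    }
}

void selectLoop(DBClientConnection& c, long long maxId) {
    for (;;) {
        long long id = std::rand() % maxId;          // random _id
        std::auto_ptr<DBClientCursor> cur = c.query("test.perf", QUERY("_id" << id));
        while (cur->more()) cur->next();
    }
}

void updateLoop(DBClientConnection& c, long long maxId) {
    for (;;) {
        long long id = std::rand() % maxId;          // random _id
        c.update("test.perf", QUERY("_id" << id),
                 BSON("$set" << BSON("extra" << "added by update")));  // adds another field
    }
}

int main() {
    DBClientConnection c;
    c.connect("localhost:27017");                    // mongos address (placeholder)
    // In the real test each loop runs on its own thread with its own connection;
    // thread spawning is omitted from this sketch.
    insertLoop(c);
    return 0;
}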



 Comments   
Comment by Tad Marshall [ 04/Mar/12 ]

Consolidating closely related tickets into one.

Comment by Tad Marshall [ 22/Dec/11 ]

I tested running xperf as described above and at first glance I can't see that it reveals anything. You will get more useful results if you limit the time interval as much as possible and don't run anything else on the system while xperf is collecting data. In 45 minutes of logging, I created a 4.5 GB trace file but because I was using my system while the test was running, it shows all my activity, not just mongod.exe. I'm not sure if xperf will be helpful in diagnosing the memory issues.

Comment by wei lu [ 05/Dec/11 ]

I am just back from a short vacation; sorry for the late response.
That's right, I am using Windows Server 2003 as the server. Are there any special considerations I need to be aware of for Server 2003, since it was specifically called out in the question?

Comment by Tad Marshall [ 01/Dec/11 ]

I have started talking to Microsoft about these issues (memory usage, degraded performance), and they had some suggestions for the investigation:

Can you please confirm what version of server your customers are using? Is this on Windows Server 2003?

Below is the information on how to collect tracing data; this should help investigate the issue:

1) Install Xperf on the server with the issue. (it comes with the Windows Performance Toolkit from the Windows SDK)

2) On a command prompt, launch the following command
a. Xperf.exe -on base+latency+fileio -stackwalk profile+filewrite+fileread+fileflush

3) Reproduce the issue you’re facing

4) Run the following command:
a. Xperf.exe -d mytrace.etl

5) View the trace: xperf.exe mytrace.etl

I haven't tried this yet, but I will and will report what I learn. You could try it on your machines and see if it tells you something useful, thanks!
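
If it helps to script the capture, a wrapper along the lines of the rough sketch below could run the repro between trace start and stop. It assumes an elevated prompt with xperf.exe on the PATH, and the repro command is just a placeholder.

// Rough sketch: start the trace, reproduce the issue, then stop/merge the trace.
#include <cstdlib>

int main() {
    std::system("xperf.exe -on base+latency+fileio -stackwalk profile+filewrite+fileread+fileflush");
    std::system("reproduce_the_issue.exe");      // placeholder for the repro step
    std::system("xperf.exe -d mytrace.etl");     // stop tracing and merge into mytrace.etl
    std::system("xperf.exe mytrace.etl");        // open the trace in the viewer
    return 0;
}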

Comment by wei lu [ 30/Nov/11 ]

The "used up" means that res memory size is about to reach the physical RAM size, and page swap does indeed occurs. The thing is, then more pages swapped in this scenario, the whole performance dropped, as I mentioned in the document.
I did a test by calling EmptyWorkingSet function when the res memory reaches 6G. A chart illustrating performance is attached here.

Comment by Tad Marshall [ 30/Nov/11 ]

Thanks for your note. You may be deeper into the relevant code than I am at this point. But it makes little sense that a call to SetProcessWorkingSetSize() should be able to free memory that we have "locked" somehow, and if we haven't "locked" it (I'm using the word "locked" casually, there may not be any explicit lock) then I don't understand why it isn't just "taken" by Windows when it needs physical RAM. Memory should not be "used up" when it can simply be paged out to a file.

You may be right that we need to explicitly control the size of the views that we create. A theory we have had is that the OS knows how to manage memory and we should let it do its job, but we may need to revisit that logic. I'd like to dig a little deeper (maybe catch up with you!) before we start to change code. If you have specific bits of code that you'd like to direct me to, that would be great.

I'm halfway into debugging a different problem with memory mapped files on Windows and will be distracted by that for a day or two more, but then I'd like to get to the bottom of the out-of-memory issues that you and others have seen on Windows. Both performance and stability are not what they should be under some workloads, and we can and should fix it.
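
For reference, the call we are talking about is roughly the minimal sketch below; passing (SIZE_T)-1 for both limits is the documented way to ask Windows to remove as many pages as possible from a working set, which is also what EmptyWorkingSet() does.

// Minimal sketch of trimming a process working set. The handle needs the
// PROCESS_SET_QUOTA and PROCESS_QUERY_INFORMATION access rights.
#include <windows.h>

bool trimWorkingSet(HANDLE process) {
    // The trimmed pages are not freed; they move to the standby/modified lists
    // and can be soft-faulted back in or written to their backing store.
    return SetProcessWorkingSetSize(process, (SIZE_T)-1, (SIZE_T)-1) != 0;
}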

Comment by wei lu [ 30/Nov/11 ]

Thank you for your response. Actually, I have been reading the source code of "db_10.sln" recently, especially the code relevant to memory-mapped files. I found that views of a file are mapped into the process's address space when we insert or select a document, but the memory is not unmapped until the process shuts down, so memory gets used up as more and more documents are touched. I think that if memory pages which have not been touched for a long time were unmapped, memory usage might be better managed, but I don't know whether that is possible...
What I currently do is simply empty the working set of the "mongod" process when its resident memory reaches a threshold (the EmptyWorkingSet function is called from another process).
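
The watchdog process is roughly like the sketch below; the threshold, polling interval and pid lookup are placeholders.

// Rough sketch of the external watchdog: poll mongod's working set size and
// call EmptyWorkingSet() once it crosses a threshold (e.g. 6 GB in my test).
#include <windows.h>
#include <psapi.h>   // link with Psapi.lib
#include <cstdio>

void watchAndTrim(DWORD mongodPid, SIZE_T thresholdBytes) {
    HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ | PROCESS_SET_QUOTA,
                           FALSE, mongodPid);
    if (!h) return;
    for (;;) {
        PROCESS_MEMORY_COUNTERS pmc = { sizeof(pmc) };
        if (GetProcessMemoryInfo(h, &pmc, sizeof(pmc)) && pmc.WorkingSetSize >= thresholdBytes) {
            EmptyWorkingSet(h);   // drop as many pages as possible from mongod's working set
            std::printf("trimmed working set at %Iu bytes\n", pmc.WorkingSetSize);
        }
        Sleep(10 * 1000);         // poll every 10 seconds
    }
}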

Comment by Tad Marshall [ 29/Nov/11 ]

Sorry for the delay in responding. I have an ongoing project to get to the bottom of multiple issues with memory usage on Windows. Unless I can determine something that we are not doing right, we may need to follow up with Microsoft to determine what is wrong.

The short story is that our use of memory mapped files should not present Windows with the kind of memory demand that we see.

When a normal process uses memory, it just allocates it from a pool using an API that eventually turns into a call to HeapAlloc(), or it reserves private memory for itself using VirtualAlloc() and then "commits" it by writing into it. In both of these cases, to reclaim the physical memory, Windows must page out that data to the page file.

This is not the case for MongoDB's memory mapped files. When Windows needs physical RAM for any process, it can page our memory mapped file out to the file itself and free the memory that way. Memory mapped files do not consume page file space: they act as their own page files.
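
To make the contrast concrete, here is a minimal sketch (the path and size are placeholders): the private allocation in the first case can only be reclaimed by writing its dirty pages to the page file, while the mapped view in the second case is backed by the data file itself.

#include <windows.h>

void contrast() {
    // Case 1: private committed memory -- reclaiming these pages under memory
    // pressure means writing the dirty data out to the page file.
    void* priv = VirtualAlloc(NULL, 64 * 1024 * 1024, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

    // Case 2: a view of a memory-mapped data file -- the file itself is the
    // backing store, so its physical pages can be reclaimed without the page file.
    HANDLE file = CreateFileA("C:\\data\\db\\test.0",             // placeholder path
                              GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file != INVALID_HANDLE_VALUE) {
        HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READWRITE, 0, 0, NULL);
        void* view = mapping ? MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, 0) : NULL;
        // ... read and write through 'view' just like ordinary memory ...
        if (view) UnmapViewOfFile(view);
        if (mapping) CloseHandle(mapping);
        CloseHandle(file);
    }
    if (priv) VirtualFree(priv, 0, MEM_RELEASE);
}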

This doesn't work the way it should under load. Somehow, Windows ends up consuming more memory, leaving less for other processes, and even leaving less for MongoDB itself. Under extreme load, "free" memory dwindles until everything becomes dog slow and things start to fail.

This isn't a very good answer and I apologize for that, but my research so far hasn't found the precise place where a fix can be made, by either us or Microsoft. I'll update the bug with more and better information as soon as I can, thank you for your patience!

Comment by wei lu [ 17/Nov/11 ]

cp.zip contains the executable, but I am not sure whether it will work on your machine. You can run the Python script.
MongoDBThread.cpp is the source file that shows how the APIs are called: line 93 select, line 230 insert, line 305 update.
MongoTestData.dat contains the content of the document I inserted in the test. All documents are the same. In the update, I add another field to the document.

Comment by Tad Marshall [ 17/Nov/11 ]

If this is using dummy (non-proprietary) data, would it be possible for you to upload your code so that we could reproduce your results? Performance might be affected by document sizes and index paging, and we could give better advice if we could see the details of what is driving performance in your specific test cases. Thanks!
