[SERVER-9415] MongoDB only uses small amount of RAM Created: 22/Apr/13  Updated: 26/Sep/17  Resolved: 03/Dec/13

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Blocker - P1
Reporter: jeff liu Assignee: Adam Comerford
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

4-CPU 16GB RAM CentOS 6.2 VM


Issue Links:
Duplicate
duplicates SERVER-10602 Server does not show right resident m... Closed
Related
related to DOCS-905 Preheating Cache/Database Closed
Operating System: ALL
Participants:
Case:

 Description   

I have a single MongoDB instance running on a 4-CPU 16GB RAM CentOS 6.2 VM. It's running very slowly most of the time. Here is the mongostat output:

insert  query  update  delete  getmore  command  flushes  mapped  vsize    res  faults      locked db  idx miss %  qr|qw  ar|aw  netIn  netOut  conn  repl      time
    *0     62      12      *0        0      9|0        0     82g   174g  2.61g     10  gowmain:13.2%          0   52|0    1|5    12k     64k   315   PRI  14:41:27
    *0     90      26      *0        0     21|0        0     82g   174g  2.61g     12  gowmain:56.0%          0   63|0    0|1    23k    145k   315   PRI  14:41:29
    *0    157      38      *0        0     19|0        1     82g   174g   2.6g     44  gowmain:56.4%          0   61|0    1|0    36k    330k   315   PRI  14:41:31
    *0    372      95      *0        0     62|0        0     82g   174g  2.61g    141  gowmain:38.1%          0   53|0    1|0   208k      1m   315   PRI  14:41:32
    *0    303      58      *0        0     31|0        0     82g   174g  2.62g     43  gowmain:33.5%          0   52|0    1|0    68k    429k   315   PRI  14:41:33
    *0    277      43      *0        0     17|0        0     82g   174g  2.61g     53  gowmain:21.8%          0   55|0    0|1   103k      1m   315   PRI  14:41:34
    *0    176      24      *0        0      9|0        0     82g   174g  2.61g     17  gowmain:63.8%          0   58|0    1|0    48k    581k   315   PRI  14:41:35
    *0    506      79      *0        0     38|0        0     82g   174g   2.6g     91  gowmain:47.6%          0   62|0    1|0   143k    852k   316   PRI  14:41:37
    *0    236      39      *0        0     15|0        0     82g   174g  2.61g     95  gowmain:21.2%          0   55|1    0|1    86k      1m   316   PRI  14:41:38
    *0    195      41      *0        0     15|0        0     82g   174g   2.6g     86  gowmain:28.3%          0   50|1    2|0    89k      1m   316   PRI  14:41:40
 

The command line options are: --rest --fork --master --oplogSize=22770 --port 27017 --dbpath /data/mongodb/dbdata/ --logpath /data/logs/mongodb.log --auth

Here is the top output:

top - 15:05:58 up 18 days, 21:03,  2 users,  load average: 2.37, 2.79, 2.83
Tasks: 121 total,   1 running, 120 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  0.5%sy,  0.0%ni, 74.9%id, 24.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16139308k total, 15988616k used,   150692k free,    11336k buffers
Swap:  4194296k total,      536k used,  4193760k free, 14952988k cached
 
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
21253 root      20   0  173g 2.6g 2.4g S  7.0 17.0 468:05.07 mongod

The weird thing is that MongoDB consistently consumes only 2.6GB of RAM! However, the output from 'top' shows that all 16GB of RAM is used up.

A few other observations:

  • The CPU load from 'top' is usually between 2.00 and 3.00.
  • The lock% ranges between 40 and 80.
  • When the read queue goes up to 150-200, the system is unusable.
  • The number of faults is not too bad, but iostat shows 100% IO utilization on the mongo partition. Also, the await time in iostat sometimes exceeds 1000ms!

The DB has one large collection with about 50 million documents; all other collections are small. The DB serves about 2000 users, fronted by Tomcat (which has almost no load).

I cannot figure out why it uses less than 3GB of RAM while the system has 16GB.
Are we doing something obviously wrong here?



 Comments   
Comment by David Brailovsky [ 06/Dec/13 ]

Any plans to "fix" this issue? That is, to provide a good way of understanding how much RAM MongoDB is actually using?

Comment by Adam Comerford [ 03/Dec/13 ]

Going to resolve this as "Works as Designed". This is not a fix, but we are satisfied that we have proven the data is actually still in RAM, and that this is essentially a reporting issue caused by how journaling works rather than any data actually being evicted from memory.

Comment by Adam Comerford [ 08/May/13 ]

Hi Jim,

It means that, at least in our testing, the low resident memory "cliff" we saw was essentially a red herring. Even with resident memory extremely low, subsequent operations on the data revealed that it was in fact still in RAM (in the filesystem cache); hence when we accessed the data, the page faults stayed low and speed was not an issue.

With a slow system, the resident memory behavior is likely focusing attention in the wrong place. Aside from the low resident memory, take a look at faulting (mongostat/MMS) and, in particular, combine that with disk IO metrics (iostat and munin-node in MMS) to see whether you are hitting disk a lot and, if so, how performant the disk is. Next up would be lock percentage - if you are over-subscribing a database, that can also cause slowness.
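
For reference, a minimal sketch of gathering those two views side by side (plain mongostat and sysstat's iostat; the flags below are standard, nothing here is specific to this deployment):

    # sample server counters once per second; watch the "faults" column
    mongostat 1

    # extended per-device IO stats; watch %util and await on the data volume
    iostat -x 1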

Your blockdev output shows a readahead setting of 256 - that is 128k. Unless you have massive documents, that won't be terribly efficient for loading randomly distributed documents into the filesystem cache. For most cases I would recommend lowering that to 32 (16k), unless you have those aforementioned large documents. Note that RA changes require a process restart to take effect. I would look at tweaks here before throwing more RAM at the issue - at least you will know you are efficiently using RAM before you look at adding more.
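
As a concrete sketch of that check and change (assuming the data volume is /dev/xvdd1, per the blockdev --report output further down; blockdev reports RA in 512-byte sectors):

    # current readahead: 256 sectors x 512 bytes = 128k
    sudo blockdev --getra /dev/xvdd1

    # lower it to 32 sectors (16k), then restart mongod for it to take effect
    sudo blockdev --setra 32 /dev/xvdd1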

Adam

Comment by Jim Gow [ 07/May/13 ]

Adam, thanks for the update. Does this mean everything is normal and there is nothing else we need to do?

Now the system is rather slow. To improve that, should we add more RAM, or start sharding? As I mentioned earlier, our single largest collection has about 50 million documents.

Jim

Comment by Adam Comerford [ 06/May/13 ]

An update here: we have figured out that the drop-off in resident memory when you introduce writes into the equation is down to how we implement journaling and the private remapping of memory. For more info on that, take a look here:

http://www.kchodorow.com/blog/2012/10/04/how-mongodbs-journaling-works/

The good news is that we can also confirm the data is still in memory (in the filesystem cache); it is just not counted against the mongod process at that point. Subsequent reads of the data produced no hard page faults, meaning the data was still available in memory when requested.
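
An illustrative way to watch this from the shell (a sketch assuming a mongod on the default port; note this deployment runs with --auth, so credentials may be required):

    # resident vs. mapped memory in MB, as mongod reports it
    mongo --eval 'printjson(db.serverStatus().mem)'

    # hard page fault counter; re-reads of data still in the filesystem cache leave it nearly flat
    mongo --eval 'printjson(db.serverStatus().extra_info)'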

This is essentially the same state data is in when you run the touch command: it makes the readahead system call and pages the requested data into the filesystem cache; however (as with resident memory post-remapping), the data is not assigned to the mongod process specifically and does not count toward that process's resident memory. Once an explain() or similar read behavior is run to actively access a large section of data, resident memory will increase.
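
A minimal sketch of that sequence (the database name is taken from the mongostat output above and the collection name is a placeholder; the touch command is available on this 2.4-era server):

    # page a collection's data and indexes into the filesystem cache,
    # without charging them to mongod's resident memory
    mongo gowmain --eval 'printjson(db.runCommand({ touch: "mycollection", data: true, index: true }))'

    # actively read the data; resident memory should climb afterwards
    mongo gowmain --eval 'printjson(db.mycollection.find().explain())'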

Adam

Comment by Adam Comerford [ 23/Apr/13 ]

The good news here is that we have recreated the behavior (a large amount of resident memory immediately dropping off when a workload is introduced). Once we have more information on this one, we will share it here.

Adam.

Comment by Adam Comerford [ 23/Apr/13 ]

From the mongodb-user thread:

Initially we had this node set up as a master (with oplog) in anticipation of adding slaves. Since this problem started, I have restarted it as a standalone with no oplog, but the behavior did not change.

As I mentioned earlier, after I temporarily disabled the write operations, mongod started to consume a lot more memory. This part puzzles me.

Comment by jeff liu [ 22/Apr/13 ]

One more observation: I tried modifying our application so that the most frequent write operations are temporarily disabled, and then mongod was able to consume around 11GB of RAM. Once I re-enabled those write operations in the app, mongod's RAM consumption dropped back to 2.6GB immediately.

The output from "blockdev --report" is:

 
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0     16106127360   /dev/xvda2
rw   256   512  4096          0      4294967296   /dev/xvda1
rw   256   512  4096          0    237296943104   /dev/xvda3
rw   256   512  4096          0    536870912000   /dev/xvdd
rw   256   512  4096         63    536863993344   /dev/xvdd1

The data is on /dev/xvdd1, which is a virtualized storage volume. The journal and logs are on /dev/xvda3, which is RAID5.

 
Server Status ({serverStatus:1})
host "SNDA-172-17-15-229"
version "2.4.1"
process "mongod"
pid 21253
uptime 291421s (4 days)
uptimeMillis 291420819
uptimeEstimate 287808
localTime ISODate("2013-04-22T02:40:07.797Z")
asserts {
   "regular": 0,
   "warning": 0,
   "msg": 0,
   "user": 7,
   "rollovers": 0
}
backgroundFlushing {
   "flushes": 4856,
   "total_ms": 34321532,
   "average_ms": 7067.8607907743,
   "last_ms": 8191,
   "last_finished": ISODate("2013-04-22T02: 39: 18.525Z")
}
connections {
   "current": 360,
   "available": 19640,
   "totalCreated": 10556
}
cursors {
   "totalOpen": 34,
   "clientCursors_size": 34,
   "timedOut": 7,
   "totalNoTimeout": 34
}
dur {
   "commits": 19,
   "journaledMB": 0.720896,
   "writeToDataFilesMB": 1.407708,
   "compression": 0.50751883933902,
   "commitsInWriteLock": 0,
   "earlyCommits": 0,
   "timeMs": {
     "dt": 3009,
     "prepLogBuffer": 0,
     "writeToJournal": 431,
     "writeToDataFiles": 7,
     "remapPrivateView": 13
  }
}
extra_info {
   "note": "fields vary by platform",
   "heap_usage_bytes": 661639488,
   "page_faults": 9748682
}
globalLock {
   "totalTime": 291420820000,
   "lockTime": 10313245284,
   "currentQueue": {
     "total": 129,
     "readers": 129,
     "writers": 0
  },
   "activeClients": {
     "total": 1,
     "readers": 0,
     "writers": 1
  }
}
indexCounters {
   "accesses": 956952874,
   "hits": 956955401,
   "misses": 0,
   "resets": 0,
   "missRatio": 0
}
locks {
   ".": {
     "timeLockedMicros": {
       "R": 141085343,
       "W": 10313245284
    },
     "timeAcquiringMicros": {
       "R": 34144756207,
       "W": 19853988663
    }
  },
   "admin": {
     "timeLockedMicros": {
       "r": 2283390,
       "w": 0
    },
     "timeAcquiringMicros": {
       "r": 30879057,
       "w": 0
    }
  },
   "local": {
     "timeLockedMicros": {
       "r": 26025792,
       "w": 6098226123
    },
     "timeAcquiringMicros": {
       "r": 1738336013,
       "w": 39292134
    }
  },
   "auramain": {
     "timeLockedMicros": {
       "r": 630149069644,
       "w": 128983189636
    },
     "timeAcquiringMicros": {
       "r": 7395092559836,
       "w": 6530287330569
    }
  },
   "auramian": {
     "timeLockedMicros": {
       "r": 519890,
       "w": 2185
    },
     "timeAcquiringMicros": {
       "r": 2356248,
       "w": 41110
    }
  },
   "$CLUSTER": {
     "timeLockedMicros": {
       "r": 650696,
       "w": 0
    },
     "timeAcquiringMicros": {
       "r": 2197650,
       "w": 0
    }
  },
   "config": {
     "timeLockedMicros": {
       "r": 706371,
       "w": 4492
    },
     "timeAcquiringMicros": {
       "r": 4402767,
       "w": 102955
    }
  },
   "$SERVER": {
     "timeLockedMicros": {
       "r": 322925,
       "w": 0
    },
     "timeAcquiringMicros": {
       "r": 511625,
       "w": 0
    }
  }
}
network {
   "bytesIn": 15876428778,
   "bytesOut": 58291032394,
   "numRequests": 51597472
}
opcounters {
   "insert": 25689,
   "query": 34541835,
   "update": 11720972,
   "delete": 287,
   "getmore": 381,
   "command": 5339925
}
opcountersRepl {
   "insert": 0,
   "query": 0,
   "update": 0,
   "delete": 0,
   "getmore": 0,
   "command": 0
}
recordStats {
   "accessesNotInMemory": 5053599,
   "pageFaultExceptionsThrown": 1169610,
   "$CLUSTER": {
     "accessesNotInMemory": 0,
     "pageFaultExceptionsThrown": 0
  },
   "$SERVER": {
     "accessesNotInMemory": 0,
     "pageFaultExceptionsThrown": 0
  },
   "admin": {
     "accessesNotInMemory": 3,
     "pageFaultExceptionsThrown": 0
  },
   "auramain": {
     "accessesNotInMemory": 5051888,
     "pageFaultExceptionsThrown": 1169620
  },
   "auramian": {
     "accessesNotInMemory": 0,
     "pageFaultExceptionsThrown": 0
  },
   "config": {
     "accessesNotInMemory": 0,
     "pageFaultExceptionsThrown": 0
  },
   "local": {
     "accessesNotInMemory": 1740,
     "pageFaultExceptionsThrown": 0
  }
}
repl {
   "ismaster": true
}
writeBacksQueued false
mem {
   "bits": 64,
   "resident": 2999,
   "virtual": 187336,
   "supported": true,
   "mapped": 86023,
   "mappedWithJournal": 172046
}
metrics {
   "document": {
     "deleted": 282,
     "inserted": 25689,
     "returned": 11497043,
     "updated": 11720918
  },
   "getLastError": {
     "wtime": {
       "num": 0,
       "totalMillis": 0
    },
     "wtimeouts": 0
  },
   "operation": {
     "fastmod": 1493574,
     "idhack": 38576677,
     "scanAndOrder": 562
  },
   "queryExecutor": {
     "scanned": 3405594650
  },
   "record": {
     "moves": 1081680
  },
   "repl": {
     "apply": {
       "batches": {
         "num": 0,
         "totalMillis": 0
      },
       "ops": 0
    },
     "buffer": {
       "count": 0,
       "maxSizeBytes": 268435456,
       "sizeBytes": 0
    },
     "network": {
       "bytes": 0,
       "getmores": {
         "num": 0,
         "totalMillis": 0
      },
       "ops": 0,
       "readersCreated": 0
    },
     "oplog": {
       "insert": {
         "num": 11756881,
         "totalMillis": 5729791
      },
       "insertBytes": 12103503894
    },
     "preload": {
       "docs": {
         "num": 0,
         "totalMillis": 0
      },
       "indexes": {
         "num": 0,
         "totalMillis": 0
      }
    }
  },
   "ttl": {
     "deletedDocuments": 0,
     "passes": 4822
  }
}
timing {
   "after basic": 0,
   "after asserts": 0,
   "after backgroundFlushing": 0,
   "after connections": 0,
   "after cursors": 0,
   "after dur": 0,
   "after extra_info": 0,
   "after globalLock": 0..
 

Comment by Daniel Pasette (Inactive) [ 22/Apr/13 ]

Hi Jeff, can you tell me what your average document size is in your largest database? Can you post db.stats()? As shown by free, your cache is full; however, MongoDB only counts the pages it has actually accessed, so if readahead is high, you might be filling your cache with data you haven't accessed.
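
A minimal way to pull both numbers (database name taken from the mongostat output above; adjust to your setup):

    # avgObjSize in the db.stats() output is the average document size in bytes
    mongo gowmain --eval 'printjson(db.stats())'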

Comment by jeff liu [ 22/Apr/13 ]

free -tm gives the following:

             total       used       free     shared    buffers     cached
Mem:         15761      15609        152          0         11      14563
-/+ buffers/cache:       1034      14726
Swap:         4095          0       4095
Total:       19857      15609       4247
