Core Server / SERVER-9674

Replication-related memory leak, exposed by server-side JS (was: Authenticated connections leaking memory, or aggregation leak)

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 2.4.3
    • Component/s: Security
    • Labels: None
    • Environment:
      Linux x86_64, replica set, auth turned on
    • Operating System: ALL

      Drive a couple thousand connections to a replica set primary. Authenticate. Let the connections close. Once all the connections are closed, you will notice the resident set size and non-mapped memory creeping up.

      The following Python and shell scripts were used to build a connection, authenticate, run a query, then repeat:

      #reader.py
      import pymongo

      HOST = 'localhost'
      PORT = 27011

      conn = pymongo.Connection(host=HOST, port=PORT)  # legacy PyMongo 2.x API
      db = conn.cranky
      for i in range(1):
          # $where runs the predicate in the server-side JavaScript engine
          doc = db.foo.find().where("this.field1 == " + str(int(i * i)) + ";")
          getl = list(doc)
      conn.disconnect()
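
      The steps above mention authenticating (auth is turned on for this replica set), but reader.py as posted does not show that step. A minimal sketch of what it could look like with the PyMongo 2.x API, using placeholder credentials that are not part of the original repro:

      #auth_sketch.py  (illustrative only; credentials are placeholders)
      import pymongo

      conn = pymongo.Connection(host='localhost', port=27011)
      # Authenticate against the admin database with placeholder credentials
      conn.admin.authenticate('testuser', 'testpassword')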
      

      Ran on 8 AWS guests to drive connections at the primary (8 hosts × 200 readers ≈ 1,600 concurrent connections per wave):

      #!/bin/bash
      # Launch 200 concurrent readers, wait for all of them to exit,
      # pause, then repeat forever.
      while true; do
          for i in $(seq 200); do
              python reader.py &
          done
          echo Waiting....
          wait        # block until every backgrounded reader.py has exited
          echo Collected!
          sleep 20
      done
      

      Over time, something related to replication is causing the resident set (apparently tied to non-mapped memory) to grow large enough to cause swapping.

      After correlating different instances of this issue, we found that the easiest way to reproduce it is to run a workload that uses server-side JavaScript (map/reduce, $where, group(), and eval()). When V8 instances are instantiated, non-mapped memory balloons; that by itself isn't the problem, though it is frequently the first thing people notice. After running for a while you can see that the non-mapped memory has increased over time (checked at times when no JS was running).
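
      For reference, here is a sketch of the kinds of server-side JavaScript workloads referred to above, written against the PyMongo 2.x API of the era. The cranky/foo/field1 names come from the repro script; the map/reduce output collection name is a placeholder:

      #ssjs_workloads.py  (illustrative sketch only)
      import pymongo
      from bson.code import Code

      conn = pymongo.Connection(host='localhost', port=27011)
      db = conn.cranky

      # $where: evaluates a JavaScript predicate against each document
      list(db.foo.find({"$where": "this.field1 == 4"}))

      # map/reduce: mapper and reducer run in the server-side JS engine
      mapper = Code("function () { emit(this.field1, 1); }")
      reducer = Code("function (key, values) { return Array.sum(values); }")
      db.foo.map_reduce(mapper, reducer, out="mr_out")

      # group(): uses a JavaScript reduce function
      db.foo.group(key=["field1"], condition={}, initial={"count": 0},
                   reduce="function (doc, acc) { acc.count += 1; }")

      # eval(): runs an arbitrary JavaScript function on the server
      db.eval("function () { return 1 + 1; }")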

      I don't think it is an allocation issue around v8 as it doesn't happen without replication.

      You can see in the mongostat output how the non-mapped memory creeps up after each test run (connections and memory burst up during a test run, then connections drop back to 1 [mongostat's own connection], but the non-mapped memory increases a little each time):

      insert  query update delete getmore command flushes mapped  vsize    res non-mapped faults  locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn repl       time
          *0     *0     *0     *0       0     0|0       0  1.09g  2.72g    33m      1.63g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:43:13
          *0     *0     *0     *0       0     0|0       0  1.09g  2.72g    33m      1.63g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:43:18
          *0     *0     *0     *0       0  3828|0       0  1.09g  4.87g   119m      3.77g      0 admin:0.0%          0       0|0     0|0   341k   319k  2124  REC   20:43:38
          *0     *0     *0     *0       0  1136|0       0  1.09g  3.49g   107m       2.4g      0 admin:0.0%          0       0|0     0|0   102k    95k   724  REC   20:43:43
          *0     *0     *0     *0       0  3351|0       0  1.09g  3.32g   110m      2.23g      0 admin:0.0%          0       0|0     1|0   299k   279k   539  REC   20:43:48
          *0     *0     *0     *0       0   772|0       0  1.09g  2.82g   102m      1.73g      0 admin:0.0%          0       0|0     0|0    69k    64k     1  REC   20:44:23
          *0     *0     *0     *0       0     0|0       0  1.09g  2.82g   102m      1.73g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:44:28
          *0     *0     *0     *0       0     0|0       0  1.09g  2.82g   102m      1.73g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:44:33
          *0     *0     *0     *0       0  3905|0       0  1.09g  5.22g   144m      4.13g      0 admin:0.0%          0       0|0     0|0   348k   325k  2467  REC   20:44:58
          *0     *0     *0     *0       0   959|0       0  1.09g   3.3g   113m      2.21g      0 admin:0.0%          0       0|0     0|0    86k    80k   520  REC   20:45:03
          *0     *0     *0     *0       0  3903|0       0  1.09g  3.22g   116m      2.13g      0 admin:0.1%          0       0|0     0|0   348k   325k   439  REC   20:45:08
          *0     *0     *0     *0       0  1634|0       0  1.09g  4.34g   136m      3.25g      0 admin:0.0%          0       0|0     2|0   145k   136k  1575  REC   20:45:13
          *0     *0     *0     *0       0  1328|0       0  1.09g   4.3g   124m      3.21g      0 admin:0.0%          0       0|0     1|0   118k   111k  1540  REC   20:45:18
          *0     *0     *0     *0       0     0|0       0  1.09g  2.83g   110m      1.74g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:45:48
          *0     *0     *0     *0       0     0|0       0  1.09g  2.83g   110m      1.74g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:45:53
          *0     *0     *0     *0       0  3527|0       0  1.09g  9.36g   286m      8.27g      0 admin:0.0%          0       2|0     0|0   315k   294k  6540  REC   20:46:54
          *0     *0     *0     *0       0  1458|0       0  1.09g  2.92g   207m      1.83g      0 admin:0.0%          0       0|0     0|0   130k   122k     1  REC   20:46:59
          *0     *0     *0     *0       0  4003|0       0  1.09g  9.42g   282m      8.33g      0 admin:0.0%          0       0|0     4|0   357k   333k  6645  REC   20:48:19
          *0     *0     *0     *0       0  3425|0       0  1.09g  9.78g   315m      8.68g      0 admin:0.0%          0       0|0     0|0   306k   286k  7003  REC   20:48:24
          *0     *0     *0     *0       0     0|0       0  1.09g  2.95g   229m      1.85g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:49:04
          *0     *0     *0     *0       0  3825|0       0  1.09g  9.47g   296m      8.38g      0 admin:0.0%          0       0|0     0|0   341k   319k  6652  REC   20:49:59
          *0     *0     *0     *0       0     0|0       0  1.09g  2.95g   227m      1.85g      0 local:0.0%          0       0|0     0|0    12b   576b     1  REC   20:50:30
          *0     *0     *0     *0       0  3412|0       0  1.09g    13g   358m      11.9g      0 admin:0.0%          0       0|0    10|0   304k   284k 10281  REC   20:53:31
          *0     *0     *0     *0       0   305|0       0  1.09g  2.97g   252m      1.88g      0 admin:0.0%          0       0|0     0|0    27k    26k     1  REC   20:56:01
      

      I pulled out the highlights from a longer run so the trend is easier to spot.
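
      For anyone watching the same counters while reproducing, here is a sketch that samples serverStatus between runs with the PyMongo 2.x API (same host/port as reader.py). mongostat's "non-mapped" column is, approximately, virtual memory minus the memory-mapped data files:

      #memwatch.py  (illustrative sketch only)
      import time
      import pymongo

      conn = pymongo.Connection(host='localhost', port=27011)
      while True:
          mem = conn.admin.command('serverStatus')['mem']
          # resident/virtual/mapped are reported in MB; with journaling on,
          # the data files are mapped a second time (see mem.mappedWithJournal)
          print('res=%dMB vsize=%dMB non-mapped=%dMB'
                % (mem['resident'], mem['virtual'], mem['virtual'] - mem['mapped']))
          time.sleep(5)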

            Assignee: Michael Grundy (michael.grundy)
            Reporter: Michael Grundy (michael.grundy)
            Votes: 2
            Watchers: 10
