[SERVER-51340] MongoDB $accumulator: MongoError: Out of memory Created: 04/Oct/20  Updated: 27/Oct/23  Resolved: 21/Oct/20

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 4.4.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Asaf Shifer Assignee: Dmitry Agranat
Resolution: Works as Designed Votes: 0
Labels: memory, memory-leak
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

MongoDB:
MongoDB shell version v4.4.1
Build Info: {
  "version": "4.4.1",
  "gitVersion": "ad91a93a5a31e175f5cbf8c69561e788bbc55ce1",
  "openSSLVersion": "OpenSSL 1.1.1f 31 Mar 2020",
  "modules": [],
  "allocator": "tcmalloc",
  "environment": { "distmod": "ubuntu2004", "distarch": "x86_64", "target_arch": "x86_64" }
}

Ubuntu:
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal

Node.js:
node -v
v14.8.0

package.json:
"engines":

{ "node": "14.x" }

,
"dependencies":

{ "mongodb": "^3.6.2", }

Issue Links:
Related
related to SERVER-51404 Improve log messages when aggregation... Closed
Operating System: ALL
Steps To Reproduce:
Participants:

 Description   

I have a newly installed MongoDB server (v4.4.1) running on an AWS EC2 instance with Ubuntu. Nothing else is installed on the server. The DB currently contains 35 documents of 3 MB each, i.e., roughly 105 MB of data.

I'm running an aggregation query that contains 2 stages: a simple $match stage that makes sure only 31 documents are aggregated, and a $group stage with a $accumulator inside (that stage doesn't allocate much new space, and even if it did, with 3 MB documents I wouldn't expect it to be much more). The query's result should be a single ~3 MB document that merges all of the above.
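In outline, the pipeline looks like this (a sketch reconstructed from the slow-query log excerpt in the comments below; the accumulator body is omitted):

const aggregationPipeline = [
  // Stage 1: restrict to the documents in the one-month date range
  { $match: { _id: { $gte: new Date("2020-09-04"), $lt: new Date("2020-10-05") } } },
  // Stage 2: fold everything into a single result document
  {
    $group: {
      _id: 1,
      createdAt: { $min: "$createdAt" },
      updatedAt: { $max: "$updatedAt" },
      trackerData: { $accumulator: { /* init, accumulate, accumulateArgs, merge, lang */ } },
    },
  },
];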

Running the query when only 3 documents exist finishes without a problem, but with 35 I consistently get the following error:

MongoError: Out of memory
at MessageStream.messageHandler (/app/node_modules/mongodb/lib/cmap/connection.js:268:20)
at MessageStream.emit (events.js:314:20)
at processIncomingData (/app/node_modules/mongodb/lib/cmap/message_stream.js:144:12)
at MessageStream._write (/app/node_modules/mongodb/lib/cmap/message_stream.js:42:5)
at writeOrBuffer (_stream_writable.js:352:12)
at MessageStream.Writable.write (_stream_writable.js:303:10)
at Socket.ondata (_stream_readable.js:717:22)
at Socket.emit (events.js:314:20)
at addChunk (_stream_readable.js:307:12)
at readableAddChunk (_stream_readable.js:282:9)
 

I've read a lot online and couldn't solve it. I've tried:

  1. Upgrading the server's hardware to t3.large (8 GB RAM).
  2. Adding allowDiskUse to the query (a fuller sketch follows this list):

 

const options = {
  allowDiskUse: true,
  // explain: true,
};
mongoConn.db(dbConfig.database).collection(collection).aggregate(aggregationPipeline, options).toArray((err: MongoError, result: any) => {
  // ...
});

 

  3. With htop, I can see total memory usage of 210M/7.68G after restarting MongoDB; during the query it climbs to a peak of 691M/7.68G, fails, and remains at 627M/7.68G afterward.
  4. With db.enableFreeMonitoring() I can see a constant 2 GB of virtual memory usage, with peaks of 2.1 GB.
  5. Checking the result of db.serverStatus().tcmalloc.tcmalloc.formattedString.
  6. Checking the result of db.serverStatus().mem.
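A fuller version of the truncated call above might look like this (a minimal, self-contained sketch against the 3.6 Node.js driver; the connection string is a placeholder, and the database/collection names are taken from the log excerpt in the comments):

import { MongoClient, MongoError } from "mongodb";

const uri = "mongodb://user:pass@localhost:27017"; // placeholder connection string
const aggregationPipeline: object[] = [];          // the $match + $group pipeline sketched above

MongoClient.connect(uri, { useUnifiedTopology: true }, (err, client) => {
  if (err) throw err;
  client
    .db("visits")                          // database name from the log below
    .collection("performanceData_Daily")   // collection name from the log below
    .aggregate(aggregationPipeline, { allowDiskUse: true })
    .toArray((err: MongoError, result: any) => {
      if (err) console.error(err);
      else console.log(result);
      client.close();
    });
});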

Summary: I know that MongoDB's $group stage has a 100 MB memory limit, but I'd expect not to hit it with 3 MB documents and allowDiskUse enabled. What am I missing here?



 Comments   
Comment by Dmitry Agranat [ 21/Oct/20 ]

Hi asafshifer@gmail.com,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Dima

Comment by Dmitry Agranat [ 05/Oct/20 ]

Hi asafshifer@gmail.com,

The $group stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $group returns an error. To allow for the handling of large datasets, set the allowDiskUse option to true. This flag enables $group operations to write to temporary files.

Your 35 documents x 3 MB each = 105 MB, which is larger than the defined limit of 100 MB.

From the log snippet you've provided, there is no evidence the allowDiskUse option was used.
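For reference, the option is passed alongside the pipeline, e.g. in the shell (a minimal sketch; the collection name is taken from the log excerpt below):

db.performanceData_Daily.aggregate(aggregationPipeline, { allowDiskUse: true })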

Would you please archive (tar or zip) the mongod.log files and the $dbpath/diagnostic.data directory so that we could further investigate this (the contents are described here) and upload them to this support uploader location?

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Thanks,
Dima

Comment by Asaf Shifer [ 04/Oct/20 ]

Found this in the logs:

  {    "t": {      "$date": "2020-10-04T12:50:00.263+00:00"    },    "s": "I",    "c": "COMMAND",    "id": 51803,    "ctx": "conn16",    "msg": "Slow query",    "attr": {      "type": "command",      "ns": "visits.performanceData_Daily",      "command": {        "aggregate": "performanceData_Daily",        "pipeline": [          {            "$match": {              "_id": {                "$gte": {                  "$date": "2020-09-04T00:00:00.000Z"                },                "$lt": {                  "$date": "2020-10-05T00:00:00.000Z"                }              }            }          },          {            "$group": {              "_id": 1,              "createdAt": {                "$min": "$createdAt"              },              "updatedAt": {                "$max": "$updatedAt"              },              "from": {                "$min": "$from"              },              "to": {                "$min": "$to"              },              "isFinal": {                "$min": "$isFinal"              },              "trackerData": {                "$accumulator": {                  "init": "function () {\n                        return {\n                            id: '1',\n                            t: '1',\n                            co: 0,\n                            i: 0,\n                            bc: 0,\n                            c2: 0,\n                            cn: 0,\n                            r: 0,\n                            c_: {},\n                        };\n                    }"                }              }            }          }        ]      },      "planSummary": "IXSCAN { _id: 1 }",      "numYields": 5,      "queryHash": "05692E90",      "planCacheKey": "EB76648C",      "ok": 0,      "errMsg": "Out of memory",      "errName": "JSInterpreterFailure",      "errCode": 139,      "reslen": 109,      "locks": {        "ReplicationStateTransition": {          "acquireCount": {            "w": 22          }        },        "Global": {          "acquireCount": {            "r": 22          }        },        "Database": {          "acquireCount": {            "r": 22          }        },        "Collection": {          "acquireCount": {            "r": 22          }        },        "Mutex": {          "acquireCount": {            "r": 17          }        }      },      "storage": {},      "protocol": "op_msg",      "durationMillis": 176094    },    "truncated": {      "command": {        "pipeline": {          "1": {            "$group": {              "trackerData": {                "$accumulator": {                  "accumulate": {                    "type": "string",                    "size": 9812                  }                }              }            }          }        }      }    },    "size": {      "command": 20539    }  },

Comment by Asaf Shifer [ 04/Oct/20 ]

/etc/mongod.conf:

# mongod.conf
# for documentation of all options, see:
# http://docs.mongodb.org/manual/reference/configuration-options/
# Where and how to store data.
storage:
 dbPath: /var/lib/mongodb
 journal:
   enabled: true
# engine:
# mmapv1:
# wiredTiger:
# where to write logging data.
systemLog:
 destination: file
 logAppend: true
 path: /var/log/mongodb/mongod.log
# network interfaces
net:
 port: 27017
 bindIp: 127.0.0.1,<IP>
 
# how the process runs
processManagement:
 timeZoneInfo: /usr/share/zoneinfo
#security:
security:
 authorization: "enabled"
#operationProfiling:
#replication:
#sharding:
## Enterprise-Only Options:
#auditLog:
#snmp:

 

Comment by Asaf Shifer [ 04/Oct/20 ]

Changing the hardware to t3.xlarge (16 GB RAM) didn't solve it either.

Comment by Asaf Shifer [ 04/Oct/20 ]

The code snippet in the description was collapsed into a single line when pasted, so it looks as if the end of the const options object was commented out; in the actual code it is properly closed.
