[SERVER-32180] mongod oom with low connections Created: 06/Dec/17 Updated: 26/Jan/18 Resolved: 26/Dec/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance |
| Affects Version/s: | 3.4.4 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | shawn | Assignee: | Kelsey Schubert |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: | mongo-1207.zip, diagnostic-new.tar.gz, mongolog.tar.gz |
| Issue Links: | Duplicates SERVER-22224 |
| Participants: | shawn, Kelsey Schubert, Bruce Lucas |
| Description |
|
Hi, my mongod hits OOM with only a few connections. The diagnostic.data files are attached; please help me review them. |
| Comments |
| Comment by Kelsey Schubert [ 26/Dec/17 ] | |
|
Hi shawn001, Thank you for providing the complete logs. From them, we can see that a large amount of memory is being used by geoNear queries preceding the OOM. This issue is tracked in SERVER-22224; please feel free to vote for it and watch it for updates. This stack's allocation begins increasing at 2017-12-06T16:05:29.011Z UTC and grows to above 50GB by the time of the OOM.
Until SERVER-22224 is resolved, I would suggest investigating steps to take at the application layer to mitigate this issue by reducing the number of concurrent geoNear queries and ensuring appropriate indexes have been established. Kind regards, | |
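As a concrete illustration of the mitigation suggested above, the sketch below uses pymongo to create a 2dsphere index and bound the work of each geo query. The `places` collection, `loc` field, coordinates, and limits are illustrative assumptions, not details taken from this ticket.

```python
# Illustrative sketch only (pymongo). The "places" collection, "loc" field,
# coordinates, and limits are assumptions for the example, not ticket details.
from pymongo import MongoClient, GEOSPHERE

client = MongoClient("mongodb://localhost:27017")
coll = client["test"]["places"]

# A 2dsphere index lets $near queries walk the index rather than scanning and
# sorting large numbers of documents in memory.
coll.create_index([("loc", GEOSPHERE)])

# Bounding each query with $maxDistance and a limit keeps per-query buffering
# small, which lowers peak memory when many geo queries run concurrently.
cursor = coll.find(
    {
        "loc": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [-73.97, 40.77]},
                "$maxDistance": 5000,  # meters
            }
        }
    },
    limit=100,
)
for doc in cursor:
    pass  # process each nearby document
```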
| Comment by shawn [ 16/Dec/17 ] | |
|
Hi, is there any update on the investigation? | |
| Comment by shawn [ 06/Dec/17 ] | |
|
Hi @Kelsey, the entire log file is named log.log. The host configuration: 128GB physical memory and 11GB swap. | |
| Comment by Kelsey Schubert [ 06/Dec/17 ] | |
|
Hi shawn, To clarify, we need to see all of the log messages since the restart with heap profiling enabled in order to see the relevant stack traces recorded by the heap profiler. Without these traces, we cannot continue our investigation. These stacks may not first be recorded at particularly noteworthy times, so it's best to provide the complete files to allow us to pull out the relevant information as we look into this issue. Would you please upload all mongod log files from the restart up to the OOM and ensure there are no gaps in their coverage so we can continue to investigate? Thank you, | |
| Comment by shawn [ 06/Dec/17 ] | |
|
Hi Bruce Lucas, I've uploaded mongo-1207.zip. Thanks | |
| Comment by Bruce Lucas (Inactive) [ 06/Dec/17 ] | |
|
Hi shawn001, Thanks for uploading the data. Unfortunately the log file ends at 2017-12-06T12:59:59.850+0800, several hours before the OOM, so it is missing some crucial information for identifying the cause. Do you have additional log files that cover the entire time from the restart until the OOM? Thanks, | |
| Comment by shawn [ 06/Dec/17 ] | |
|
Hi @thomas.schubert, I've uploaded diagnostic-new.tar.gz and mongolog.tar.gz. Thank you | |
| Comment by shawn [ 06/Dec/17 ] | |
|
Hi @Kelsey T Schubert, got it, thank you. | |
| Comment by Kelsey Schubert [ 06/Dec/17 ] | |
|
Hi shawn001, Thanks for reporting this issue. I see that, preceding the OOM events, some cursors are established with noTimeout set. Are you aware of which queries these cursors are running?
To help us continue to investigate this issue, would you please restart the node with the following parameter enabled
After encountering the OOM again, would you please upload the following files:
Thank you, |
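For background on the noTimeout cursors mentioned above: the client requests them when it opens a cursor, and the server then keeps that cursor alive until the client exhausts or explicitly closes it, holding its resources in the meantime. Below is a minimal pymongo sketch, assuming a hypothetical `places` collection, of how such a cursor is typically created and cleaned up.

```python
# Illustrative sketch only (pymongo); the collection name and empty filter are
# hypothetical, not taken from this ticket.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client["test"]["places"]

# no_cursor_timeout=True asks the server not to reap this cursor after the
# usual idle timeout; the server holds its resources until the cursor is
# exhausted or closed.
cursor = coll.find({}, no_cursor_timeout=True)
try:
    for doc in cursor:
        pass  # long-running processing of each document
finally:
    cursor.close()  # always close noTimeout cursors explicitly
```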