[SERVER-38085] mongod occupied a lot cpu without any workload Created: 12/Nov/18  Updated: 30/Nov/18  Resolved: 30/Nov/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: z dd Assignee: Danny Hatcher (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File serverStatus_output.txt    
Participants:

 Description   

I installed a MongoDB Community cluster for testing. One of the mongod members consumes a lot of CPU and memory without any workload. My machine is 2U4G (2 CPUs, 4 GB of RAM).

From the top command, CPU usage is 153%:

PID  USER PR NI    VIRT    RES  SHR S  %CPU %MEM     TIME+ COMMAND
4637 Ruby 20  0 4060396 3.085g 6216 S 153.7 83.0 287288:54 mongod

From top -H (per-thread view):

PID  USER PR NI    VIRT    RES  SHR S %CPU %MEM    TIME+ COMMAND
4644 Ruby 20  0 4060396 3.086g 6216 S 41.7 83.0 69836:17 mongod
4645 Ruby 20  0 4060396 3.086g 6216 R 38.3 83.0 69850:52 mongod
4643 Ruby 20  0 4060396 3.086g 6216 R 35.7 83.0 69847:28 mongod
4642 Ruby 20  0 4060396 3.086g 6216 S 34.3 83.0 69895:33 mongod

I grabbed the stacks of those 4 threads:

Thread 76 (Thread 0x7f552fdb6700 (LWP 4642)):
#0 0x00007f55339e0a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000001c6b50c in __wt_cond_wait_signal ()
#2 0x0000000001c442d2 in __wt_evict_thread_run ()
#3 0x0000000001caa986 in __wt_thread_run ()
#4 0x00007f55339dcdc5 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f553370a81d in lseek64 () from /lib64/libc.so.6
#6 0x0000000000000000 in ?? ()
Thread 75 (Thread 0x7f552f5b5700 (LWP 4643)):
#0 0x0000000001be0a1e in ?? ()
#1 0x0000000001be0a96 in ?? ()
#2 0x0000000001be0f7c in ?? ()
#3 0x0000000001bf8c58 in __wt_split_rewrite ()
#4 0x0000000001c4744a in __wt_evict ()
#5 0x0000000001c42003 in ?? ()
#6 0x0000000001c42387 in ?? ()
#7 0x0000000001c43ca0 in __wt_evict_thread_run ()
#8 0x0000000001caa986 in __wt_thread_run ()
#9 0x00007f55339dcdc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f553370a81d in lseek64 () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
Thread 74 (Thread 0x7f552edb4700 (LWP 4644)):
#0 0x00007f55339e0a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000001c6b50c in __wt_cond_wait_signal ()
#2 0x0000000001c423e8 in ?? ()
#3 0x0000000001c43ca0 in __wt_evict_thread_run ()
#4 0x0000000001caa986 in __wt_thread_run ()
#5 0x00007f55339dcdc5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f553370a81d in lseek64 () from /lib64/libc.so.6
#7 0x0000000000000000 in ?? ()
Thread 73 (Thread 0x7f552e5b3700 (LWP 4645)):
#0 0x00007f55339e0a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000001c6b50c in __wt_cond_wait_signal ()
#2 0x0000000001c423e8 in ?? ()
#3 0x0000000001c43ca0 in __wt_evict_thread_run ()
#4 0x0000000001caa986 in __wt_thread_run ()
#5 0x00007f55339dcdc5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f553370a81d in lseek64 () from /lib64/libc.so.6
#7 0x0000000000000000 in ?? ()
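
Matching the thread IDs shown by top -H against the LWP numbers in the stack dump can be done by hand, but a small script makes it repeatable. The following is only a sketch: it assumes the dump uses the header and frame layout shown above, and the names `stacks_by_lwp`, `named_frames`, and `SAMPLE` are made up for illustration. `SAMPLE` is an abbreviated copy of two of the stacks above.

```python
import re

# Thread header, e.g. "Thread 76 (Thread 0x7f552fdb6700 (LWP 4642)):"
THREAD_HEADER = re.compile(r"^Thread \d+ \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\):")
# Stack frame, e.g. "#1 0x0000000001c6b50c in __wt_cond_wait_signal ()"
FRAME = re.compile(r"^#\d+\s+0x[0-9a-f]+ in (\S+)")


def stacks_by_lwp(pstack_text):
    """Return {lwp: [function name per frame, innermost first]}."""
    stacks = {}
    current = None
    for raw in pstack_text.splitlines():
        line = raw.strip()
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            stacks[current] = []
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            stacks[current].append(frame.group(1))
    return stacks


def named_frames(frames):
    """Drop frames whose symbol could not be resolved ("??")."""
    return [name for name in frames if name != "??"]


# Abbreviated excerpt of the stacks in this ticket (not the full dump).
SAMPLE = """\
Thread 76 (Thread 0x7f552fdb6700 (LWP 4642)):
#0 0x00007f55339e0a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000001c6b50c in __wt_cond_wait_signal ()
#2 0x0000000001c442d2 in __wt_evict_thread_run ()
Thread 75 (Thread 0x7f552f5b5700 (LWP 4643)):
#0 0x0000000001be0a1e in ?? ()
#3 0x0000000001bf8c58 in __wt_split_rewrite ()
#4 0x0000000001c4744a in __wt_evict ()
"""

if __name__ == "__main__":
    for lwp, frames in stacks_by_lwp(SAMPLE).items():
        print(lwp, named_frames(frames))
```

Looking up the LWPs reported by top -H (4642 to 4645) in the returned dictionary shows which of the busy threads are inside the eviction code path.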

 

Is this expected?

The MongoDB version is 3.4.14.



 Comments   
Comment by Danny Hatcher (Inactive) [ 30/Nov/18 ]

Hello,

As I have not heard back from you and there does not appear to be a bug in the MongoDB server, I will now close this ticket.

Thank you,

Danny

Comment by Danny Hatcher (Inactive) [ 16/Nov/18 ]

Hello Zhang,

Looking at the server status output, the server has been running for 5 months, it is a member of a sharded cluster, and there is significant load passing through it. Either the server status has come from the wrong server or this server is performing as expected.

If you download a new MongoDB binary from our Download Center and run it on a server with no other traffic, I do not expect to see high CPU utilization. If you still do after following those steps, please upload the content of the diagnostic.data folder (from that new binary's dbpath) to this ticket.

Thank you very much,

Danny
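
For anyone following these steps, packaging the diagnostic.data folder for upload could look roughly like the sketch below. It assumes only that diagnostic.data sits directly under the dbpath, as described above; the function name `archive_diagnostic_data` and both paths in the usage comment are hypothetical.

```python
import os
import tarfile


def archive_diagnostic_data(dbpath, out_path):
    """Pack <dbpath>/diagnostic.data into a gzip-compressed tar archive."""
    src = os.path.join(dbpath, "diagnostic.data")
    if not os.path.isdir(src):
        raise FileNotFoundError(src)
    with tarfile.open(out_path, "w:gz") as tar:
        # Store entries under "diagnostic.data/" rather than the full path.
        tar.add(src, arcname="diagnostic.data")
    return out_path


# Hypothetical usage; substitute the dbpath of the freshly started mongod:
# archive_diagnostic_data("/data/db", "/tmp/diagnostic_data.tar.gz")
```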

Comment by z dd [ 16/Nov/18 ]

Hi,

I created two instances with different MongoDB versions, 3.2.18 and 3.4.14, and hit the same issue on both; at least the symptoms are the same. I am sorry for mixing up that information. "3.2.18.5" corresponds to the 3.2.18 MongoDB release.

Thank you very much

Zhang

 

Comment by Danny Hatcher (Inactive) [ 14/Nov/18 ]

Hello,

You mentioned that you are running 3.4.14. However, the server status output returns "3.2.18.5" which is not a released version of MongoDB. Can you please confirm what version of MongoDB you are running? If it is a custom build, are you able to check against a standard release of MongoDB to see if the issue still occurs?

Thank you very much,

Danny
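
The version check described in this comment can be approximated with a short sketch. It rests on an assumption that is not stated in the ticket: that a standard release string has exactly three numeric components, optionally followed by a release-candidate suffix such as "-rc1". The name `looks_like_standard_release` is invented for this example.

```python
import re

# Assumption: standard releases look like MAJOR.MINOR.PATCH, optionally
# followed by a release-candidate suffix such as "-rc1".
RELEASE_PATTERN = re.compile(r"^\d+\.\d+\.\d+(-rc\d+)?$")


def looks_like_standard_release(version):
    """Return True if the string matches the assumed release pattern."""
    return RELEASE_PATTERN.match(version) is not None


if __name__ == "__main__":
    for v in ("3.4.14", "3.2.18", "3.2.18.5"):
        print(v, looks_like_standard_release(v))
```

Under that assumption, the four-component string "3.2.18.5" from the server status output does not match, while "3.4.14" and "3.2.18" do.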

Comment by z dd [ 13/Nov/18 ]

From the serverStatus command output, the cache statistics look abnormal:

"cache" : {
"application threads page read from disk to cache count" : 9,
"application threads page read from disk to cache time (usecs)" : 12844,
"application threads page write from cache to disk count" : 15,
"application threads page write from cache to disk time (usecs)" : 304,
"bytes belonging to page images in the cache" : 24570827,
"bytes currently in the cache" : 3316502578,
"bytes not belonging to page images in the cache" : 3291931751,
"bytes read into cache" : 60683,
"bytes written from cache" : NumberLong("3524340341406"),
"checkpoint blocked page eviction" : 394332444207,
"eviction calls to get a page" : 766362080817,
"eviction calls to get a page found queue empty" : 3202541693,
"eviction calls to get a page found queue empty after locking" : 6627909703,
"eviction currently operating in aggressive mode" : 100,
"eviction empty score" : 100,
"eviction server candidate queue empty when topping up" : 6751346978,
"eviction server candidate queue not empty when topping up" : 1963877442,
"eviction server evicting pages" : 1071,
"eviction server slept, because we did not make progress with eviction" : 9055504912,
"eviction server unable to reach eviction goal" : 85586841,
"eviction state" : 15,
"eviction walks abandoned" : 844292,
"eviction worker thread active" : 0,
"eviction worker thread created" : 0,
"eviction worker thread evicting pages" : 750507350995,
"eviction worker thread removed" : 0,
"eviction worker thread stable number" : 0,

...

However, I cannot find a checkpoint thread in the pstack output.
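
To put the raw counters above in proportion, a few ratios can be computed from the quoted "cache" section. This is a sketch only: the function name `cache_ratios` and the choice of ratios are illustrative, the values in `CACHE` are copied from the output above, and the sketch does not decide what the proportions mean for WiredTiger eviction.

```python
def cache_ratios(cache):
    """Derive a few proportions from a serverStatus "cache" section."""
    total = cache["bytes currently in the cache"]
    page_images = cache["bytes belonging to page images in the cache"]
    other = cache["bytes not belonging to page images in the cache"]
    calls = cache["eviction calls to get a page"]
    empty = cache["eviction calls to get a page found queue empty"]
    return {
        "page_image_fraction": page_images / total,
        "non_page_image_fraction": other / total,
        "queue_empty_fraction": empty / calls,
    }


# Values copied from the serverStatus output quoted in this comment.
CACHE = {
    "bytes belonging to page images in the cache": 24570827,
    "bytes currently in the cache": 3316502578,
    "bytes not belonging to page images in the cache": 3291931751,
    "eviction calls to get a page": 766362080817,
    "eviction calls to get a page found queue empty": 3202541693,
}

if __name__ == "__main__":
    for name, value in cache_ratios(CACHE).items():
        print(f"{name}: {value:.4f}")
```

With the figures from this ticket, more than 99% of the bytes currently in the cache fall under "bytes not belonging to page images in the cache".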
