[SERVER-36271] saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } numYields:0 reslen: 155 locks:{} protocol:op_query 398387ms Created: 25/Jul/18  Updated: 27/Oct/23  Resolved: 02/Aug/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.4.9
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: shawn Assignee: Nick Brewer
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File pstack.log    
Participants:

 Description   

HI

One mongodb instance was killed by  oom killer,

After starting it, couldn't connect to it  by using mongo client.

Finding some logs  from  log:

2018-07-24T19:15:40.241+0800 I COMMAND [conn33] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } numYields:0 reslen:
155 locks:{} protocol:op_query 458411ms
2018-07-24T19:15:40.241+0800 I COMMAND [conn22] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } numYields:0 reslen:
155 locks:{} protocol:op_query 638498ms
2018-07-24T19:15:40.241+0800 I COMMAND [conn41] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } numYields:0 reslen:
155 locks:{} protocol:op_query 278339ms
2018-07-24T19:15:40.241+0800 I COMMAND [conn26] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } numYields:0 reslen:
155 locks:{} protocol:op_query 578470ms
2018-07-24T19:15:40.241+0800 I COMMAND [conn29] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } numYields:0 reslen:
155 locks:{} protocol:op_query 518439ms

 

mongodb  server version is  3.4.9



 Comments   
Comment by Nick Brewer [ 02/Aug/18 ]

From your responses, I do not see anything to indicate a bug in the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-user group.

-Nick

Comment by shawn [ 31/Jul/18 ]

I'm sure , it isn't  cause by not having enough  available memory.

When restart mongod ,  the memory  that  mongod used is released to OS.

And, when i restart mongod,  i won't connect to mongod  rapidly still,   mongodb monitor have a execption with  "i/o timeout".

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 3 0 38320 100060912 324388 26699388 0 0 233 2008 0 0 14 3 83 1 0
 3 1 38320 100000352 324392 26762752 0 0 0 65528 3762 1124 8 1 91 0 0
 2 0 38320 99940304 324392 26824864 0 0 0 61836 3807 1160 8 1 91 0 0
 2 0 38320 99880288 324392 26886628 0 0 0 60464 3903 1435 8 1 91 0 0

 

 

Comment by Nick Brewer [ 30/Jul/18 ]

shawn001 From what you're describing, it seems likely that the server still doesn't have enough available memory when it is brought back up, which is why the attempts to authenticate to it are taking so long, and are ultimately timing out. What does the memory utilization look like when the mongod is brought back up? The output of vmstat 1 10 would be useful.

Thanks,
Nick

Comment by shawn [ 26/Jul/18 ]

yes, these logs   just after the mongod  started.

when it  start, i connect to it, but  the connection was  blocked.

i found some lock info  from  pstack:

Thread 75 (Thread 0x7fa740b43700 (LWP 191319)):
#0 0x00007fa751ae598e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007fa753ff49b8 in mongo::CondVarLockGrantNotification::wait(mongo::Duration<std::ratio<1l, 1000l> >) ()
#2 0x00007fa753ff9a1e in mongo::LockerImpl<false>::lockComplete(mongo::ResourceId, mongo::LockMode, mongo::Duration<std::ratio<1l, 1000l> >, bool) ()
#3 0x00007fa753fe8c3e in mongo::Lock::ResourceLock::lock(mongo::LockMode) ()
#4 0x00007fa753fe8cbd in mongo::Lock::GlobalLock::_enqueue(mongo::LockMode, unsigned int) ()
#5 0x00007fa753fe8d18 in mongo::Lock::GlobalLock::GlobalLock(mongo::Locker*, mongo::LockMode, unsigned int, mongo::Lock::GlobalLock::EnqueueOnly) ()
#6 0x00007fa753fe8d58 in mongo::Lock::GlobalLock::GlobalLock(mongo::Locker*, mongo::LockMode, unsigned int) ()
#7 0x00007fa753fe8de7 in mongo::Lock::DBLock::DBLock(mongo::Locker*, mongo::StringData, mongo::LockMode) ()
#8 0x00007fa75400d130 in mongo::AutoGetDb::AutoGetDb(mongo::OperationContext*, mongo::StringData, mongo::LockMode) ()
#9 0x00007fa75400dc3a in mongo::AutoGetCollection::AutoGetCollection(mongo::OperationContext*, mongo::NamespaceString const&, mongo::LockMode, mongo::LockMode, mong
o::AutoGetCollection::ViewMode) ()
#10 0x00007fa75400e27b in mongo::AutoGetCollectionForRead::AutoGetCollectionForRead(mongo::OperationContext*, mongo::NamespaceString const&, mongo::AutoGetCollection
::ViewMode) ()
#11 0x00007fa75400e37c in mongo::AutoGetCollectionOrViewForRead::AutoGetCollectionOrViewForRead(mongo::OperationContext*, mongo::NamespaceString const&) ()
#12 0x00007fa753ea3318 in mongo::GlobalCursorIdCache::timeoutCursors(mongo::OperationContext*, int) ()
#13 0x00007fa753ed375e in mongo::ClientCursorMonitor::run() ()
#14 0x00007fa754a3d851 in mongo::BackgroundJob::jobBody() ()
#15 0x00007fa7557f4a00 in execute_native_thread_routine ()
#16 0x00007fa751ae19d1 in start_thread () from /lib64/libpthread.so.0
#17 0x00007fa75182eb6d in clone () from /lib64/libc.so.6

 

detail pstack info  file  was attached as follows.

pstack.log

 

 

Comment by Nick Brewer [ 25/Jul/18 ]

shawn001 Are these logs from right after the mongod is started? What does the memory availability on the machine look like when this occurs?

Thanks,
-Nick

Generated at Thu Feb 08 04:42:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.