[SERVER-11012] Mongo - collection gets stuck after long running client connection Created: 02/Oct/13  Updated: 10/Dec/14  Resolved: 19/Mar/14

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 2.2.4
Fix Version/s: None

Type: Bug Priority: Blocker - P1
Reporter: igor lasic Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux, VMWARE, NAS

Grails 1.3.5, gmongo 0.5.0, java-driver 2.11.1


Operating System: Linux
Steps To Reproduce:

Start a long-running transaction, then kill the web client.

Participants:

 Description   

Our application opens a long-running transaction that pulls 100k records out of MongoDB.

Generating a report in the application can take many minutes, depending on the size of the report.

If a user exits the Grails application without waiting for completion, the collection being queried gets "stuck" and MongoDB slows down significantly.

There is no indication in the mongo log that anything is awry.



 Comments   
Comment by Stennie Steneker (Inactive) [ 19/Mar/14 ]

Hi Igor,

Please be advised I'm now closing this issue as we do not have enough details to investigate the problem.

If you do have any further information that would help us reproduce this issue, please let us know.

Thanks,
Stephen

Comment by Daniel Pasette (Inactive) [ 11/Oct/13 ]

Can you run explain() on the query you sent and post the output?
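
For reference, the requested output could be gathered in the shell with db.reportDetail.find({ "reportSectionId" : ObjectId("523c6698ad8dae84334b090a") }).sort({ "_id" : 1 }).explain(). The sketch below shows one possible way to read such a result. It assumes the legacy explain format with the fields cursor, n, nscanned and scanAndOrder. The helper name summarizeExplain is hypothetical, and the sample values are invented for illustration (loosely based on the figures mentioned in this ticket), since no explain output was ever posted.

```javascript
// Hypothetical helper: summarize a legacy-format explain() document.
function summarizeExplain(explain) {
  var scanned = explain.nscanned || 0;
  var returned = explain.n || 0;
  return {
    cursor: explain.cursor,
    // Index entries or documents examined per document returned.
    scanRatio: returned > 0 ? scanned / returned : Infinity,
    // True when the server sorted results in memory instead of using index order.
    inMemorySort: explain.scanAndOrder === true
  };
}

// Invented sample values, NOT real output from the reporter's system.
var sampleExplain = {
  cursor: "BtreeCursor _id_",
  n: 100000,
  nscanned: 170000000,
  scanAndOrder: false
};

var summary = summarizeExplain(sampleExplain);
console.log(summary);
```

If a real explain() showed a cursor on the _id index together with a high scan ratio, that would suggest the server walked the _id index to satisfy the sort rather than using the reportSectionId index. A compound index on { "reportSectionId" : 1, "_id" : 1 } is a common remedy for that pattern, but whether it applies here is an assumption.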

Comment by igor lasic [ 09/Oct/13 ]

I had a long running query that was sorting on "_id".

There is an index on reportSectionId, and there are 170 million entries in the reportDetail collection (and therefore 170M _id entries).

My question is: why didn't the sort happen only on the entries selected by reportSectionId? Why did the scan have to look at more data?

"$query" : { "reportSectionId" : ObjectId("523c6698ad8dae84334b090a") },
"$orderby" : { "_id" : 1 }

Comment by Daniel Pasette (Inactive) [ 09/Oct/13 ]

The currentOp output shows that the query is still running, and has taken 518 seconds. I can't think what impact exiting the client application would have on the database. Do you have any more information?
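
A minimal sketch of how the secs_running check could be automated, assuming a plain object shaped like the db.currentOp() output posted in this ticket. The helper name findLongRunningOps and the 300-second threshold are hypothetical, and the sample is trimmed from the posted output.

```javascript
// Hypothetical helper: list active operations running at least minSeconds.
// `currentOp` is a plain object shaped like the db.currentOp() result.
function findLongRunningOps(currentOp, minSeconds) {
  return (currentOp.inprog || [])
    .filter(function (op) {
      return op.active === true && (op.secs_running || 0) >= minSeconds;
    })
    .map(function (op) {
      return { opid: op.opid, ns: op.ns, op: op.op, secsRunning: op.secs_running };
    });
}

// Sample trimmed from the currentOp output posted in this ticket.
var sample = {
  inprog: [
    { opid: 107117, active: true, secs_running: 518, op: "query", ns: "ihm.reportDetail" }
  ]
};

var slow = findLongRunningOps(sample, 300);
console.log(slow);
```

In the mongo shell the same helper could be called as findLongRunningOps(db.currentOp(), 300). If an operation turns out to be orphaned by a client that has gone away, db.killOp(opid) is the usual way to terminate it, after confirming that nothing still depends on the result.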

Comment by igor lasic [ 03/Oct/13 ]

> db.currentOp()
{
    "inprog" : [
        {
            "opid" : 107117,
            "active" : true,
            "secs_running" : 518,
            "op" : "query",
            "ns" : "ihm.reportDetail",
            "query" : {
                "$query" : { "reportSectionId" : ObjectId("523c6698ad8dae84334b090a") },
                "$orderby" : { "_id" : 1 },
                "$readPreference" : { "mode" : "primary" }
            },
            "client" : "10.84.150.155:57622",
            "desc" : "conn10",
            "threadId" : "0x7f4693e2d700",
            "connectionId" : 10,
            "locks" : { "^" : "r", "^ihm" : "R" },
            "waitingForLock" : false,
            "numYields" : 8638,
            "lockStats" : {
                "timeLockedMicros" : { "r" : NumberLong(1016557841), "w" : NumberLong(0) },
                "timeAcquiringMicros" : { "r" : NumberLong(518868123), "w" : NumberLong(0) }
            }
        },

Comment by Daniel Pasette (Inactive) [ 03/Oct/13 ]

Can you post db.currentOp() during an incident?

Generated at Thu Feb 08 03:24:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.