[SERVER-19360] Out of connections when building an index
Created: 10/Jul/15
Updated: 29/Oct/15
Resolved: 29/Oct/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 3.0.4
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Igor Canadi
Assignee: Sam Kleinman (Inactive)
Resolution: Incomplete
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-19365 Allow one extra connection from local... Closed
Operating System: ALL
Participants:

 Description   

We don't have much data on this one, or any reproduction steps. However, we're looking for insight into where to look when debugging. Maybe you have seen something similar before.

We had an issue with one of our primary nodes where a background index build caused mongod to run out of connections. In the log we saw lines like "refusing connection because we already have 20.000 connections". We couldn't even connect to mongod via a shell, which made it much harder to debug (feature request: make sure we can access mongod even when the maximum number of connections is reached. Maybe always allow one extra connection from localhost?).

Here's an interesting part from the mongodb.log: https://gist.github.com/igorcanadi/9182e6f1a49af989642f

It looks like a bunch of operations spent a lot of time waiting for an exclusive write lock. Is this expected?

Any insight would be helpful.



 Comments   
Comment by Ramon Fernandez Marina [ 29/Oct/15 ]

Thanks igor, closing ticket. If you see this again and are able to gather more data just ping us here and we'll reopen.

Comment by Igor Canadi [ 28/Oct/15 ]

Unfortunately (or, well, fortunately), this issue hasn't happened since. SERVER-19365 would still be good to address, but I think we can close this ticket.

Comment by Ramon Fernandez Marina [ 28/Oct/15 ]

Hi igor, do you have any more data that can help us diagnose this ticket?

Thanks,
Ramón.

Comment by Igor Canadi [ 10/Jul/15 ]

Thanks Sam. I created SERVER-19365 to track the extra connection.

We'll get back to you with more data soon.

Comment by Sam Kleinman (Inactive) [ 10/Jul/15 ]

We should track the request for the ability to open one more connection on localhost separately from this issue.

It doesn't look like the operations should back up on the index builds like this. A few questions:

  1. Do you have insight into what kind of operations are waiting? Are they operations that you can time out more aggressively?
  2. Are you building TTL indexes in the background? TTL deletions will begin before the background index build completes, and there may be some interaction there.
  3. Is this happening consistently for these kinds of index builds, or is there something special about the index builds or MongoDB instances where these errors are happening?

Sorry for the frustration, please keep us posted if you find more information or a possible reproduction.

Regards,
sam
