[SERVER-54900] Blocking networking calls can delay sync-source resolution indefinitely Created: 02/Mar/21 Updated: 29/Oct/23 Resolved: 21/Dec/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.3.0, 4.4.15, 5.0.15 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | George Wangensteen | Assignee: | Amirsaman Memaripour |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | servicearch-q4-2021, servicearch-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||
| Backport Requested: |
v5.0, v4.4
|
||||||||||||||||||||||||||||||||
| Steps To Reproduce: | [On v4.4.1] Just starting 3 mongod's on localhost with different ports, configuring them into a replica set with chaining disabled, and sending SIGSTOP to the primary reproduces this issue (specifically, the second case) frequently on my machine (the unstopped node that remains as a secondary continues to list and use the "dead" node as it's sync source until it rejoins the set). |
||||||||||||||||||||||||||||||||
| Sprint: | Service Arch 2021-11-15, Service Arch 2021-11-22, Service Arch 2021-12-13, Service Arch 2022-1-10 | ||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||
| Linked BF Score: | 19 | ||||||||||||||||||||||||||||||||
| Story Points: | 5 | ||||||||||||||||||||||||||||||||
| Description |
|
When the oplog fetcher cannot retrieve a batch of oplog documents from it's sync source, it will first try and create a new cursor on that sync source before choosing a new one (see the code here: https://github.com/mongodb/mongo/blob/ad91a93a5a31e175f5cbf8c69561e788bbc55ce1/src/mongo/db/repl/oplog_fetcher.cpp#L493) In doing so, it will first attempt to re-establish the connection to the sync source using it's existing DBClientConnection. But in certain edge-cases, this connection-reestablishment process can make synchronous/blocking networking calls without correctly configured timeouts. In cases where the remote host the networking call is trying to communicate with is completely unresponsive, this can block the oplog fetching thread indefinitely, preventing the selection of a new sync-source. In this case, it is possible for a node to continue to use an unavailable node as its "sync-source" until the unavailable node re-contacts the set, allowing the blocking networking call to finish and the oplog fetcher to shut down.
I found two cases so far where these calls seem to be made: In the first case, we perform a blocking SSL handshake without a timeout configured on the socket; if the remote node becomes unavailable between establishing the connection and the SSL handshake, we'll see this issue. In DBClientConnection::_checkConnection(), between the "Trying to reconnect" log line and the "Reconnect attempt failed" log line, the thread attempting to re-establish connection to the sync source (node 01) calls DBClientConnection::connect() (see the source code here). A second similar case I've been able to reproduce locally: When the oplog fetcher creates a new cursor here: https://github.com/mongodb/mongo/blob/ad91a93a5a31e175f5cbf8c69561e788bbc55ce1/src/mongo/db/repl/oplog_fetcher.cpp#L647 , the old one falls out of scope and is destroyed. The call stack below ensues (I've abbreviated parts):
In this code, the ScopedDbConnection constructor here: https://github.com/mongodb/mongo/blob/ad91a93a5a31e175f5cbf8c69561e788bbc55ce1/src/mongo/client/connpool.h#L471 constructs a DbClientConnection with a socket timeout of 0, which translates here: https://github.com/mongodb/mongo/blob/ad91a93a5a31e175f5cbf8c69561e788bbc55ce1/src/mongo/client/dbclient_connection.cpp#L610-L611 into the DBClientConnection storing the value for the socket timeout as boost::none. When it sets that timeout on the ASIOSession here: https://github.com/mongodb/mongo/blob/ad91a93a5a31e175f5cbf8c69561e788bbc55ce1/src/mongo/transport/session_asio.h#L204, the ASIOSession stores boost::none as the timeout. This means that when ASIOSession::sourceMessage is called, no timeout is set on the socket before the read is performed (see https://github.com/mongodb/mongo/blob/ad91a93a5a31e175f5cbf8c69561e788bbc55ce1/src/mongo/transport/session_asio.h#L302). This allows the read to block indefinitely until the node comes back online. I've repro'd the issue on 4.4.1, but we should also audit the main branch for this issue – it doesn't look like the relevant code has changed.
Next steps: Write a test to reproduce the issue described in this ticket Determine the level at which level to solve this issue
|
| Comments |
| Comment by Jack Wearden [ 25/Jan/23 ] |
|
Thanks for adding the backport requests. Is it too late to also request backport to 4.2? |
| Comment by Githook User [ 05/Dec/22 ] |
|
Author: {'name': 'Amirsaman Memaripour', 'email': 'amirsaman.memaripour@mongodb.com', 'username': 'samanca'}Message: (cherry picked from commit 263c0631c8001a8cdb42aff720b8a49a621754dd) |
| Comment by Adam Rayner [ 16/Sep/22 ] |
|
Requesting backport to 5.0 - this was backported to 4.4 and is present in 5.3+, so should probably also be backported to 5.0. |
| Comment by Githook User [ 08/Jun/22 ] |
|
Author: {'name': 'Amirsaman Memaripour', 'email': 'amirsaman.memaripour@mongodb.com', 'username': 'samanca'}Message: (cherry picked from commit 263c0631c8001a8cdb42aff720b8a49a621754dd) |
| Comment by Githook User [ 21/Dec/21 ] |
|
Author: {'name': 'Amirsaman Memaripour', 'email': 'amirsaman.memaripour@mongodb.com', 'username': 'samanca'}Message: |