[SERVER-8189] Unstable connectivity in replica sets environment Created: 16/Jan/13 Updated: 15/Feb/13 Resolved: 22/Jan/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.2.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Gusev Petr | Assignee: | Tad Marshall |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Debian: Linux elba-mongo3 2.6.32-5-amd64 #1 SMP Sun May 6 04:00:17 UTC 2012 x86_64 GNU/Linux |
||
| Operating System: | Linux | ||||||||
| Participants: | |||||||||
| Description |
|
Hello! We have a production setup of three nodes in a replica set: elba-mongo1 on one rack (the primary) and elba-mongo2 and elba-mongo3 on another (secondaries).
Wed Jan 16 11:59:13 [rsBackgroundSync] replSet syncing to: elba-mongo1:27017
After this point there is an election of a new primary. elba-mongo1 can be pinged successfully from both elba-mongo2 and elba-mongo3, but we can't connect to the elba-mongo1 instance from the elba-mongo1 machine using the mongo console; it just hangs on "connecting to":
root@elba-mongo1:~# telnet localhost 27017 |
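The telnet check above only shows whether the port accepts a TCP connection; it says nothing about whether mongod then answers. A minimal Python sketch of that TCP-level probe follows, assuming only the standard library. The function name `tcp_connect_seconds` and the 5-second default timeout are illustrative choices, not anything taken from the ticket.

```python
import socket
import time


def tcp_connect_seconds(host, port, timeout=5.0):
    """Return the seconds needed to complete a TCP handshake, or None on failure.

    This measures only what telnet measures: that something accepts the
    connection. It does not send any MongoDB protocol traffic.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host name and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return None
```

Calling `tcp_connect_seconds("elba-mongo1", 27017)` from each node gives a number to compare with the shell's connect time. A fast handshake combined with a slow shell suggests the delay lies after the TCP accept rather than in basic reachability.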
| Comments |
| Comment by Tad Marshall [ 22/Jan/13 ] |
|
Hi Dmitry, I had assumed that actually connecting from the shell in order to do something was a goal, but your timing approach is interesting as well. I'm glad that you're up-and-running now. I'll close this ticket, and thanks for letting us know the resolution! Tad |
| Comment by Tad Marshall [ 22/Jan/13 ] |
|
Hi Gusev, Thanks for letting us know what the problem was. I'm sorry that I didn't latch onto NUMA myself, since it is right there in the log you posted. Since you noted connectivity problems and two primaries, I didn't think to look for alternate explanations. Version 2.4 will display any startup warnings from the server when connecting to it from the mongo shell, which will help to raise the visibility of issues such as NUMA. A proposal I made in Tad |
| Comment by Dmitry Karmazin [ 22/Jan/13 ] |
|
Oh, yeah... That was a typo in the command. My mistake. As for omitting the "shell" option: I was just trying to measure connect time by running three commands in a row (one for each node), so I thought it didn't matter much whether the client stayed connected. Nevertheless, we installed numactl last night. Mongod now starts without warnings and works without any issues. |
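For readers who hit the same startup warning: mongod's NUMA warning suggests launching the server as `numactl --interleave=all mongod [other options]`. The Python sketch below counts the NUMA nodes that Linux exposes under `/sys/devices/system/node` and builds such a command line. Both function names are hypothetical, and the mongod arguments in the usage note are placeholders rather than the reporters' actual configuration.

```python
import os
import re


def count_numa_nodes(sysfs_dir="/sys/devices/system/node"):
    """Count the nodeN entries in sysfs; return 0 if the directory is absent."""
    try:
        entries = os.listdir(sysfs_dir)
    except OSError:
        return 0
    # Only names like node0, node1, ... are NUMA nodes; other entries are
    # bookkeeping files such as "possible" or "online".
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))


def wrap_with_numactl(mongod_argv):
    """Prefix a mongod command line so memory is interleaved across nodes."""
    return ["numactl", "--interleave=all"] + list(mongod_argv)
```

For example, `wrap_with_numactl(["mongod", "--replSet", "rs0"])` returns the argument list to start instead of plain mongod, which only matters when `count_numa_nodes()` reports more than one node.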
| Comment by Gusev Petr [ 22/Jan/13 ] |
|
As it turns out, the root of the problem was NUMA. We didn't pay much attention to it initially because the symptoms looked different from what has been described here http://www.wireclub.com/development/TqnkQwQ8CxUYTVT90/read. But this article |
| Comment by Tad Marshall [ 21/Jan/13 ] |
|
If this is how you spelled "prompt" (sic. "propmt") then you just set a new variable and didn't change the actual prompt. Changing the prompt to a fixed string is intended to sidestep any issues related to prompt negotiation. Setting the "prompt" variable will accomplish this. Setting "propmt" just sets a variable; not one that means anything to the shell. You omitted the "- If you have a working connection from elba-mongo2 to elba-mongo1, you should be able to use serverStatus and currentOp commands to see what state it is in. Are your ulimit settings sufficient to allow the number of connections you are trying to have? |
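On the ulimit question: each client connection consumes a file descriptor, so the open-files limit caps the number of connections. The sketch below reads that limit with Python's `resource` module. It reports the limit of the process running the script, so it is only a proxy unless it runs in the same environment that starts mongod; on Linux the limits of an already running process can be read from `/proc/<pid>/limits`. The function names and the `reserve` margin are illustrative assumptions.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_hold_connections(expected_connections, reserve=100):
    """Check the soft limit against an expected connection count.

    `reserve` is a guessed allowance for descriptors used by data files,
    logs and internal sockets; adjust it to the deployment.
    """
    soft, _hard = open_file_limits()
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= expected_connections + reserve
```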
| Comment by Dmitry Karmazin [ 21/Jan/13 ] |
|
Hi Tad! I'm another of Petr's co-workers. Mongod continues to accept new connections from both local and remote nodes (proven by telnet), and there is no packet loss while the nodes ping each other, but
root@elba-mongo1:~# time mongo --eval "propmt='>'" --host elba-mongo1
real 1m4.098s
but it can connect to the second node from the same host:
root@elba-mongo1:~# time mongo --eval "propmt='>'" --host elba-mongo2
real 0m0.028s
At the same time, if I leave a mongo client on elba-mongo2 connected to elba-mongo1, it can communicate with the server, while new clients from other nodes (including localhost) can't connect. Here's rs.status() from elba-mongo1 at that moment:
rs0:PRIMARY> rs.status() , , { "_id" : 2, "name" : "elba-mongo3:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 9145, "optime" : Timestamp(1358762308000, 7), "optimeDate" : ISODate("2013-01-21T09:58:28Z"), "lastHeartbeat" : ISODate("2013-01-21T09:58:29Z"), "pingMs" : 0 } ],
I noticed that the mongod process is often in uninterruptible sleep state:
I'm attaching strace output for mongod from elba-mongo1. It starts slightly before the last heartbeat from elba-mongo1 was received by elba-mongo2 and lasts until the mongo client connects to elba-mongo1. I can also see from the logs that mongod is running with NUMA enabled. Could that be the cause of this issue? |
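The observation that mongod is often in uninterruptible sleep can be quantified. On Linux the scheduler state is the field after the command name in `/proc/<pid>/stat`, and `D` means uninterruptible sleep. The sketch below samples that field; the function names, sample count and interval are assumptions for illustration, and the pid has to be supplied by the caller.

```python
import os
import time


def parse_stat_state(stat_line):
    """Extract the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the last ')'.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid, proc_root="/proc"):
    """Read the current state letter of a process (Linux only)."""
    with open(os.path.join(proc_root, str(pid), "stat")) as fh:
        return parse_stat_state(fh.read())


def fraction_in_state(pid, state="D", samples=20, interval=0.05):
    """Sample the process state and return the share of samples in `state`."""
    hits = 0
    for _ in range(samples):
        if process_state(pid) == state:
            hits += 1
        time.sleep(interval)
    return hits / samples
```

A high value from `fraction_in_state(mongod_pid)` during a hang would support the reading that the process is blocked in the kernel rather than waiting on the network.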
| Comment by Tad Marshall [ 17/Jan/13 ] |
|
Hi Renat, The cases where a replica set has two primaries are caused by a network partition, particularly an asymmetrical partition where node 1 can see node 2 but node 2 cannot see node 1. As you saw, these conditions are temporary, because eventually the original primary will see that there is no longer a majority and will step down. It may require more exhaustive tests of your network than you have performed so far to narrow down the problem. Can every node ping every other node continuously while you are seeing these problems? Are the pings equally reliable with large packets? We have been able to reproduce issues similar to yours by simulating delays on the network that exceed the heartbeat frequency and we are looking into ways to increase reliability in the presence of network problems. But we are unable to see these issues on a fully functional network. Your logs show replication cycles that begin, proceed for a while and then fail with socket errors, and do this over and over again. It seems that something is wrong at the network layer; this could be a configuration problem or an intermittent segment, router or switch, a network card or interface that sometimes corrupts data or almost anything that prevents packets from being sent and received error-free between all nodes in a timely manner. Tad |
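The asymmetric partition described above can be illustrated with a toy model. This is a simplified sketch of the majority rule only, not MongoDB's election code: visibility is a set of directed `(observer, observed)` pairs, and a node counts itself plus every member it can see. The function name and the data layout are assumptions.

```python
def sees_majority(node, can_see, members):
    """True when `node` can reach more than half of the set, counting itself."""
    visible = 1 + sum(
        1 for other in members
        if other != node and (node, other) in can_see
    )
    return visible > len(members) // 2


MEMBERS = ["elba-mongo1", "elba-mongo2", "elba-mongo3"]

# Asymmetric case: node 1 sees nodes 2 and 3, but they do not see node 1.
ASYMMETRIC = {
    ("elba-mongo1", "elba-mongo2"),
    ("elba-mongo1", "elba-mongo3"),
    ("elba-mongo2", "elba-mongo3"),
    ("elba-mongo3", "elba-mongo2"),
}
```

With `ASYMMETRIC`, every node passes the check: elba-mongo1 has no reason to step down, while elba-mongo2 and elba-mongo3 together form a majority that can elect a new primary. In this model, that is how two primaries can coexist until visibility changes.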
| Comment by Renat Khayretdinov [ 17/Jan/13 ] |
|
(I'm Petr's co-worker.) Today we observed another issue. We had elba-mongo1 as PRIMARY and elba-mongo2 and elba-mongo3 as SECONDARY. Suddenly the second and third nodes stopped seeing the first instance (i.e. rs.status() reported it as not reachable/healthy). But the first continued to send heartbeats to and receive them from the second and third, and kept its PRIMARY state! After a while the second became primary and the replica set began to accept client requests. This situation (elba-mongo1 sees the others but the others don't see it, and there are two primaries) continued for several minutes. There was some further confusion with elections, which you can see in the logs (mongo1.log, etc.). It continued until I restarted the mongo instance on elba-mongo1. So it is unlikely to be a network problem. It is probably an issue with the heartbeat listener. |
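The state described above can be spotted mechanically by comparing each node's view of the set. The sketch below works on a list of member documents shaped like the entries rs.status() prints, using the `name`, `health` and `stateStr` fields. How the list is obtained from each node is left out, the function names are hypothetical, and a missing `health` field is treated as healthy, which is an assumption.

```python
def primaries(members):
    """Names of all members that this view reports as PRIMARY."""
    return [m["name"] for m in members if m.get("stateStr") == "PRIMARY"]


def unhealthy(members):
    """Names of members whose health flag is not 1 in this view."""
    # A member without a "health" field is assumed healthy here.
    return [m["name"] for m in members if m.get("health", 1) != 1]
```

More than one name from `primaries()` in a single view, or different answers from different nodes, matches the two-primary situation reported in this comment.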
| Comment by Gusev Petr [ 16/Jan/13 ] |
|
Hi Tad 1) I will try your suggestion about the mongo shell next time we experience the problem. Usually this happens under daily load. |
| Comment by Tad Marshall [ 16/Jan/13 ] |
|
Hi Gusev, Your log file shows a number of rollback attempts, some of which succeed but most of which fail. It looks like there were network problems during this period; several like "exception: 10278 dbclient error communicating with server: elba-mongo2:27017" and like "replSet info elba-mongo2:27017 is down (or slow to respond): socket exception [SEND_ERROR] for 192.168.116.150:27017". The mongo shell attempts some communication with the server to determine the prompt to display (e.g. "PRIMARY> " or "SECONDARY> ") and this may be where the shell is appearing to hang. Can you try running the shell with a command line to disable this prompt calculation and see if this enables you to connect? Try this command at the bash prompt:
This will force the prompt to be ">>> " (you can substitute anything you want), which will enable you to see if you can connect to the server even if the prompt negotiation is not working for some reason due to the state of the server. (The "--eval" part passes a JavaScript command to the shell, "prompt='text' " sets the "prompt" variable and "--shell" tells the shell not to exit after executing the JavaScript). Tad |
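The command line itself is not preserved in this export. Based only on the three parts named above (`--eval`, the `prompt` variable, and `--shell`), a plausible reconstruction can be assembled as follows. This is a hedged sketch and not the original text of the comment: the function name is hypothetical, the `--host` argument is added for illustration, and the prompt is quoted with double quotes via `json.dumps` to keep the shell quoting simple.

```python
import json
import shlex


def fixed_prompt_command(host, prompt=">>> "):
    """Build a mongo shell command line that sets a fixed prompt string."""
    # json.dumps produces a double-quoted string literal that is valid JavaScript.
    script = "prompt=" + json.dumps(prompt)
    return ["mongo", "--host", host, "--eval", script, "--shell"]


def as_shell_line(argv):
    """Render an argument list as a single, safely quoted shell line."""
    return shlex.join(argv)
```

`as_shell_line(fixed_prompt_command("elba-mongo1"))` yields `mongo --host elba-mongo1 --eval 'prompt=">>> "' --shell`. Note that the variable must be spelled `prompt`; any other name creates an unrelated variable and leaves the prompt calculation in place.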