Loading...

Type: Bug
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 3.6.1
Component/s: JavaScript, Replication
Labels:
None

Operating System:
ALL
Steps To Reproduce:

Hide

Please see the description and the doc here: https://github.com/rystsov/perseus/tree/master/mongodb

Show
Please see the description and the doc here: https://github.com/rystsov/perseus/tree/master/mongodb
Case:
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Hi folks,

I was testing how a 3-nodes cluster of MongoDB behaves (from a client perspective) when one of the nodes is isolated from the cluster and observed an anomaly.

TL;DR: When a node is isolated, all clients hang for almost two minutes (unexpected behaviour, too long downtime). A client running on the isolated node continues being down (expected behaviour), but the clients on healthy nodes become unexpectedly slow (RPS drop to less than one operation per second from 282 before the isolation).

Repo with docker+javascript scripts to reproduce the problem: https://github.com/rystsov/perseus/tree/master/mongodb

Testing environment

OS: Ubuntu 17.10
MongoDB: 3.6.1
NodeJS: v8.9.4
MongoDB driver: NPM: "mongodb": 3.0.1

Three nodes (mongo1, mongo2, mongo3) form a MongoDB cluster running mongod in a single replica set. Besides mongod, each node has a NodeJS app which:

in a loop reads a value from the cluster, increments it and writes back
exposes API to fetch number of successful and failed iterations

Also there is a fourth node (client1) which fetches the stat and dumps it every second in the following format:

#legend: time|mongo1|mongo2|mongo3|mongo1:err|mongo2:err|mongo3:err
1	9	21	24	0	0	0	2018/01/14 10:43:02
2	28	50	52	0	0	0	2018/01/14 10:43:03
3	28	48	43	0	0	0	2018/01/14 10:43:04

The first column is the number of passed seconds since the beginning of the experiment; the next three columns contain counter of successful iterations of the clients located on each node per last second, following three columns are counters of errors and the last one is time.

Client (db/remote-tester)

Client is a NodeJS app using the following connection URL: mongodb://mongo1,mongo2,mongo3/?replicaSet=rs0&autoReconnect=false&socketTimeoutMS=10000&connectTimeoutMS=10000.

Each write operation has writeConcern "majority" while each read operation has readConcern "linearizable".

When a client runs into a problem it closes the current connection and opens a new one.

Each client uses a different key to avoid collisions.

Isolation

The network isolation is simulated with iptables:

iptables -A INPUT -s p.ee.r.ip -j DROP
iptables -A OUTPUT -d p.ee.r.ip -j DROP

During the isolation, the traffic is blocked between mongo1 and (mongo2 & mongo3), but client1 can see all nodes.

Results

Let's take a look on the logs of the aggregating client (client1.log):

#legend: time|mongo1|mongo2|mongo3|mongo1:err|mongo2:err|mongo3:err
1	9	21	24	0	0	0	2018/01/14 10:43:02
2	28	50	52	0	0	0	2018/01/14 10:43:03
3	28	48	43	0	0	0	2018/01/14 10:43:04
...
10	25	55	53	0	0	0	2018/01/14 10:43:11
11	14	55	60	0	0	0	2018/01/14 10:43:12
# isolating mongo1
# isolated mongo1
12	31	47	54	0	0	0	2018/01/14 10:43:13
13	8	4	3	0	0	0	2018/01/14 10:43:14
14	0	0	0	0	0	0	2018/01/14 10:43:15
...
128	0	0	0	0	0	0	2018/01/14 10:45:09
129	0	0	0	0	0	0	2018/01/14 10:45:10
130	0	0	0	0	0	0	2018/01/14 10:45:11
131	0	1	1	0	0	0	2018/01/14 10:45:12
132	0	0	0	0	0	0	2018/01/14 10:45:13
133	0	0	0	0	0	0	2018/01/14 10:45:14
134	0	0	0	0	0	0	2018/01/14 10:45:15
135	0	1	1	0	0	0	2018/01/14 10:45:16
...
192	0	0	0	0	0	0	2018/01/14 10:46:14
193	0	0	0	0	0	0	2018/01/14 10:46:15
194	0	1	1	0	0	0	2018/01/14 10:46:16
195	0	0	0	0	0	0	2018/01/14 10:46:17
# rejoining mongo1
196	0	0	0	0	0	0	2018/01/14 10:46:18
# rejoined mongo1
197	0	0	0	0	0	0	2018/01/14 10:46:19
198	0	1	1	0	0	0	2018/01/14 10:46:20
199	0	0	0	0	0	0	2018/01/14 10:46:21
200	0	0	0	0	0	0	2018/01/14 10:46:22
201	0	0	0	0	0	0	2018/01/14 10:46:23
202	0	1	1	0	0	0	2018/01/14 10:46:24
...
230	0	1	1	0	0	0	2018/01/14 10:46:52
231	0	0	0	0	0	0	2018/01/14 10:46:53
232	0	1	1	0	0	0	2018/01/14 10:46:54
233	0	1	1	0	0	0	2018/01/14 10:46:55
234	0	0	0	0	0	0	2018/01/14 10:46:56
235	0	52	39	0	0	0	2018/01/14 10:46:57
236	0	48	42	0	0	0	2018/01/14 10:46:58
...
305	0	57	43	0	0	0	2018/01/14 10:48:07
306	0	56	39	0	0	0	2018/01/14 10:48:08

As you can see, clients on mongo1, mongo2, mongo3 stopped doing progress after the isolation. 117 seconds later mongo2 and mongo3 "recovered" but they became very slow.

When the connection was restored it took 38s until mongo2, mongo3 recovered to their previous pace, but mongo1 remained silent (unexpected behaviour).

Logs

Please see the attached "slow-full.tar.gz" archive for logs.

client1.log

Log of the aggregating client. Its format was described above.

mongo(1|2|3).tester.log

Log of the actual client. In happy case it looks like:

2018/01/14 10:43:01	reading(0)
	2018/01/14 10:43:01	connecting
	2018/01/14 10:43:01	ok
2018/01/14 10:43:01	creating(0)
2018/01/14 10:43:01	ok(0)
2018/01/14 10:43:01	reading(1)
2018/01/14 10:43:01	stat
2018/01/14 10:43:01	updating(1)
2018/01/14 10:43:01	ok(1)

Each successful iteration is represented by a sequence of lines (reading, creating|updating, ok). During a connection you'll see "connecting" and "ok" records. "stat" record corresponds to the fetch request from the aggregating client and helps to understand how long the previous operation hangs.

If there are errors during the execution, then they also are dumped into this log like:

"MongoNetworkError: connection 0 to mongo1:27017 timed out"
"MongoNetworkError: failed to connect to server mongo3:27017 on first connect"

mongo(1|2|3).log

Log of the mongod processes.

Why do I think it's more a bug rather than a misconfiguration?

The bug isn't always reproducible: sometimes the pace of mongo2 and mongo3 doesn't degrade and when the connection is restored the mongo1's client starts working. See the attached "good-full.tar.gz".

I hope this report will help you!

- - Sort By Name
  - Sort By Date
  - Ascending
  - Descending
  - Thumbnails
  - List
  - Download All

slow-full.tar.gz
109 kB
Jan 15 2018 09:51:56 AM UTC
good-full.tar.gz
105 kB
Jan 15 2018 09:52:02 AM UTC
slow-full.data.tar.gz
14.90 MB
Jan 16 2018 12:45:14 AM UTC
good-full.data.tar.gz
12.63 MB
Jan 16 2018 01:03:30 AM UTC

is duplicated by

SERVER-35200 Speed up failure detection in the OplogFetcher during steady state replication

Closed

is related to

NODE-1363 Unable to establish a connection to a MongoDB cluster after a connection between Primary and its peers was disrupted and then fixed

Closed

SERVER-35200 Speed up failure detection in the OplogFetcher during steady state replication

Closed

related to

SERVER-33026 SyncSourceFeedback request doesn't have a timeout

Closed

SERVER-41187 Majority committed replication lag spikes after an election

Closed

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

PagerDuty