Type: New Feature
Resolution: Works as Designed
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Replication
Labels: None

Confidence Status: None
Work Order: 3
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Hi,

Problem:

We just did a rolling upgrade from 3.0.14 to 3.4.3 and noticed some strange behavior during the initial sync. All writes take 5 seconds to complete, which makes the app very slow and leads to timeouts and, eventually, exhausted connection pools.

Our set-up:

The app is a Play-Scala application using the ReactiveMongo driver (0.12.2). We use w: majority and wtimeout: 5000 as our write concern.
The DB is a replica set of three data-bearing nodes, with 2 arbiters.
Everything is hosted on AWS EC2, using EBS volumes.

It's a pretty big database; the initial sync takes around 1.5 hours, including index creation.

Steps

We add the new instances to the replica set (on top of the three other nodes), and they start to do an initial sync (from scratch, so not from an AWS snapshot).
Once the new node is in the cluster, all writes start to take exactly 5 seconds (the wtimeout value). We don't see much load on the app or on the primary though.
After digging into the problem, we reconfigured the new nodes with priority: 0 and votes: 0, and from that moment on the whole app worked properly again.
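The workaround in the last step can be sketched as a small helper that prepares the reconfigured replica set document. This is only an illustration: the set name, version number, and host names below are hypothetical, and the resulting document would still have to be applied on the primary (for example with `rs.reconfig()` in the mongo shell).

```python
import copy


def demote_members(config, hosts):
    """Return a copy of a replica set config in which the given hosts
    have priority 0 and votes 0, with the config version incremented."""
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        if member["host"] in hosts:
            member["priority"] = 0  # can never become primary
            member["votes"] = 0     # does not count as a voting member
    new_config["version"] = config["version"] + 1
    return new_config


# Hypothetical config: three existing data nodes, two arbiters, one new node.
current = {
    "_id": "rs0",
    "version": 7,
    "members": [
        {"_id": 0, "host": "db1.example.net:27017"},
        {"_id": 1, "host": "db2.example.net:27017"},
        {"_id": 2, "host": "db3.example.net:27017"},
        {"_id": 3, "host": "arb1.example.net:27017", "arbiterOnly": True},
        {"_id": 4, "host": "arb2.example.net:27017", "arbiterOnly": True},
        {"_id": 5, "host": "db4.example.net:27017"},  # new node, still syncing
    ],
}

updated = demote_members(current, {"db4.example.net:27017"})
```

The helper leaves the original document untouched, so the old and new configurations can be compared before anything is applied.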

So we have a kind of workaround, but having to use it seems undesirable. It also looks a bit weird to me: since the rest of the cluster is quick, there is no need to wait for the new instances to get a majority at all, right? 3 nodes out of 5 respond quickly, so that should be enough to return.
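One possible explanation, consistent with the "Works as Designed" resolution, is that arbiters and the new syncing nodes all count as voting members even though none of them can acknowledge a write yet. The sketch below works through that arithmetic. It assumes the write majority is the smaller of the voting majority and the number of data-bearing voting members; that formula and the count of two new nodes are assumptions for illustration, not something stated in this ticket.

```python
def write_majority(data_voters, arbiters):
    """Number of acknowledgements assumed necessary for w: majority.

    data_voters: data-bearing members with votes > 0
    arbiters:    voting members that hold no data
    """
    voters = data_voters + arbiters
    vote_majority = voters // 2 + 1
    # Arbiters cannot acknowledge writes, so cap at the data-bearing voters.
    return min(vote_majority, data_voters)


def majority_reachable(caught_up, data_voters, arbiters):
    """True if enough caught-up data nodes exist to satisfy w: majority."""
    return caught_up >= write_majority(data_voters, arbiters)


# Original set: 3 data nodes + 2 arbiters, all 3 data nodes caught up.
before = majority_reachable(caught_up=3, data_voters=3, arbiters=2)   # True

# Two new voting nodes in initial sync: 5 data voters, only 3 caught up.
during = majority_reachable(caught_up=3, data_voters=5, arbiters=2)   # False

# Same new nodes with votes: 0, so they no longer count as voters.
worked_around = majority_reachable(caught_up=3, data_voters=3, arbiters=2)  # True
```

Under these assumptions the primary would need 4 acknowledgements while only 3 nodes can give one, so every write would wait out the full wtimeout, which matches the 5-second delay described above.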

Also, during the initial sync, I don't see the need for writes to be sent to the new node, since it is far from caught up and will replay these writes afterwards anyway.

Is it possible to ignore nodes that are in STARTUP2 when replicating writes? Or maybe another solution might be to quickly return a special acknowledgement when the node is in STARTUP2, so the primary can return the write back to the application?

Assignee: Andy Schwerin
Reporter: Jan-Kees van Andel
Participants: Andy Schwerin, Jan-Kees van Andel
Votes: 0
Watchers: 7

Created: May 18 2017 08:07:39 AM UTC
Updated: Oct 27 2023 01:54:27 PM UTC
Resolved: May 18 2017 01:38:06 PM UTC
