[SERVER-44352] What will be the behavior of mongos under this partial network partition Created: 01/Nov/19  Updated: 08/Nov/19  Resolved: 08/Nov/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Atwood Wang Assignee: Carl Champain (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Consider a sharded cluster that:

  1. Has only one shard, which is a replica set with 3 members: A, B, C
  2. Has 3 mongos instances
  3. Runs its mongos and mongod instances across 3 different data centers
  4. Has the primary node (A) of the replica set in Data Center 1

 

It will look like this:

Data Center 1      |  Data Center 2       |  Data Center 3
mongos 1           |  mongos 2            |  mongos 3
Primary node (A)   |  Secondary node (B)  |  Secondary node (C)

 

Now suppose there is a network partition between Data Center 1 and Data Center 2: A and B cannot see each other, but C can see both. If I understand correctly, B will try to elect itself primary but will fail, because A can still see a majority of the replica set (A and C) and C can still see A.

In this case, if a client talks to mongos 2, which is running in Data Center 2, will all read and write operations for this client fail because the network between Data Center 1 and Data Center 2 is broken and mongos 2 cannot reach the primary node?

 



 Comments   
Comment by Carl Champain (Inactive) [ 08/Nov/19 ]

Hi wangzicong@bytedance.com,

 

If I understand correctly, B will try to elect itself primary but will fail, because A can still see a majority of the replica set (A and C) and C can still see A.

In summary, B will run for election and will fail, for two reasons:
1. A won't respond, because the partition prevents A and B from reaching each other
2. C will notice that B doesn't have the latest applied ops from A, so C will vote no for B

 

My previous understanding was that, in a network partition where A is the primary, B and C are secondaries, and the network between A and B is broken, B will start an election but C will reject it because it can see that A is an active primary. However, after seeing these lines of code, it seems that C will reject the request only if it is an arbiter. If C is a normal data-bearing node, it will approve B's request.

To help you understand a bit more, please take a look at this else-if block of code, which checks whether B has staler data than C. It is in the same section you pointed out, only a few lines up. B has stale data because it can't replicate the latest writes from A. Consequently, B gets only one yes vote (its own) and loses the election.
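The vote counting described above can be illustrated with a minimal Python sketch. It is a simplified model, not the server's election code: the names `OpTime`, `grants_vote`, and `run_election` are hypothetical, the oplog position is reduced to a single integer, and the only voting rule modeled is the staleness check (a voter says no if the candidate's last applied operation is older than its own).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpTime:
    term: int       # election term in which the entry was written
    timestamp: int  # oplog position, simplified to an integer (assumption)

    def as_tuple(self):
        return (self.term, self.timestamp)


def grants_vote(voter_last_applied, candidate_last_applied):
    """Voter says yes only if the candidate is at least as up to date as itself."""
    return candidate_last_applied.as_tuple() >= voter_last_applied.as_tuple()


def run_election(candidate, last_applied, reachable):
    """Count the yes votes a candidate collects from the members it can reach."""
    votes = 1  # the candidate always votes for itself
    for member in reachable[candidate]:
        if grants_vote(last_applied[member], last_applied[candidate]):
            votes += 1
    majority = len(last_applied) // 2 + 1
    return votes, votes >= majority


# B missed the latest writes from A; C has them because it can still see A.
last_applied = {"A": OpTime(1, 105), "B": OpTime(1, 100), "C": OpTime(1, 105)}
# Partition between A and B: each of them can reach only C.
reachable = {"A": ["C"], "B": ["C"], "C": ["A", "B"]}

votes, won = run_election("B", last_applied, reachable)
print(votes, won)  # 1 False
```

B never gets an answer from A and gets a no from C, so it ends with its own vote only, which is below the majority of 2 out of 3.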

 

if a client talks to the mongos 2 which is running in DataCenter 2, will all the read and write operations for this client fail

If a client talks to mongos 2, then all writes should fail. However, if you list all the mongos instances in the connection string at the application level, and add retry logic for when the client gets a no primaries found error, then the application can attempt to send the writes to another mongos.
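As a rough illustration of that retry logic, here is a self-contained Python sketch. It does not use a real driver: `NoPrimaryError`, `write_with_failover`, and `fake_insert` are hypothetical stand-ins, and the router names are placeholders. In a real application the exception type and the write call would come from whichever MongoDB driver is in use.

```python
class NoPrimaryError(Exception):
    """Hypothetical stand-in for the driver error seen when no primary is reachable."""


def write_with_failover(routers, do_write, max_rounds=2):
    """Try the write on each router in turn; return (router, result) on first success."""
    if not routers:
        raise ValueError("at least one router is required")
    last_error = None
    for _ in range(max_rounds):
        for router in routers:
            try:
                return router, do_write(router)
            except NoPrimaryError as exc:
                # This router cannot reach the primary; move on to the next one.
                last_error = exc
    raise last_error


# Simulated topology: mongos 2 is cut off from primary A, the others are not.
CUT_OFF = {"mongos2"}


def fake_insert(router):
    """Pretend to send a write through the given router (simulation only)."""
    if router in CUT_OFF:
        raise NoPrimaryError(f"{router}: no primary found")
    return "ok"


used, result = write_with_failover(["mongos2", "mongos1", "mongos3"], fake_insert)
print(used, result)  # mongos1 ok
```

The retry is straightforward in this scenario because the failed attempt never reached a primary. Retrying after other kinds of errors may need more care, since the first attempt could already have been applied.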

 

Will this make B become a primary and result in C seeing two primaries, A and B?

So in this situation, there can't be two primaries. However, in a different situation where this does happen, there is a mechanism which ensures that the old primary steps down. The term field in rs.status() is a counter indicating how many elections have occurred. When the old primary comes back online, it will see that its term value is lower than the current value in the replica set and will step down.
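The term comparison can be sketched in a few lines of Python. This is a toy model with hypothetical names (`Member`, `observe_term`) and made-up term values; it only shows the rule that a primary steps down once it learns of a higher term.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    term: int
    state: str  # "PRIMARY" or "SECONDARY"

    def observe_term(self, remote_term):
        """On hearing a higher term, adopt it and step down if currently primary."""
        if remote_term > self.term:
            self.term = remote_term
            if self.state == "PRIMARY":
                self.state = "SECONDARY"


# A was primary in term 3; while it was unreachable, B won an election in term 4.
old_primary = Member("A", term=3, state="PRIMARY")
new_primary = Member("B", term=4, state="PRIMARY")

old_primary.observe_term(new_primary.term)  # A learns about term 4
new_primary.observe_term(3)                 # B hears A's stale term and ignores it

print(old_primary.state, old_primary.term)  # SECONDARY 4
print(new_primary.state, new_primary.term)  # PRIMARY 4
```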

 

That said, the SERVER project is for bugs and feature suggestions for the MongoDB server. As this ticket does not appear to be a bug, I will now close it. If you need further assistance troubleshooting, I encourage you to ask our community by posting on the mongodb-user group or on Stack Overflow with the mongodb tag.

Kind regards,
Carl

Comment by Atwood Wang [ 05/Nov/19 ]

I found something else that I don't understand.

My previous understanding was that, in a network partition where A is the primary, B and C are secondaries, and the network between A and B is broken, B will start an election but C will reject it because it can see that A is an active primary. However, after seeing these lines of code, it seems that C will reject the request only if it is an arbiter. If C is a normal data-bearing node, it will approve B's request. Will this make B become a primary and result in C seeing two primaries, A and B?

I am not sure whether I am missing some key logic in this process. Any help would be greatly appreciated.

Generated at Thu Feb 08 05:05:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.