[SERVER-44352] What will be the behavior of mongos under this partial network partition Created: 01/Nov/19 Updated: 08/Nov/19 Resolved: 08/Nov/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Atwood Wang | Assignee: | Carl Champain (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
Consider a sharded cluster laid out across three data centers:

| Data Center 1 | Data Center 2 | Data Center 3 |
| --- | --- | --- |
| mongos 1 | mongos 2 | mongos 3 |
| Primary node (A) | Secondary node (B) | Secondary node (C) |
Now, suppose there is a network partition between Data Center 1 and Data Center 2: A and B cannot see each other, but C can see both. In this case, if I understand correctly, B will try to elect itself primary but will fail, because A can still see a majority of the replica set (A and C) and C can still see A. If a client talks to mongos 2, running in Data Center 2, will all read and write operations for this client fail because the network between Data Center 1 and Data Center 2 is broken, so mongos 2 cannot reach the primary node?
|
| Comments |
| Comment by Carl Champain (Inactive) [ 08/Nov/19 ] |
|
In summary, B will run for election and will fail. To help you understand why, please take a look at this else if block of code, which checks whether the candidate has staler data than the voter; it is the same section you pointed out, just a few lines up. B has stale data because it cannot replicate the latest writes on A. Consequently, B gets only one yes vote (its own) and loses the election.
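For illustration, here is a minimal sketch of that vote decision, not the actual server code (which is C++); the function name and the optime comparison are simplified assumptions:

```python
# Hypothetical, simplified model of how a voter decides on a vote request.
# The real logic lives in the MongoDB server's C++ replication code.
def grant_vote(my_term, my_last_optime, candidate_term, candidate_last_optime):
    if candidate_term < my_term:
        return False  # candidate is behind on elections: deny
    if candidate_last_optime < my_last_optime:
        return False  # candidate's data is staler than ours: deny
    return True

# In this scenario: C has replicated A's latest writes and B has not, so C
# returns False for B's request, and B wins only its own vote.
```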
If a client talks to mongos 2, then all writes should fail. However, if you add all of the mongos hosts to the connection string at the application level, and add retry logic for when the client gets a "no primaries found" error, then the application can attempt to send the writes through another mongos.
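As a minimal sketch of that approach in Python with pymongo (the hosts, database names, and retry policy here are assumptions, not from this ticket):

```python
from pymongo import MongoClient
from pymongo.errors import AutoReconnect, ServerSelectionTimeoutError

# Hypothetical hosts: one mongos per data center, all in the connection string.
client = MongoClient(
    "mongodb://mongos1.dc1:27017,mongos2.dc2:27017,mongos3.dc3:27017",
    serverSelectionTimeoutMS=5000,
)

def insert_with_retry(doc, attempts=3):
    """Retry the write; the driver may select a different mongos each attempt."""
    for attempt in range(attempts):
        try:
            return client.mydb.mycoll.insert_one(doc)
        except (AutoReconnect, ServerSelectionTimeoutError):
            if attempt == attempts - 1:
                raise
```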
So in this situation, there can't be two primaries. However, in a different situation where that does happen, there is a mechanism which ensures that the old primary steps down. The term field in rs.status() is a counter indicating how many elections have occurred, so when the old primary comes back online, it will see that its term value is lower than the current value in the replica set and will step down.
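To observe the term, you can run rs.status() in the shell or, as a small sketch with pymongo (the host here is hypothetical), call the underlying replSetGetStatus command:

```python
from pymongo import MongoClient

# Connect directly to one replica-set member, bypassing mongos, and run the
# command behind rs.status().
member = MongoClient("mongodb://nodeA.dc1:27017", directConnection=True)
status = member.admin.command("replSetGetStatus")
print(status["term"])  # election counter; a stale primary sees a lower term
```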
That said, the SERVER project is for bugs and feature suggestions for the MongoDB server. As this ticket does not appear to be a bug, I will now close it. If you need further assistance troubleshooting, I encourage you to ask our community by posting on the mongodb-user group or on Stack Overflow with the mongodb tag. Kind regards, |
| Comment by Atwood Wang [ 05/Nov/19 ] |
|
I found something else that I don't understand. Previously, I thought that under a network partition where A is the primary, B and C are secondaries, and the network between A and B is broken, B would start an election but C would reject it, since C can see that A is an active primary. However, after reading these lines of code, it seems C will reject the request only if it is an arbiter. If C is a normal data-bearing node, it will approve B's request. Will this make B become a primary and result in C seeing two primaries, A and B? I am not sure if I am missing some key logic in this process; any help will be much appreciated. |