[DOCS-581] Add description of how the drivers handle the failover of a primary Created: 05/Oct/12  Updated: 16/Mar/15  Resolved: 10/Feb/15

Status: Closed
Project: Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: William Zola Assignee: Tyler Brock
Resolution: Won't Fix Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Participants:
Days since reply: 11 years, 19 weeks, 5 days ago

 Description   

The documentation doesn't cover what the driver or client program will see when there is a fail-over in a replica set.



 Comments   
Comment by Sam Kleinman (Inactive) [ 05/Oct/12 ]

This behavior is not formally specified, and given how much it varies and how little of it is intentional, I don't know that it makes sense to document it under the current plan.

This kind of documentation should be published as part of the mongo-meta-driver project.

I'm assigning to Tyler Brock for holding at the moment.

Comment by William Zola [ 05/Oct/12 ]

Here is the behavior of the Java driver. Most drivers have similar behavior, but the details are slightly different.

1) The behavior differs depending on whether you are connected to a replica set or to a sharded cluster (that is, to a 'mongos').

2) Here is the behavior if you are connected to a replica set:

a) The Java driver keeps track of which node in the replica set is primary, and will direct all writes to that node.
b) When the primary node in a replica set fails or steps down, the socket connection between the server and the driver is broken.
c) If the client program tries to perform a write, the Java driver will try to determine whether the replica set has a primary. If the replica set has a primary at that time, the write will succeed. If there is no primary at that time, the Java driver will throw a MongoException.
d) Your client code should check for a MongoException whenever it performs an operation accessing MongoDB. You can examine the contents of the MongoException to determine the exact cause of the write failing.
e) Your client code will need to be written to handle the time when a replica set has no primary and cannot accept writes. Your choices are to either abort the operation, queue up the operation for later, or to re-try some reasonable amount of time later (say, 15 seconds or so). Which choice is correct for you depends on the details of your application.

3) Here is the behavior if you are connected to a 'mongos':

a) If you use "Safe Mode" writes (that is, a WriteConcern of SAFE or better) and your client code tries to write to a shard that has no primary node (and therefore cannot accept writes), then when the getLastError command returns, the latest version of the Java driver (2.9.x) will throw a MongoException, which you have to handle as you would when writing to a replica set.

Note that earlier versions of the Java driver had different behavior.

b) If you use the default write format (aka 'fire-and-forget' writes) then if your client code tries to write to a shard which has no primary node (and therefore cannot accept writes), then the write will fail, but 'mongos' will not return an error code to your program.

4) The content of the MongoException will indicate the exact error that caused the write to fail. Please note that the exact message you get when a primary is not available may differ between a replica set and a sharded cluster.
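
The "re-try some reasonable amount of time later" option from step 2(e) can be sketched roughly as follows. This is only an illustration, not the driver's own mechanism: the class FailoverRetry, the method retryWrite, and the exception WriteFailedException are hypothetical names, and WriteFailedException is a stand-in for com.mongodb.MongoException so that the sketch compiles without the driver on the classpath. The attempt count and delay are arbitrary assumptions.

```java
import java.util.function.Supplier;

final class FailoverRetry {

    /** Hypothetical stand-in for com.mongodb.MongoException (assumption, not the driver class). */
    static class WriteFailedException extends RuntimeException {
        WriteFailedException(String message) {
            super(message);
        }
    }

    /**
     * Runs the write, retrying with a fixed delay while it keeps failing.
     * Rethrows the last failure once maxAttempts attempts have been made.
     */
    static <T> T retryWrite(Supplier<T> write, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        WriteFailedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return write.get();
            } catch (WriteFailedException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delayMillis); // give the replica set time to elect a primary
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated write: fails twice (no primary), then succeeds.
        String result = retryWrite(() -> {
            if (++calls[0] < 3) {
                throw new WriteFailedException("no primary available");
            }
            return "written";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In real client code the lambda would wrap the actual write call and the catch clause would name MongoException. Whether a fixed delay is appropriate, or whether the operation should instead be aborted or queued, depends on the application, as noted in step 2(e).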
