[SERVER-9617] getLastError should have a clearer error message when primary steps down
Created: 08/May/13  Updated: 31/Oct/14  Resolved: 01/Oct/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 2.7.8

Type: Improvement
Priority: Major - P3
Reporter: Eric Milkie
Assignee: Spencer Brody (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-14601 All "not master" errors should use th... Closed
is related to SERVER-9417 opReplicatedEnough should assert on s... Closed
is related to SERVER-15228 Make awaitReplication fail with NotMa... Closed
Backwards Compatibility: Minor Change
Participants:

 Description   

If you are blocked in GLE waiting for write concern and the primary steps down, GLE should return a better error message describing what happened.



 Comments   
Comment by Githook User [ 01/Oct/14 ]

Author: Spencer T Brody (stbrody) <spencer@mongodb.com>

Message: SERVER-9617 Add missing space to message
Branch: master
https://github.com/mongodb/mongo/commit/78919da3438c114bf14d7a08f0710043910829be

Comment by Githook User [ 01/Oct/14 ]

Author: Spencer T Brody (stbrody) <spencer@mongodb.com>

Message: SERVER-9617 Clearer error message when stepping down interrupts awaitReplication
Branch: master
https://github.com/mongodb/mongo/commit/222a18ede0fdaf940fc4e018aeca126861d32ea1

Comment by A. Jesse Jiryu Davis [ 22/Sep/14 ]

Thanks!

Comment by Spencer Brody (Inactive) [ 22/Sep/14 ]

For 2.6+ I believe #1 would suffice, but in 2.4 there's at least one place (in getLastError waiting for write concern) that fails with a different code and the string "replicatedToNum called but not master anymore".

Comment by A. Jesse Jiryu Davis [ 22/Sep/14 ]

Do you have an opinion about which of these two client strategies is more likely correct for MongoDB >= 2.2?

1. Search for error messages that begin with "not master" or have code 10107.

2. Search for error messages that contain "not master" or have code 10107.

I notice that the DBClient searches for strings containing "not master":

https://github.com/mongodb/mongo/blob/4752092258fd5868147b8cece636d2fd41c79305/src/mongo/client/dbclient.cpp#L408

Comment by Spencer Brody (Inactive) [ 19/Sep/14 ]

No, it may not always begin with the string "not master". In 2.6 you can get "replicatedToNum called but not master anymore", though it should have code 10107 (NotMaster). So I think checking for the string "not master" or the code 10107 should be sufficient.
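
The check described in this comment could be sketched on the client side roughly as follows. This is only an illustration, not code from any driver: the function name `is_not_master_error` is hypothetical, and it assumes the response is a plain dict whose message may be in an `err` or `errmsg` field and whose numeric code is in `code`.

```python
NOT_MASTER_CODE = 10107  # NotMaster, as cited in the comment above


def is_not_master_error(response):
    """Return True if a getLastError-style response looks like a step-down error.

    Hypothetical helper: the field names "err", "errmsg" and "code" are
    assumptions about the response shape, not taken from this ticket.
    """
    # Prefer the numeric code when the server supplies it.
    if response.get("code") == NOT_MASTER_CODE:
        return True
    # Fall back to a substring match, since the message does not always
    # begin with "not master" (e.g. "replicatedToNum called but not
    # master anymore").
    for field in ("err", "errmsg"):
        message = response.get(field)
        if isinstance(message, str) and "not master" in message:
            return True
    return False
```

Matching on "contains" rather than "begins with" is what lets the 2.4 case from the later comment be caught even though it reports a different code.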

Comment by A. Jesse Jiryu Davis [ 19/Sep/14 ]

To verify the Server Discovery And Monitoring Spec we need to know that a client can detect when it's attempted to insert to a stepped-down primary. The spec assumes the GLE response message always begins with the string "not master". Is this true or are there exceptions to this rule?

Comment by Eric Milkie [ 19/Sep/14 ]

Spencer, can you check which errors we can possibly report via getLastError when we step down, with the current state of the master branch?
