[SERVER-22620] Improve mongos handling of a very stale secondary config server Created: 16/Feb/16  Updated: 03/Jan/18  Resolved: 09/Aug/16

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: 3.2.1
Fix Version/s: 3.3.11

Type: Bug Priority: Major - P3
Reporter: Dmitry Ryabtsev Assignee: Misha Tyulenev
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-24678 Allow select a CSRS node with the sma... Closed
Related
is related to SERVER-22627 ShardRegistry should mark hosts which... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:
  1. Create a sharded cluster (1 shard is enough)
  2. Make sure you have a replica set of config servers with 3 nodes
  3. Shard a collection
  4. Connect to one of the config server secondaries and lock it with the "db.fsyncLock()" command (see the shell sketch after these steps)
  5. Try to shard another collection and watch the mongos time out. For example:

    mongos> sh.shardCollection("test.testcol2", {a:1})
    { "ok" : 0, "errmsg" : "Operation timed out", "code" : 50 }
    

    The balancer also fails to run:

    2016-02-16T02:30:35.208+0000 I SHARDING [Balancer] about to log metadata event into actionlog: { _id: "dmnx-6-2016-02-16T02:30:35.208+0000-56c289cbaee7ebe12efbfaee", server: "dmnx-6", clientAddr: "", time: new Date(1455589835208), what: "balancer.round", ns: "", details: { executionTimeMillis: 30016, errorOccured: false, candidateChunks: 0, chunksMoved: 0 } }
    2016-02-16T02:31:15.244+0000 W SHARDING [Balancer] ExceededTimeLimit Operation timed out
    
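    A minimal shell sketch of steps 2 and 4 (host name and port are assumptions used for illustration only):

    # connect directly to one of the config server secondaries
    $ mongo --host cfg2.example.net --port 27019
    configRS:SECONDARY> db.fsyncLock()   // blocks writes and oplog application, so the node falls behind
    configRS:SECONDARY> // ...reproduce the timeout from step 5, then release the lock:
    configRS:SECONDARY> db.fsyncUnlock()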

Sprint: Sharding 18 (08/05/16), Sharding 2016-08-29
Participants:
Case:

 Description   

This ticket covers improving how sharding handles very stale secondary config servers (although the same approach would apply to shards as well). The proposed solution is for the isMaster response to include the latest optime the node has replicated, so that the replica set monitor, in addition to selecting 'nearer' hosts, will also prefer those with the most recent optimes.

This problem also occurs with fsync-locked secondaries: mongos is unable to work properly when one of the config server replica set secondaries is locked with db.fsyncLock(). Running write concern / read concern operations directly against the replica set while a secondary is locked that way showed no problems, so the issue appears to lie in mongos alone.
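As a hedged sketch of the proposed mechanism (the lastWrite field names below mirror what isMaster replies eventually exposed; treat them as an assumption in the context of this ticket), the replica set monitor could compare the latest replicated optime advertised by each node and deprioritize nodes that lag far behind their peers, such as an fsync-locked secondary:

    // run against each monitored config server node
    configRS:SECONDARY> var r = db.runCommand({isMaster: 1})
    configRS:SECONDARY> r.lastWrite.opTime         // latest optime this node has replicated,
                                                   // e.g. { ts: Timestamp(1455589835, 1), t: NumberLong(2) }
    configRS:SECONDARY> r.lastWrite.lastWriteDate  // wall-clock time of that write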



 Comments   
Comment by Ramon Fernandez Marina [ 14/Apr/17 ]

venkata.surapaneni@elastica.co, this ticket has not been considered for backporting to v3.2. If this is an issue for you, I'd suggest upgrading to MongoDB 3.4, which does contain a fix for this problem.

Comment by VenkataRamaRao Surapaneni [ 13/Apr/17 ]

Is this issue fixed in the 3.2.12 release?

Comment by Pooja Gupta (Inactive) [ 03/Apr/17 ]

misha.tyulenev, I believe this fix has been included in MongoDB 3.4. Has it been backported to version 3.2 as well?

Comment by Githook User [ 09/Aug/16 ]

Author: Misha Tyulenev (username: mikety, email: misha@mongodb.com)

Message: SERVER-22620 prefer config servers with recent opTime
Branch: master
https://github.com/mongodb/mongo/commit/4c6009e67d3e503f796b5afcbcbeaa95eba80b44

Comment by Kaloian Manassiev [ 18/Feb/16 ]

I am re-purposing this ticket to cover our handling of very stale secondary config servers (although it would apply to shards as well). The proposed solution is for the isMaster response to include the latest optime the node has replicated, so that the replica set monitor, in addition to selecting 'nearer' hosts, will also prefer those with the most recent optimes.

Comment by Kaloian Manassiev [ 16/Feb/16 ]

At the very least we should make the ShardRegistry mark hosts where operations time out as faulty. This will ensure that on a retry of the operation, the fsync-locked host is not contacted again.

However, what Matt suggests is a better solution, even though it is more involved. We can return FailedToSatisfyReadPreference if the read concern cannot be satisfied, just before we begin waiting in replication_coordinator_impl.cpp.

Comment by Matt Dannenberg [ 16/Feb/16 ]

I believe what is happening here is that the mongos is contacting the locked secondary, and that node is unable to satisfy the read concern and respond. Ideally the mongos (or any other driver) would know the secondary is fsync-locked and not attempt to contact it. Perhaps the node should return an error (NotMasterOrSecondary? a new fsync-specific one?) if queried with readConcern: majority while fsync-locked, or fall into a new replica set state indicating it is fsync-locked and should not be contacted in the first place.
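A hedged illustration of the failure mode described above (the namespace, afterOpTime value, and maxTimeMS are assumptions standing in for what mongos passes internally): a majority read that must wait for an optime the fsync-locked secondary will never apply only returns once the time limit expires.

    // connected directly to the fsync-locked config server secondary
    configRS:SECONDARY> rs.slaveOk()
    configRS:SECONDARY> db.getSiblingDB("config").runCommand({
    ...     find: "chunks",
    ...     readConcern: { level: "majority", afterOpTime: { ts: Timestamp(1455589999, 1), t: NumberLong(2) } },
    ...     maxTimeMS: 5000
    ... })
    // eventually fails with ExceededTimeLimit (code 50), mirroring the
    // "Operation timed out" errors seen by mongos and the balancer above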
