[SERVER-12041] retry logic for read preferences should also apply on lazy recv() network failure Created: 11/Dec/13  Updated: 11/Jul/16  Resolved: 16/Dec/13

Status: Closed
Project: Core Server
Component/s: Internal Client, Networking
Affects Version/s: None
Fix Version/s: 2.4.9, 2.5.5

Type: Bug Priority: Major - P3
Reporter: Greg Studer Assignee: Greg Studer
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-11971 slaveok versioning logic in mongos sh... Closed
Operating System: ALL
Steps To Reproduce:

If data is sent over a connection to a replica set member that previously closed the return connection (and the connection was not first tested by the connection pool), retries occur only if slaveOk is set; read preferences are ignored.

Participants:

 Description   
Issue Status as of January 8th, 2014

ISSUE SUMMARY
New sharded connections may fail to connect if any shard primary is down.

This issue is one of four related issues that affect cluster availability when no primary is available for a shard. See SERVER-7246, SERVER-5625, SERVER-11971 and SERVER-12041 for more details.

USER IMPACT
When any primary member of a replica set in a sharded cluster is down, new connections may fail to perform secondary reads due to an initial heuristic shard version check, or initial authorization check.

This issue is present in versions of MongoDB up to and including v2.4.8.

SOLUTION
Ignore failures of initial version check during connection and allow authorization against secondaries (primary is preferred when available).

In v2.4.9 only, the fix must be enabled by starting mongos with the following two startup parameters (in v2.6.0 and later this behavior is the default):

--setParameter ignoreInitialVersionFailure=true
--setParameter authOnPrimaryOnly=false

These parameters can also be set on a running mongos with the following commands:

db.adminCommand({setParameter:1,ignoreInitialVersionFailure:true})
db.adminCommand({setParameter:1,authOnPrimaryOnly:false})
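As a rough sketch, the two commands above could be wrapped in a small helper for use in a mongo shell connected to a mongos. The helper name applyFailoverParameters is hypothetical and not part of MongoDB; it only assumes that adminCommand returns a document with an ok field, as command responses normally do.

```javascript
// Hypothetical helper (not part of MongoDB): applies both parameters
// named above. `database` is a shell database handle such as `db`.
function applyFailoverParameters(database) {
  var settings = [
    { setParameter: 1, ignoreInitialVersionFailure: true },
    { setParameter: 1, authOnPrimaryOnly: false }
  ];
  return settings.map(function (command) {
    var result = database.adminCommand(command);
    // Stop at the first command that does not report success.
    if (!result || !result.ok) {
      throw new Error("setParameter failed: " + JSON.stringify(result));
    }
    return result;
  });
}
```

In the shell this would be called as applyFailoverParameters(db). Throwing on the first failure avoids leaving a mongos with only one of the two parameters applied without the operator noticing.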

WORKAROUNDS
There is no workaround.

PATCHES
Production release v2.4.9 contains the fix for this issue, and production release v2.6.0 will contain the fix as well.

Original Description

Currently, retries after a network failure in recv() occur only when the slaveOk flag is explicitly set. This is difficult to trigger, since the connection pool itself tries to clear bad connections and most failures are detected when say() fails, but when it does happen it can cause spurious errors to propagate back up to the caller.

The workaround is to set the slaveOk flag whenever a non-primary read preference is used.
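The change in behavior can be illustrated with a small model. This is not the server's C++ code: the function names and the shape of the query object (slaveOk, readPreference.mode) are assumptions chosen for illustration. It only models the decision described above, where a recv() failure was previously retried only with slaveOk set, and after the fix a non-primary read preference is assumed to qualify as well.

```javascript
// Illustrative model only; names are hypothetical, not from the MongoDB source.

// Before the fix: only the explicit slaveOk flag enables a retry.
function retriesOnRecvFailureBefore(query) {
  return query.slaveOk === true;
}

// After the fix (as modeled here): a read preference other than
// "primary" also enables a retry, even without slaveOk.
function retriesOnRecvFailureAfter(query) {
  var mode = query.readPreference ? query.readPreference.mode : "primary";
  return query.slaveOk === true || mode !== "primary";
}
```

Under this model, a query with read preference mode "secondary" and no slaveOk flag is not retried by the first function but is retried by the second, which is the case the workaround covers by setting slaveOk explicitly.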



 Comments   
Comment by Githook User [ 20/Dec/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-12041 improve retry logic for read preferences without slaveOk set
Branch: v2.4
https://github.com/mongodb/mongo/commit/e5dd557257e21ab8f2f19d2f7c557357f982c7f0

Comment by Githook User [ 11/Dec/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-12041 improve retry logic for read preferences without slaveOk set
Branch: master
https://github.com/mongodb/mongo/commit/be7c5f961bd163cb315c561ebd5a47a3b54dcfe8
