[SERVER-60495] Retry FailedToSatisfyReadPreference in DDL coordinators Created: 06/Oct/21 Updated: 29/Oct/23 Resolved: 11/Oct/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 5.0.3, 5.1.0-rc0 |
| Fix Version/s: | 5.2.0, 5.0.4, 5.1.0-rc1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Pierlauro Sciarelli | Assignee: | Pierlauro Sciarelli |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v5.1, v5.0 |
| Participants: | |
| Linked BF Score: | 144 |
| Description |
|
Some tests in suites that kill primary nodes hit DDL failures caused by a FailedToSatisfyReadPreference exception. This error must be added to the set of retriable errors so that DDL coordinators are not released half-way. |
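
A minimal sketch of the idea in Python, not the server's actual C++ implementation: a coordinator step is retried while its failure belongs to a set of retriable errors, so the coordinator is not released half-way. The names `RETRIABLE_ERRORS`, `CoordinatorError`, and `run_step_with_retries` are hypothetical; only the error name `FailedToSatisfyReadPreference` comes from this ticket.

```python
import time

# Hypothetical set of error names a coordinator treats as retriable.
# Only FailedToSatisfyReadPreference is taken from this ticket.
RETRIABLE_ERRORS = {"FailedToSatisfyReadPreference"}


class CoordinatorError(Exception):
    """Hypothetical error type carrying a server error code name."""

    def __init__(self, code_name):
        super().__init__(code_name)
        self.code_name = code_name


def run_step_with_retries(step, max_attempts=5, backoff_seconds=0.0):
    """Run `step` until it succeeds, retrying only retriable errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except CoordinatorError as exc:
            # Non-retriable errors, or the final attempt, propagate to the caller.
            if exc.code_name not in RETRIABLE_ERRORS or attempt == max_attempts:
                raise
            time.sleep(backoff_seconds)
```

In this sketch, adding an error name to `RETRIABLE_ERRORS` is the only change needed to make a step survive that failure; any other error still aborts immediately.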
| Comments |
| Comment by Githook User [ 12/Oct/21 ] |
|
Author: Pierlauro Sciarelli (pierlauro, pierlauro.sciarelli@mongodb.com) Message: |
| Comment by Githook User [ 11/Oct/21 ] |
|
Author: Pierlauro Sciarelli (pierlauro, pierlauro.sciarelli@mongodb.com) Message: |
| Comment by Githook User [ 08/Oct/21 ] |
|
Author: Pierlauro Sciarelli (pierlauro, pierlauro.sciarelli@mongodb.com) Message: |
| Comment by Max Hirschhorn [ 06/Oct/21 ] |
|
pierlauro.sciarelli, after discussing SERVER-59884 with Randolph, I think it is currently a bug that FailedToSatisfyReadPreference isn't considered a retryable error by drivers for the commitTransaction command. (The UnknownTransactionCommitResult error label is set by drivers - for example in pymongo.) I'm wondering if it is also a server bug that FailedToSatisfyReadPreference isn't considered a retryable error by drivers for retryable writes. I suspect that it is a bug because there's a fair chance the retry would succeed. Until DRIVERS-555 is completed, drivers will still unfortunately only retry once so if the primary of the replica set shard is unavailable for 30 seconds (twice kDefaultFindHostTimeout), then the retryable write will still fail and lead to an error/exception in the application. It would be worth discussing/confirming with folks from the Replication and Drivers team but I feel like FailedToSatisfyReadPreference should have the same categories applied to it that ExceededTimeLimit currently has. ( |
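
The 30-second figure in the comment above is simple arithmetic. The sketch below assumes that kDefaultFindHostTimeout is 15 seconds, which the comment implies but which has not been checked against the server source, and that each attempt waits up to that long for a primary. `tolerated_outage_seconds` is a hypothetical helper.

```python
# Assumed value: the comment equates 30 seconds with twice kDefaultFindHostTimeout.
K_DEFAULT_FIND_HOST_TIMEOUT_SECONDS = 15


def tolerated_outage_seconds(retries, find_host_timeout=K_DEFAULT_FIND_HOST_TIMEOUT_SECONDS):
    """Total time spent waiting for a primary: the initial attempt plus each retry."""
    return (1 + retries) * find_host_timeout


print(tolerated_outage_seconds(retries=1))  # single retry: prints 30
```

Under these assumptions, a primary that stays unavailable for the full 30 seconds exhausts both the initial attempt and the single retry, so the write still fails.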