[SERVER-50892] Mirror reads logic incorrectly assumes response is always OK Created: 11/Sep/20  Updated: 29/Oct/23  Resolved: 25/Sep/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.8.0

Type: Bug Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Benjamin Caimano (Inactive)
Resolution: Fixed Votes: 0
Labels: servicearch-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Service arch 2020-10-05
Participants:
Linked BF Score: 24

 Description   

The mirrored reads code triggers a fatal assertion if the response from the mirroring request is non-ok: https://github.com/mongodb/mongo/blob/1373280c254a39d2ca7d85563718a6f74c927216/src/mongo/db/mirror_maestro.cpp#L382-L386

As far as I can tell, there's no basis for that assumption: the command could fail for any number of reasons, including network errors.



 Comments   
Comment by Githook User [ 25/Sep/20 ]

Author: Ben Caimano <ben.caimano@10gen.com>

Message: SERVER-50892 Mirrored reads tests should excuse retriable errors
Branch: master
https://github.com/mongodb/mongo/commit/4eaed94d25b5137370ae00652656cab589257399

Comment by Benjamin Caimano (Inactive) [ 11/Sep/20 ]

Hmmmmm, so here's the rub:
The code that has the fassert is test-only and runs in a no-passthrough suite. We're not expecting any failures, and the fassert is the concrete way we'd notice if there were any. That said, we're not in a vacuum. The failure we're seeing here in BF-18721 is HostUnreachable, probably related to DNS failures. So obviously we can still get network errors, and if we can get network errors we can also get topology errors. How about we change the fatal assertion to check for OK or RetriableErrors? I think that would cover the domain of acceptable errors. We still don't want a BadValue or something similar to slip through.
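
To make the proposal concrete, here is a minimal, self-contained sketch of the "OK or retriable" check. It is not the code in mirror_maestro.cpp: the `Code` enum, the `Status` struct, `isRetriable`, `isAcceptableMirrorResponse`, and `checkMirrorResponse` are all hypothetical stand-ins, and which codes count as retriable is an assumption made for illustration only.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the server's error codes (not the real enum).
enum class Code { OK, HostUnreachable, NetworkTimeout, NotPrimary, BadValue };

// Hypothetical, simplified status type.
struct Status {
    Code code;
    bool isOK() const { return code == Code::OK; }
};

// Assumed classification: network and topology errors are retriable.
bool isRetriable(Code c) {
    switch (c) {
        case Code::HostUnreachable:
        case Code::NetworkTimeout:
        case Code::NotPrimary:
            return true;
        default:
            return false;
    }
}

// The proposed predicate: accept OK or any retriable error.
bool isAcceptableMirrorResponse(const Status& s) {
    return s.isOK() || isRetriable(s.code);
}

// Stand-in for the test-only fatal assertion: abort on anything else.
void checkMirrorResponse(const Status& s) {
    if (!isAcceptableMirrorResponse(s)) {
        std::abort();
    }
}
```

With this shape, a HostUnreachable like the one in BF-18721 would pass the check, while a BadValue would still abort the test.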
