[SERVER-34943] failCommand failpoint should ignore commands from replica set members Created: 10/May/18 Updated: 08/Jan/24 Resolved: 08/Jan/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.6, 4.1.7 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Shane Harvey | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Backwards Compatibility: | Minor Change |
| Backport Requested: | v4.0 |
| Sprint: | Repl 2018-06-04, Repl 2018-12-17 |
| Participants: | |
| Description |
|
When a secondary runs a command on the primary, it will trigger the failCommand failpoint. This is unexpected for drivers because we would like to use failCommand to test retryable writes and transactions on replica sets. Some options I see:
|
| Comments |
| Comment by Githook User [ 01/Feb/19 ] |
|
Author: {'name': 'vincentkam', 'email': 'vincent.kam@10gen.com', 'username': 'vincentkam'}
Message: Update retryable-reads tests to use "operations"
| Comment by Githook User [ 18/Jan/19 ] |
|
Author: {'username': 'ajdavis', 'email': 'jesse@mongodb.com', 'name': 'A. Jesse Jiryu Davis'}
Message:
| Comment by Shane Harvey [ 16/Jan/19 ] |
|
Yes, we would like to use the failCommand fail point to test the behavior of find/getMore within a transaction without clashing with find/getMores from secondaries. For example, the "add transient label to connection errors" test can be extended to include getMore as well.
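For illustration, a sketch of the kind of failpoint configuration such a test might use (the mode and the choice of closing the connection are assumptions here, not taken from the actual test files):

```js
// Run against the primary's admin database: fail only the next getMore by
// closing the connection, a network error the driver is expected to label
// as a transient transaction error when it happens inside a transaction.
db.adminCommand({
    configureFailPoint: "failCommand",
    mode: {times: 1},
    data: {
        failCommands: ["getMore"],
        closeConnection: true
    }
});
```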
| Comment by A. Jesse Jiryu Davis [ 15/Jan/19 ] |
|
Another question about the backport: since we're not testing sharded transactions or retryable reads in 4.0.x, and the transactions tests we want for 4.0 replica sets only exercise writes, is it actually necessary to backport this ticket to 4.0? This ticket prevents a find or getMore from a secondary from spuriously triggering the failpoint, but the transactions tests mostly do writes. Remind me: is the backport needed because some of the transactions tests do involve find and getMore? Thanks.
| Comment by A. Jesse Jiryu Davis [ 15/Jan/19 ] |
|
It's
| Comment by Jeremy Mikola [ 15/Jan/19 ] |
|
I think the primary use case for failCommand is related to transactions, so I think we can do without backporting the fail point support to mongos 4.0.x. IIRC, the only other place drivers use this fail point is in retryable writes' Network Error Tests; we don't run those tests on mongos today due to the missing fail point, so I think it's fine if we defer doing so until 4.2+. Can you point me to the server ticket where failCommand support was added to mongos in 4.2? I'd like to create a spec ticket to instruct drivers to start running those retryable writes tests on mongos 4.2+.
| Comment by A. Jesse Jiryu Davis [ 15/Jan/19 ] |
|
jmikola, it was pointed out to me that mongos doesn't support the failCommand fail point at all in 4.0.x. Do you think it's worth backporting the whole failCommand fail point to mongos for 4.0.x, or is it sufficient to backport the failCommand fix for replica sets? I've completed the replica set fix in a code review; the mongos backport would be much more effort.
| Comment by A. Jesse Jiryu Davis [ 08/Jan/19 ] |
|
Requesting backport to 4.0.
| Comment by Jeremy Mikola [ 08/Jan/19 ] |
|
jesse: Is there another ticket for the requested backport to 4.0.x? If not, should this issue have a label indicating the backport request? Just want to make sure it wasn't overlooked.
| Comment by Githook User [ 12/Dec/18 ] |
|
Author: {'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}
Message:
| Comment by A. Jesse Jiryu Davis [ 10/Dec/18 ] |
|
Drivers team (shane.harvey and vincent.kam) request a backport to 4.0.x for more consistent testing.
| Comment by A. Jesse Jiryu Davis [ 10/Dec/18 ] |
|
Oh, good point, I was stuck in mistaken thinking. Yes, calling "configureFailPoint" on the mongos will work for testing Retryable Reads with a sharded cluster.
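As a sketch of that approach (the host/port and error code are placeholders, and it assumes a mongos version that supports the failCommand fail point): the test arms the failpoint on the mongos itself rather than on a shard member, so only the driver's command through that mongos is affected.

```js
// Connect directly to the mongos (address is a placeholder) and arm the
// failpoint there; the shard members never see it, so replication traffic
// within the shards cannot consume it.
const mongos = new Mongo("localhost:27017");
mongos.getDB("admin").runCommand({
    configureFailPoint: "failCommand",
    mode: {times: 1},
    data: {failCommands: ["find"], errorCode: 11600} // 11600 = InterruptedAtShutdown, a retryable error
});
```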
| Comment by Andy Schwerin [ 10/Dec/18 ] |
|
Why not fail find commands in the router for testing sharded clusters?
| Comment by A. Jesse Jiryu Davis [ 10/Dec/18 ] |
|
Actually, I think we need to reconsider. The goal is to test Retryable Reads with the failCommand failpoint while ignoring replication finds and getMores. Ignoring "internalClient" works great for that in a multi-node replica set, but it doesn't work in a sharded cluster of multi-node replica sets. The problem is that mongos->mongod connections are also marked as "internal". So if we want failCommand to fail the driver's "find" command in a sharded cluster, we have only uncomfortable options:
I'd like to propose a new "replication: true" parameter again. If failCommand ignores replication connections, we can fulfill the goal of testing Retryable Reads, and we can test it the same way with all topology types! This is much better. It's true that Andy's $out scenario will also appear to be a "replication connection", but we don't plan to test that $out scenario with the failCommand failpoint, whereas we do plan to test "find" and "getMore" with failCommand. I think that "replication: true" solves the problem we want to solve.
| Comment by A. Jesse Jiryu Davis [ 03/Dec/18 ] |
|
That sounds simple: "failCommand ignores all internalClient connections". I'll proceed.
| Comment by Tess Avitabile (Inactive) [ 03/Dec/18 ] |
|
That's a good point. We have the internalClient field in isMaster, which is used to tag transport sessions with kInternalClient. We could use that here.
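For context, roughly what that distinction looks like on the wire (the exact field shape below is an approximation, not quoted from the server code or this ticket): other cluster members identify themselves by sending internalClient in their initial isMaster, which is what gets the session tagged kInternalClient, while a driver's isMaster carries no such field.

```js
// Approximate handshake from another cluster member; the internalClient
// field (wire versions are placeholders) causes the session to be tagged
// as kInternalClient, which failCommand could then skip.
db.adminCommand({
    isMaster: 1,
    internalClient: {minWireVersion: NumberInt(0), maxWireVersion: NumberInt(7)}
});

// A driver's handshake has no internalClient field, so its session stays
// untagged and failCommand can still target its commands.
db.adminCommand({isMaster: 1});
```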
| Comment by Andy Schwerin [ 03/Dec/18 ] |
|
I'm not a fan of it. For one thing, we can't actually distinguish "replication" connections from other internal network connections, because they share a pool on the "client node". For example, if you're doing aggregation with $out, a secondary might be sending $out writes to the primary of its own replica set as part of executing the user aggregation. That's not a "replication" connection, is it? What about when it's sending those writes to the primary of a different shard replica set? I think it will be difficult to manage.
There may already be another signal that the connection is from an "internal" operator, though. Would that be enough? I think there's something, because FCV uses it to control the reported wire protocol version. I'll find you IRL to chat, and we can sum up here after.
| Comment by A. Jesse Jiryu Davis [ 03/Dec/18 ] |
|
I don't think we should do the solution where failCommand ignores commands from the __system user: that would require auth for testing (which is a new requirement), and if you forget to enable auth, some tests will fail some of the time for a very subtle reason. We'll have to remember that detail forever.
I propose adding an isMaster parameter, "replication: true", that's passed on the replication, initial-sync, and heartbeat connections. This allows replication connections to be tagged, and in shouldActivateFailCommandFailPoint we'll excuse replication connections from failCommand. schwerin, we've discussed extensions to isMaster in the past; do you have objections to passing "replication: true" with the initial isMaster on replication connections?
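To make the proposal concrete, this is roughly the handshake shape being suggested ("replication" is the proposed parameter in this comment, not an existing isMaster field):

```js
// Sketch of the proposed handshake: oplog-tailing, initial-sync, and
// heartbeat connections would send this extra field so the server could
// exempt them from failCommand.
db.adminCommand({isMaster: 1, replication: true});
```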
| Comment by Vincent Kam (Inactive) [ 20/Nov/18 ] |
|
Thanks tess.avitabile!
| Comment by Tess Avitabile (Inactive) [ 19/Nov/18 ] |
|
vincent.kam, thanks for the information. We will do our best to get this complete before driver testing starts.
| Comment by Vincent Kam (Inactive) [ 19/Nov/18 ] |
|
Thank you Tess! The spec proof of concept is currently due 2019-01-15, so ideally it'd be nice to have this ticket completed about two weeks before then; my apologies for the tight timing!
>> 2. Is it essential to test against multinode replica sets[?]
Theoretically, drivers may be able to limit testing to mongoses and standalones, but it would be highly desirable to test against multinode replica sets.
>> 3. [I]s this the only way to test against multinode replica sets?
Without the requested enhancements to failpoints, we currently do not see any way for drivers to test against multinode replica sets.
[edit] Dates.
| Comment by Tess Avitabile (Inactive) [ 16/Nov/18 ] |
|
vincent.kam, yes, that's something we could consider. I will reopen this ticket and put it in Needs Scheduling for us to triage. Can you please help us determine the priority/timing of this work? When would the drivers plan to start retryable reads testing? Is it essential to test against multinode replica sets, and is this the only way to test against multinode replica sets?
| Comment by Vincent Kam (Inactive) [ 16/Nov/18 ] |
|
I believe this issue will affect retryable reads testing. For example, my current understanding is that when trying to test a retryable db.coll.find(...) on a replica set, we would create a fail point like so:
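A representative configuration for this (the mode and error code here are assumptions, not quoted from the retryable reads tests):

```js
// Arm failCommand on the primary so the next "find" fails with a retryable
// error, letting the driver verify that it retries the read.
db.adminCommand({
    configureFailPoint: "failCommand",
    mode: {times: 1},
    data: {
        failCommands: ["find"],
        errorCode: 11600  // InterruptedAtShutdown, a retryable error code
    }
});
```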
However, the secondaries tailing the oplog could trigger this failpoint, creating a race condition while testing. Would it be possible to implement the solution proposed by Andy and Spencer in the comments above? This issue isn't a blocker, as drivers could theoretically test against only standalones, single-node replica sets, and mongoses, but it would definitely be nice to test against multinode replica sets, as that's one of the primary use cases for retryable reads.
| Comment by Tess Avitabile (Inactive) [ 22/May/18 ] |
|
Instead we will implement the more general solution in
| Comment by Shane Harvey [ 15/May/18 ] |
|
I believe that drivers can work with either of those solutions. We are generally only interested in using failCommand to test retryable write commands both inside and outside transactions (including abortTransaction and commitTransaction).
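As an illustration of that usage (the error code and mode are assumptions, not taken from the driver test files): the same failpoint can target transaction commands, for example failing one commitTransaction so the driver's retry logic can be exercised.

```js
// Fail the next commitTransaction with a retryable error; the driver under
// test is expected to retry the commit.
db.adminCommand({
    configureFailPoint: "failCommand",
    mode: {times: 1},
    data: {
        failCommands: ["commitTransaction"],
        errorCode: 91  // ShutdownInProgress, a retryable error code
    }
});
```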
| Comment by Spencer Brody (Inactive) [ 15/May/18 ] |
|
Another alternative, which would be made possible if
| Comment by Shane Harvey [ 14/May/18 ] |
|
Sounds good to me. I think we can easily limit these driver tests to only run against replica sets with access control enabled.
| Comment by Spencer Brody (Inactive) [ 14/May/18 ] |
|
The simplest way to do this is probably, as andy suggests, to use the access control system. We could skip activating the failpoint on connections authenticated as the __system user. The downside to this approach is that it wouldn't work when access control is disabled. Does that work for you, shane.harvey?
| Comment by Spencer Brody (Inactive) [ 14/May/18 ] |
|
Oh wait, you might also have problems with the 'find' and 'getMore' commands used for replicating the oplog. Then we probably do indeed also need a way to configure the failpoint to only activate for commands not sent by cluster members.
| Comment by Spencer Brody (Inactive) [ 14/May/18 ] |
|
I think we should probably just add the 'replSetHeartbeat', 'replSetUpdatePosition', and 'replSetRequestVotes' commands to the command whitelist for now. Once
| Comment by Andy Schwerin [ 12/May/18 ] |
|
Another option is to adjust the failpoint to apply only to specific users.