[SERVER-37178] [POC] Allow change streams to work with "blocking" implementation of majority reads Created: 17/Sep/18  Updated: 10/Oct/18  Resolved: 10/Oct/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: William Schultz (Inactive) Assignee: William Schultz (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File blocking_majority_reads_poc.diff    
Issue Links:
Related
is related to SERVER-37560 Allow change streams to work with spe... Closed
Sprint: Repl 2018-10-08, Repl 2018-10-22
Participants:

 Description   

We would like to investigate and prototype a solution for implementing change streams correctly without storage engine support for majority reads. The idea is to achieve this by having read queries block until the data they read is majority committed, as opposed to reading from a majority committed snapshot. A core goal of this POC would be to get all (or most) of the change streams tests to pass when using this alternate implementation.



 Comments   
Comment by William Schultz (Inactive) [ 10/Oct/18 ]

charlie.swanson Yep, that is a test that doesn't play nice with the blocking majority reads implementation. It opens a change stream cursor, pauses replication on secondaries, inserts a document on the primary, and then waits to make sure the next returned batch is empty. By default, this just hangs when using blocking majority reads, since it will indefinitely wait on an optime to become committed that won't until replication is restarted. It is possible to make the test fail instead of hang if you specify a maxTimeMS on the initial aggregate command and the getMore. That is, it just returns an error instead of an empty batch. If there is a limited set of tests like this, we may consider writing analogous versions for the blocking majority reads implementation, or altering the test to support both implementations. I will think on this more.

Comment by Charlie Swanson [ 10/Oct/18 ]

william.schultz you point out a good gap in our coverage - we do have one test that I can find which specifically tests that the majority read concern is working: change_streams_require_majority_read_concern.js. Does this test pass with your patch build? Or I just realize that perhaps it can no longer be run if you disable majority read concern support?

Comment by William Schultz (Inactive) [ 10/Oct/18 ]

I think that this POC has successfully served its purpose to help inform the design of change streams with blocking majority reads. At this point, I think that more time spent working on POC code changes would be better served by just working on the full implementation. A diff of the POC changes is attached ( blocking_majority_reads_poc.diff , diffed against 860b39). The insights I feel were gained from this work are as follows:

1. Gained a better understanding of existing change streams code and how to integrate blocking majority reads mechanism. One notable point is that we will likely need to do the majority waiting for getMore inside the command itself, and not in the generic command processing pipeline, since there's no good way to get access to the read concern associated with a cursor from within ServiceEntryPoint. Also, I learned that updateLookup for local mongod queries should automatically read from the same snapshot as the change event read, so it should automatically give majority semantics as long as we block after the command execution has finished.

2. Acquired a better perspective on change streams test coverage, and also realized to be wary of existing coverage for testing the blocking majority reads behavior. It seems that there are many tests don't necessarily depend on the fact that change streams use majority read concern. That is, there are a lot of change streams tests that will read from a node whose lastApplied==lastCommitted. In this case, even if majority reads were disabled or weren't working correctly, it doesn't seem like these tests would notice it. For example, I was able to get the basic change_streams test suite to pass using the POC implementation, but this probably doesn't really tell us much since this only uses a single node replica set, there are no failovers, etc. When doing the full implementation, we should be careful to add some new change streams tests that explicitly test cases when change stream queries actually induce waiting on nodes, in addition to waiting for events against nodes that subsequently undergo roll back. I think that the Jepsen test for blocking majority reads should help improve the coverage in this area, since it induces many rollbacks throughout the test. I didn't think it was worth writing more extensive test coverage as a part of the POC, since it didn't feel valuable in terms of proving out the concept of the implementation. We may want to consider a change streams passthrough that kills nodes or induces rollbacks as well, if we think the time investment is worth it.

Comment by William Schultz (Inactive) [ 20/Sep/18 ]

Note: It may make sense to implement this prototype on the 4.0 branch.

Generated at Thu Feb 08 04:45:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.