[SERVER-46818] Do untimestamped writes in IdempotencyTest Created: 12/Mar/20 Updated: 29/Oct/23 Resolved: 01/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.0-rc0, 4.7.0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Luke Pearson | Assignee: | Lingzhi Deng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Backport Requested: |
v4.4
|
||||||||||||
| Sprint: | Repl 2020-04-06 | ||||||||||||
| Participants: | |||||||||||||
| Description |
|
The unit test db_repl_test, more specifically: CheckUpdateSequencesAreIdempotent writes timestamps on modifies that are out of order. Could the test call rollback to stable between iterations? Or does it have to be this way? For more detail see |
| Comments |
| Comment by Githook User [ 08/Apr/20 ] |
|
Author: {'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}Message: (cherry picked from commit 53f344385322f7e9a93779e4a515167e766182cc) |
| Comment by Githook User [ 01/Apr/20 ] |
|
Author: {'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}Message: |
| Comment by Siyuan Zhou [ 31/Mar/20 ] |
I agree idempotency tests are not testing timestamping behavior and should work with untimestamped writes. |
| Comment by Tess Avitabile (Inactive) [ 31/Mar/20 ] |
|
I also support doing untimestamped writes. lingzhi.deng, I think I misunderstood the issue when we talked about doing untimestamped writes previously. I think it would be fine even if the test did all untimestamped writes, since I think the main concern of the tests is that the contents of oplog entries and semantics of their application is idempotent. |
| Comment by Judah Schvimer [ 31/Mar/20 ] |
|
I don't think these tests are testing timestamping behavior, just how we apply operations in a linear order, so I support just having them do untimestamped writes. |
| Comment by Daniel Gottlieb (Inactive) [ 31/Mar/20 ] |
|
I'm confident untimestamped writes for the "first phase" is logically equivalent to what initial sync can end up with. |
| Comment by Lingzhi Deng [ 31/Mar/20 ] |
|
Yes, it shouldnt be too hard to make the tests do untimestamped writes. But that would not match the initial sync behaviors either. I am not sure if that matters / if thats what we want to test. |
| Comment by Judah Schvimer [ 31/Mar/20 ] |
|
I think that SERVER-45827 could make sure that the initial sync fuzzer covers all cases that the idempotency tests cover today. Maybe I missed something in the above, but can the test just do all untimestamped writes? I think that would provide the same coverage as today without causing problems in the storage engine. It's a unittest, so it shouldn't be too hard to poke through some failpoint to turn off timestamping. |
| Comment by Eric Milkie [ 31/Mar/20 ] |
|
The change that made the test pass again was that the WiredTiger code assertions that update timestamps were in order were removed. It would be better if we could add those assertions back in, as there isn't really anything else preventing MongoDB code from doing this in the future, where it would most likely be a coding mistake. |
| Comment by Tess Avitabile (Inactive) [ 31/Mar/20 ] |
|
What was the change that was made to make the tests pass? Was the invariant turned off just for the test, or was is turned off always? If the invariant was turned off just for the test, then this seems like a fine workaround. Another thing we could consider is removing these tests if we now get coverage from other areas. judah.schvimer, siyuan.zhou, do you think it's possible that we get enough coverage from the initial sync fuzzer that we might no longer need the idempotency tests? |
| Comment by Daniel Gottlieb (Inactive) [ 30/Mar/20 ] |
|
A correction, now that I read the spawning I don't think we've added any new cases in a production system where MongoDB writes things out of timestamp order. The existing known cases (for a production system) were old-style background index builds (on the index entries themselves) and anything using a ghost timestamp (notably, background index builds completing on a secondary which could write to the catalog entries out of timestamp order). The only future project I know of that has the potential for reintroducing out of timestamp order writes would be having clustered indexes and allowing them on secondary (non-_id) indexes. Depending on how things are done, that could bring back the old mmap style "fetch missing documents due to an update oplog entry" behavior for initial sync. |
| Comment by Lingzhi Deng [ 30/Mar/20 ] |
|
Yes, ideally we need to somehow turn everything applied into untimestamped (like erase the history/consolidate the update chains) after each round of oplog application so that it better matches the initial sync behaviors in real world. Applying onto different documents would do it too I believe. To luke.pearson, the tests don't have to work that way. The idea of these tests is to test oplog idempotency even when the changes are already (or partially) reflected in the data. So we cant call rollback_to_stable in these test as that defeats the purposes of these tests. That said, as Dan summarized about, the implementation of the tests that uses oplog application to generate arbitrary versions and then re-apply those entries is not valid and doesnt match real-world mongodb behaviors. |
| Comment by Eric Milkie [ 30/Mar/20 ] |
|
Thanks, Dan – I agree that we should keep this test then. Perhaps instead of applying the oplog entries to the same document twice, we could do this: |
| Comment by Daniel Gottlieb (Inactive) [ 30/Mar/20 ] |
I think the specific problem luke.pearson is mentioning has to do with WT_MODIFY operations. Typical WT_UPDATES are saving a full copy of the new bson document. A WT_MODIFY just saves the diff. My understanding is that prior to durable history, WiredTiger always pulls updates out of cache overflow in the same order they went in. However after durable history, the update chain is restored in timestamp order. An application writing timestamps out of order could be reordered. Reordering a series of WT_MODIFY is likely to result in a bad BSONObject at the end. Given the test likely only cares about the last version after applying all operations, using WT_UPDATES sweeps the problem under the rug (the last update should always be >= any prior update timestamp).
The test is still valid in spirit (unless other things exercise idempotency in other ways). Replication still depends on idempotency during initial sync. As lingzhi.deng said, the cloned documents are really just a version of a document after applying some arbitrary number of operations. The syncing node can only lower bound (as in timestamp value to start applying from) which oplog entry to start with. It's just the implementation detail that the test uses oplog application to generate arbitrary versions is no longer valid. |
| Comment by Lingzhi Deng [ 30/Mar/20 ] |
|
If my understanding is correct, I am afraid we might need a bigger refactoring (or rewriting) of the idempotency tests in order to match the real-world mongodb behaviors. |
| Comment by Lingzhi Deng [ 30/Mar/20 ] |
|
I guess all idempotency unittests in general may have the same problem. It looks like in testOpsAreIdempotent, we always apply all oplog entries once first and then re-apply them (or a subset of them) again based on the sequenceType. So if we use testOpsAreIdempotent, shouldnt there always be a chance where older oplog entries are replayed on top of newer ones? I think the general idea for these tests is to simulate re-applying oplog entries while the changes are already reflected in the data – i.e. idempotency. But in the real-world case of initial sync, I think we always do untimestamped writes during data cloning and so even if the changes are already reflected in the data, they would be untimestamped before re-applying the oplog entries. So it seems that idempotency tests do not quite match with what could happen in the real world. And as Eric pointed out, we shouldnt be replaying older oplog entries on top of newer ones. |
| Comment by Eric Milkie [ 12/Mar/20 ] |
|
This test was originally written without timestamps in mind, and I suspect things just sort of worked by accident after we added timestamping to MongoDB. I suspect this particular test may simply need to be deleted now, as we never replay older oplog entries on top of newer ones without an interim call to recover-to-timestamp. |