[SERVER-60037] Enable the ordered timestamp assertion in MongoDB Created: 17/Sep/21 Updated: 29/Oct/23 Resolved: 15/Nov/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.2.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Luke Pearson | Assignee: | Yuhong Zhang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | CA-PM, post-mortem |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Execution Team 2021-10-18, Execution Team 2021-11-01, Execution Team 2021-11-15, Execution Team 2021-11-29 |
| Participants: | |
| Linked BF Score: | 35 |
| Description |
|
It would help WiredTiger if we could turn on some of its timestamp assertions and, as discussed in a recent meeting, we should start by trying to enable the mixed_mode timestamp assertion. I tried this and ran a MongoDB patch build to check the fallout; a significant number of tests still failed. This is a shared ticket and will need a developer from both the WiredTiger team and SERVER, because fixing the test failures will be an iterative process. Each test failure should get its own ticket to fix it (if possible), and this ticket proceeds once that ticket is completed, with the goal of eventually enabling the assertion and merging it to master. The patch I used to enable it was (it may not apply correctly so just take the config change):
|
| Comments |
| Comment by Githook User [ 15/Nov/21 ] | |||
|
Author: {'name': 'Yuhong Zhang', 'email': 'danielzhangyh@gmail.com', 'username': 'YuhongZhang98'}Message: | |||
| Comment by Githook User [ 20/Oct/21 ] | |||
|
Author: {'name': 'Yuhong Zhang', 'email': 'danielzhangyh@gmail.com', 'username': 'YuhongZhang98'}Message: Revert " This reverts commit bf4cd96645253bb29a32ec0568095049bc49ab77. | |||
| Comment by Yuhong Zhang [ 19/Oct/21 ] | |||
|
Instead of following the approach luke.pearson suggested, we skipped the check on certain namespaces so that the ordered timestamp assertion could be enabled for the rest of the system first, and we filed the linked tickets to investigate solutions to the remaining issues. | |||
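The approach described in this comment can be sketched roughly as follows. This is only an illustration: the helper `usage_for_namespace`, the namespace names, and the choice of `none` as the setting for skipped namespaces are all assumptions, not taken from the actual commit.

```python
from typing import AbstractSet


def usage_for_namespace(namespace: str, skipped: AbstractSet[str]) -> str:
    """Pick a write_timestamp_usage value for one namespace.

    Hypothetical helper: namespaces listed in `skipped` opt out of the
    check (assumed here to mean `none`); everything else gets `ordered`.
    """
    return "none" if namespace in skipped else "ordered"


# Placeholder namespaces, not the ones exempted in the real change.
SKIPPED = frozenset({"exampledb.exempt_collection"})

for ns in ("exampledb.exempt_collection", "exampledb.regular_collection"):
    print(ns, "->", usage_for_namespace(ns, SKIPPED))
```

Keeping the exemptions in one explicit set makes it easy to shrink the list as the linked tickets are resolved.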
| Comment by Githook User [ 19/Oct/21 ] | |||
|
Author: {'name': 'Yuhong Zhang', 'email': 'danielzhangyh@gmail.com', 'username': 'YuhongZhang98'}Message: | |||
| Comment by Luke Pearson [ 13/Oct/21 ] | |||
|
yuhong.zhang the only difference for you is to change write_timestamp_usage=mixed_mode to write_timestamp_usage=ordered; the other configuration options remain the same. | |||
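As a rough illustration of this change, the sketch below assembles a table-creation config string in which only `write_timestamp_usage` differs between the two modes. The helper name `table_create_config` is hypothetical, and the two accompanying options (`assert=(write_timestamp=on)` and `verbose=[write_timestamp]`) are an assumption about which settings stay fixed; the ticket itself names only `write_timestamp_usage`.

```python
# Values WiredTiger documents for write_timestamp_usage (as of the 5.x era).
ALLOWED_USAGES = {"always", "key_consistent", "mixed_mode", "never", "none", "ordered"}


def table_create_config(usage: str = "ordered") -> str:
    """Build a comma-separated config fragment for table creation.

    Hypothetical helper. Only `usage` varies; the assert and verbose
    options are assumed to be the settings that "remain the same".
    """
    if usage not in ALLOWED_USAGES:
        raise ValueError(f"unknown write_timestamp_usage: {usage!r}")
    options = [
        f"write_timestamp_usage={usage}",
        "assert=(write_timestamp=on)",   # assumption: not named in the ticket
        "verbose=[write_timestamp]",     # assumption: not named in the ticket
    ]
    # Join with commas so that no separator is accidentally dropped.
    return ",".join(options)


print(table_create_config("mixed_mode"))
print(table_create_config("ordered"))
```

Building the string from a list rather than by hand-concatenating fragments avoids the kind of parsing error that a missing separator between options would cause.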
| Comment by Luke Pearson [ 13/Oct/21 ] | |||
|
I believe we were aiming lower as the "mixed-mode" assertion allows mixed mode updates while the ordered timestamp assertion does not. I am very happy to aim higher if it is achievable. | |||
| Comment by Daniel Gottlieb (Inactive) [ 13/Oct/21 ] | |||
|
I changed the title to say we are turning on the "ordered" timestamp assertion, which I believe is a stronger guarantee because it disallows both mixed-mode and out-of-order writes. luke.pearson, please reach out if this is not the case, or if we're intentionally shooting for a lower bar. | |||
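The difference between the two assertions can be made concrete with a toy model of the updates applied to a single key. This is a simplified sketch, not WiredTiger's implementation: `first_violation` is a hypothetical function, `None` stands for an update without a timestamp, and treating equal timestamps as acceptable is an assumption.

```python
from typing import Optional, Sequence


def first_violation(usage: str, timestamps: Sequence[Optional[int]]) -> Optional[int]:
    """Return the index of the first update to one key that breaks `usage`.

    Returns None when every update is acceptable. Toy model only.
    """
    if usage not in ("ordered", "mixed_mode"):
        raise ValueError(f"unsupported usage in this sketch: {usage!r}")
    prev: Optional[int] = None  # timestamp of the previous update, None if untimestamped
    for i, ts in enumerate(timestamps):
        if ts is None:
            # Untimestamped update: always allowed in mixed_mode, but a
            # violation under ordered once the key has used a timestamp.
            if usage == "ordered" and prev is not None:
                return i
        elif prev is not None and ts < prev:
            # Out-of-order timestamp: rejected by both settings.
            return i
        prev = ts
    return None


print(first_violation("mixed_mode", [10, 20, None]))  # untimestamped write after timestamped ones
print(first_violation("ordered", [10, 20, None]))
print(first_violation("ordered", [10, 5]))            # out-of-order write
```

In this model every history accepted under `ordered` is also accepted under `mixed_mode`, which matches the description of `ordered` as the stronger guarantee.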
| Comment by Yuhong Zhang [ 13/Oct/21 ] | |||
|
Thanks to the pointers from gregory.noma, we suspect there might be some implementation details on multikey indexes that violate the in-order restriction. I managed to reproduce the issue with the following steps:
| |||
| Comment by Yuhong Zhang [ 01/Oct/21 ] | |||
|
I uploaded a new patch with all three configurations turned on as well as an extra comma to fix the parsing errors, and I believe now it's more accurately showing the list of failing tests. A quick look at the failing tasks: the only failing tests are "jstests/aggregation/group_conversion_to_distinct_scan.js" and "jstests/core/sort_merge.js" under different suites, as well as "StorageTimestampTests" and "repl" in dbtests. All failing tests failed this check and printed out something similar to
There aren't any failures on Enterprise RHEL variants (maybe the configs were not set up there?). | |||
| Comment by Luke Pearson [ 27/Sep/21 ] | |||
|
I'm unsure whether the above patch build is valid; it turns out you need at least 3 configurations enabled and passed in the create config. | |||
| Comment by Luke Pearson [ 17/Sep/21 ] | |||
|
connie.chen, it's not clear exactly which team in SERVER would be required; I think both replication and storage execution? Either way, this ticket represents a lot of shared effort, so keep that in mind for scheduling. louis.williams has some context around this ticket if you need further clarification. |