[SERVER-61820] Investigate test failures from WT-8165 Created: 30/Nov/21 Updated: 18/Jan/22 Resolved: 18/Jan/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Gregory Wlodarek |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Sprint: | Execution Team 2022-01-24 | ||||||||
| Participants: | |||||||||
| Description |
|
A patch build for the work in WT-8165 has shown a few test failures. See this patch. Investigate the failures and determine if they can be fixed before merging WT-8165. If so, fix them. Otherwise, disable the tests and file follow-up tickets to re-enable and fix them to represent the new behavior. |
| Comments |
| Comment by Gregory Wlodarek [ 18/Jan/22 ] | ||||||||||||||
|
Sounds good andrew.morton, I'll close this as won't fix. | ||||||||||||||
| Comment by Andrew Morton [ 16/Jan/22 ] | ||||||||||||||
|
Thanks gregory.wlodarek! Based on this I'll drop the current WT-8165 PR and we'll look at a different approach. | ||||||||||||||
| Comment by Louis Williams [ 11/Jan/22 ] | ||||||||||||||
|
Thanks for the detailed explanation, gregory.wlodarek! luke.pearson/andrew.morton, it seems like we can't merge WT-8165 in its current state unless it continues to use the last commit timestamp for untimestamped writes in transactions. | ||||||||||||||
| Comment by Gregory Wlodarek [ 11/Jan/22 ] | ||||||||||||||
|
louis.williams I'm not sure there's anything we can do here. | ||||||||||||||
| Comment by Gregory Wlodarek [ 11/Jan/22 ] | ||||||||||||||
|
The issue behind these failures is caused by multi oplog entry transactions. As far as I'm aware this can only happen in two cases:
When committing transactions, the operations are packed into applyOps oplog entries. But if any of the two cases above happen, we have to split the transaction into multiple applyOps oplog entries. For example, say gMaxNumberOfTransactionOperationsInSingleOplogEntry=1 and two inserts are performed in a transaction.
Prior to committing, when we're doing the inserts, we're doing untimestamped writes on a storage transaction. Oplog entry generation happens when committing the transaction. When writing oplog entries, we set the storage transactions timestamp to that of the oplog entry and write the oplog entry. So once all of this has happened, we have the following storage transaction:
The failing tests are seeing the two inserts on a.b visible at Timestamp(2, 257) when they're expected to be visible at Timestamp(2, 258). It looks like WT-8165 sets the untimestamped writes with the first commit timestamp, which has historically been the last commit timestamp. MongoDB has a similar mechanism which is disabled for this case (partial transactions). |