[SERVER-56839] Index seeks concurrent with recently-committed prepared transactions can return wrong results
| Created: 11/May/21 | Updated: 29/Oct/23 | Resolved: 25/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.2.0, 4.4.0, 5.0.0 |
| Fix Version/s: | 4.4.7, 5.0.0-rc2, 4.2.16, 5.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Louis Williams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v5.0, v4.4, v4.2 |
| Steps To Reproduce: | Create this failpoint:
And run this test:
|
| Sprint: | Execution Team 2021-05-31 |
| Participants: |
| Linked BF Score: | 50 |
| Description |
|
Read-only queries that perform index scans have the potential to return the wrong keys if they scan near recently-committed prepared transactions. The bug is as follows, in this order:
This does not affect write operations since they enforce and block on prepare conflicts. This only affects read-only queries. |
| Comments |
| Comment by Vivian Ge (Inactive) [ 06/Oct/21 ] |
|
Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you! |
| Comment by Githook User [ 13/Jul/21 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}
Message: This fixes a bug that can result in a seek on an index returning the
(cherry picked from commit b18749767fc53a9b5822312a063afb26883a774a) |
| Comment by Githook User [ 16/Jun/21 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}
Message: This fixes a bug that can result in a seek on an index returning the
(cherry picked from commit b18749767fc53a9b5822312a063afb26883a774a) |
| Comment by Githook User [ 15/Jun/21 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}
Message: This fixes a bug that can result in a seek on an index returning the
(cherry picked from commit b18749767fc53a9b5822312a063afb26883a774a) |
| Comment by Githook User [ 25/May/21 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}
Message: This fixes a bug that can result in a seek on an index returning the |
| Comment by Daniel Gottlieb (Inactive) [ 17/May/21 ] |
|
Thanks for the clarifications louis.williams! I've responded with updated claims for posterity.
Ah, that's an invariant that makes intuitive sense to me. I can see why violating that promise can lead to wrong results.
I think I see. WT makes a txnId "visible" when a transaction is prepared (meaning new readers won't add said txnId to their copies of "concurrent transactions"). But the individual updates themselves are still marked as prepared, resulting in a WT_PREPARE_CONFLICT until that's no longer true. |
| Comment by Louis Williams [ 17/May/21 ] |
|
daniel.gottlieb, I updated some text to hopefully answer your question:
The MongoDB code assumes that if search_near lands on a key that compares lower than the search key, calling next() is guaranteed to return a key that compares higher than the search key. The same logic also applies in the other direction for reverse cursors. "ignore_prepare" doesn't guarantee snapshot isolation and therefore this may not always be true.
To answer your first question: if an operation reads without a timestamp, as is the case in my example, it gets a snapshot where an update is prepared. WiredTiger normally returns WT_PREPARE_CONFLICT for reads until this operation commits/aborts, but "ignore_prepare" allows the operator to return the pre-image. Once the prepared operation commits, it becomes visible. |
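The broken assumption can be illustrated with a toy model. This is a minimal sketch in Python, not WiredTiger or MongoDB code: `ToyIndex`, `ToyCursor`, `naive_seek`, and `guarded_seek` are hypothetical names, the keys "a" through "d" are illustrative, and the toy `search_near` always lands on the lower neighbor to force the problematic case. The reader ignores prepared keys until they commit, which stands in for "ignore_prepare". The guard in `guarded_seek` is one possible defensive check and is not claimed to be the fix that was committed for this ticket.

```python
import bisect


class ToyIndex:
    """Toy sorted index. Prepared keys stay hidden from readers until committed."""

    def __init__(self, committed, prepared=()):
        self.committed = set(committed)
        self.prepared = set(prepared)

    def visible_keys(self):
        # Stand-in for an ignore-prepare read: only committed keys are returned.
        return sorted(self.committed)

    def commit_prepared(self):
        # The prepared transaction commits; its keys become visible immediately.
        self.committed |= self.prepared
        self.prepared.clear()


class ToyCursor:
    def __init__(self, index):
        self.index = index
        self.pos = None

    def search_near(self, key):
        """Position near `key`; return 0 (exact), -1 (lower key), or 1 (higher key)."""
        keys = self.index.visible_keys()
        i = bisect.bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            self.pos = keys[i]
            return 0
        if i > 0:
            # Assumption for the sketch: prefer the lower neighbor when one exists.
            self.pos = keys[i - 1]
            return -1
        if i < len(keys):
            self.pos = keys[i]
            return 1
        raise KeyError("empty index")

    def next(self):
        """Advance to the first currently visible key after the cursor position."""
        keys = self.index.visible_keys()
        i = bisect.bisect_right(keys, self.pos)
        if i == len(keys):
            return None
        self.pos = keys[i]
        return self.pos


def naive_seek(cursor, key, between_calls=lambda: None):
    """Assumes next() after landing low always yields a key >= the search key."""
    if cursor.search_near(key) < 0:
        between_calls()  # simulates a concurrent commit between the two calls
        return cursor.next()
    return cursor.pos


def guarded_seek(cursor, key, between_calls=lambda: None):
    """Keeps advancing until the key actually satisfies the seek bound."""
    if cursor.search_near(key) >= 0:
        return cursor.pos
    between_calls()
    found = cursor.next()
    while found is not None and found < key:
        found = cursor.next()
    return found


index = ToyIndex(committed={"a", "d"}, prepared={"b"})
print(naive_seek(ToyCursor(index), "c", index.commit_prepared))    # prints b

index = ToyIndex(committed={"a", "d"}, prepared={"b"})
print(guarded_seek(ToyCursor(index), "c", index.commit_prepared))  # prints d
```

In the naive version a seek for "c" returns "b", a key that compares lower than the search key, because "b" became visible between search_near and next().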
| Comment by Daniel Gottlieb (Inactive) [ 13/May/21 ] |
|
Clarification: Does the query for "c" have to be something that reads at a timestamp? If there's no read timestamp, my understanding is that if the query for "c" gets a snapshot prior to the prepared transaction committing, the "c" query's storage engine snapshot will never see "b".
Question: I didn't understand the last bullet point regarding the bug:
Is the claim that higher level code isn't properly vetting any $match filters on returned documents? Or is the MDB Index API returning that we found an exact match when that wasn't true (perhaps allowing query to optimize away any double-checking of what gets returned)? |