[SERVER-41888] Shutting down with prepared transaction can invariant during collection validation Created: 24/Jun/19 Updated: 29/Oct/23 Resolved: 08/Jul/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.0-rc3, 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Vesselina Ratcheva (Inactive) | Assignee: | Lingzhi Deng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Minor Change |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Steps To Reproduce: | |
| Sprint: | Repl 2019-07-15 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
If a test shuts down a node without committing or aborting a prepared transaction, collection validation will run into a prepare conflict. Previously, this would result in a hang. After |
| Comments |
| Comment by Githook User [ 08/Jul/19 ] |
|
Author: Lingzhi Deng (ldennis, lingzhi.deng@mongodb.com) Message: (cherry picked from commit 5915e1114d1e45e79642767c8184d514ac957245) |
| Comment by Githook User [ 08/Jul/19 ] |
|
Author: Lingzhi Deng (ldennis, lingzhi.deng@mongodb.com) Message: |
| Comment by Vesselina Ratcheva (Inactive) [ 27/Jun/19 ] |
|
I left some relevant TODOs in a test I wrote, in this section. |
| Comment by Judah Schvimer [ 26/Jun/19 ] |
|
We do not document validate taking a read concern, so this isn't an API change. alyson.cabral, do you see any problems with not allowing validate to accept a read concern or afterClusterTime? |
| Comment by Geert Bosch [ 26/Jun/19 ] |
|
I think it is OK for validate to ignore prepare conflicts: we'd ignore the same conflict on both the collection and all indexes, so everything should still be consistent. It is expected that the result of validate does not include uncommitted transactions. |
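The consistency argument above can be illustrated with a toy model. This is a minimal sketch and not MongoDB code: `ToyCollection`, `validate`, and the `skip_prepared_*` flags are hypothetical names invented for illustration, and the model assumes a single unique index. It only shows that skipping a prepared transaction's writes on both the collection and the index keeps the two views in agreement, while skipping them on just one side produces a mismatch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCollection:
    """Hypothetical toy model of a collection with one unique index."""
    committed_docs: dict = field(default_factory=dict)   # _id -> indexed key
    committed_index: dict = field(default_factory=dict)  # indexed key -> _id
    prepared_docs: dict = field(default_factory=dict)    # writes held by a prepared txn
    prepared_index: dict = field(default_factory=dict)

    def insert(self, _id, key, prepared=False):
        # A write touches the document table and the index together.
        docs = self.prepared_docs if prepared else self.committed_docs
        index = self.prepared_index if prepared else self.committed_index
        docs[_id] = key
        index[key] = _id


def validate(coll, skip_prepared_docs=True, skip_prepared_index=True):
    """Return True when the visible documents and index entries match exactly."""
    docs = dict(coll.committed_docs)
    index = dict(coll.committed_index)
    if not skip_prepared_docs:
        docs.update(coll.prepared_docs)
    if not skip_prepared_index:
        index.update(coll.prepared_index)
    # Rebuild the index the visible documents imply and compare.
    expected_index = {key: _id for _id, key in docs.items()}
    return expected_index == index


coll = ToyCollection()
coll.insert(1, "a")
coll.insert(2, "b")
coll.insert(3, "c", prepared=True)  # prepared, neither committed nor aborted

consistent_when_skipping_both = validate(coll)
consistent_when_skipping_one = validate(coll, skip_prepared_index=False)
```

In this model the result of skipping both sides simply excludes the uncommitted document, which matches the expectation stated in the comment that validate results do not include uncommitted transactions.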
| Comment by Geert Bosch [ 26/Jun/19 ] |
|
One of the reasons validate acquires an X lock is that the storage engine (at least in the case of WiredTiger) may not be able to do a full validation if there are still open cursors on the collection. Another reason is that validate may write to update some stats, such as document count and storage size, that may not be tracked perfectly in the presence of failures and rollbacks, but are recomputed anyway during validation. Rather than trying to improve this, I think it is better to focus on background validation and eventually making that the default and only way to validate collections. |
| Comment by Judah Schvimer [ 24/Jun/19 ] |
|
The investigation in geert.bosch, why does validate acquire an X lock instead of an S lock? Is it ok for it to ignore prepare conflicts? Are there any other X locks we should be worried about? Is it a problem for validate to not accept afterClusterTime? |
| Comment by Bruce Lucas (Inactive) [ 24/Jun/19 ] |
|
Sounds like this is something a customer could encounter as well, and hitting an invariant during shutdown would not be an optimal behavior from a customer perspective. |