[SERVER-41469] Enforce w:1 for creation of transactions table on step-up Created: 03/Jun/19 Updated: 29/Oct/23 Resolved: 01/Jul/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.0-rc4, 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Vesselina Ratcheva (Inactive) | Assignee: | Jason Chan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Sprint: | Repl 2019-07-01, Repl 2019-07-15 |
| Participants: | |
| Linked BF Score: | 9 |
| Description |
|
We create the transactions table on step-up via a DBDirectClient call. That call inherits the default writeConcern, which is a problem if the user has changed it from w:1. In that case, the call waits on that write concern immediately while still holding locks (particularly the RSTL in mode X, taken by the step-up hook). We do not want this, as it can block other operations, including servicing the find commands that secondaries use to replicate. |
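The blocking behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's implementation: `ToyReplicaSet`, its methods, and the `max_rounds` cutoff are hypothetical names invented for illustration, and the real code path goes through DBDirectClient and the RSTL rather than a boolean flag. The model shows why waiting on an inherited write concern greater than 1 cannot complete while the exclusive lock is held, and why an explicit w:1 returns immediately.

```python
class ToyReplicaSet:
    """Hypothetical toy model of one primary and N secondaries (not server code)."""

    def __init__(self, num_secondaries, default_w):
        self.default_w = default_w                # user-configured default write concern
        self.primary_oplog = []                   # writes made on the primary
        self.secondary_positions = [0] * num_secondaries
        self.exclusive_lock_held = False          # stands in for the RSTL in mode X

    def secondaries_fetch(self):
        """One replication round: secondaries issue finds against the primary."""
        if self.exclusive_lock_held:
            return                                # finds cannot be serviced under the X lock
        caught_up = len(self.primary_oplog)
        self.secondary_positions = [caught_up] * len(self.secondary_positions)

    def acks(self):
        """Nodes that have the latest write: the primary plus caught-up secondaries."""
        target = len(self.primary_oplog)
        return 1 + sum(1 for pos in self.secondary_positions if pos >= target)

    def create_table_on_step_up(self, w=None, max_rounds=5):
        """Return True if the write concern is satisfied while the lock is held.

        False means the wait would never finish (the model gives up after max_rounds).
        """
        effective_w = self.default_w if w is None else w   # None = inherit the default
        self.exclusive_lock_held = True
        try:
            self.primary_oplog.append("create config.transactions")
            for _ in range(max_rounds):
                if self.acks() >= effective_w:
                    return True
                self.secondaries_fetch()          # blocked: the lock is still held
            return False
        finally:
            self.exclusive_lock_held = False


rs = ToyReplicaSet(num_secondaries=2, default_w=2)
inherited_ok = rs.create_table_on_step_up()       # inherits w:2, waits under the lock
explicit_ok = rs.create_table_on_step_up(w=1)     # explicit w:1, no wait on secondaries
print(inherited_ok, explicit_ok)                  # prints: False True
```

In this model the write with the inherited default can never gather a second acknowledgement, because the only way for a secondary to catch up is a fetch that the lock holder itself is blocking. Enforcing w:1 for the write removes the dependency on secondaries while the lock is held.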
| Comments |
| Comment by Githook User [ 19/Jul/19 ] |
|
Author: Jason Chan <jason.chan@10gen.com> (jasonjhchan)
Message: (cherry picked from commit a351f48ad122ca59ed45e5df877ef398c099c938) |
| Comment by Githook User [ 01/Jul/19 ] |
|
Author: Jason Chan <jason.chan@10gen.com> (jasonjhchan)
Message: |
| Comment by Vesselina Ratcheva (Inactive) [ 11/Jun/19 ] |
|
judah.schvimer This is indeed a deadlock. The entire replica set can actually fail to make progress because of this, as the primary would not be able to service the finds required for secondaries to replicate the table, while waiting on exactly that table to be replicated. While this can only happen in the relatively niche use case where users modify the default write concern, the fix here is very straightforward and I think we should do it sooner. |
| Comment by Judah Schvimer [ 11/Jun/19 ] |
|
vesselina.ratcheva, what is the user-visible bug here? Is this a deadlock? If it's a bug then it doesn't feel like "Tech Debt". |
| Comment by Vesselina Ratcheva (Inactive) [ 03/Jun/19 ] |
|
While there hasn't been a 4.0 BF, this is also possible on that version, the only real difference being that we take Global X instead of RSTL X (which does not exist yet). |
| Comment by Judah Schvimer [ 03/Jun/19 ] |
|
vesselina.ratcheva, is this a 4.0 bug as well? |