[SERVER-70127] Default system operations to be killable by stepdown Created: 30/Sep/22 Updated: 29/Oct/23 Resolved: 26/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Jiawei Yang |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | repl-shortlist |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Assigned Teams: | Replication |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Repl 2023-03-06, Repl 2023-03-20, Repl 2023-05-01 |
| Linked BF Score: | 135 |
| Description |
|
There are currently around 30 non-test calls to setSystemOperationKillableByStepdown(). Every time we introduce a new thread, there’s a non-obvious requirement to call that function. Failing to do so results in the process crashing if the operation hits a prepare conflict. This is a rare occurrence, which means we risk not catching crashing bugs in testing. In addition to the visual clutter, the API creates the risk that developers add new internal threads that are unkillable when they shouldn't be. It seems that only a few system operations actually need to be unkillable and the vast majority of threads should be killable. We should consider changing the default so that system operations are always killable, and have the limited set of special operations explicitly opt in to being unkillable. |
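The proposed default can be illustrated with a small, self-contained toy model. This is only a sketch of the idea, not the server's code: `ToyOperation`, `setUnkillableByStepdown()`, `stepDown()`, `makeExampleOps()` and `runExample()` are hypothetical names invented for illustration, and the only identifier taken from the ticket is setSystemOperationKillableByStepdown(), which this model makes unnecessary by flipping the default.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a system operation. NOT the server's real operation type.
struct ToyOperation {
    std::string name;
    // Proposed default: every system operation is killable by stepdown.
    bool killableByStepdown = true;
    bool killed = false;

    explicit ToyOperation(std::string n) : name(std::move(n)) {}

    // Hypothetical explicit opt-out for the few operations that must
    // survive a stepdown.
    void setUnkillableByStepdown() {
        killableByStepdown = false;
    }
};

// Simulates a stepdown: interrupts every killable operation and returns
// how many operations were killed.
inline int stepDown(std::vector<ToyOperation>& ops) {
    int killedCount = 0;
    for (auto& op : ops) {
        if (op.killableByStepdown && !op.killed) {
            op.killed = true;
            ++killedCount;
        }
    }
    return killedCount;
}

// Builds two operations: a new background thread whose author configured
// nothing, and a special operation that explicitly opts out.
inline std::vector<ToyOperation> makeExampleOps() {
    ToyOperation ordinary("newBackgroundThread");
    ToyOperation special("specialOperation");
    special.setUnkillableByStepdown();

    std::vector<ToyOperation> ops;
    ops.push_back(ordinary);
    ops.push_back(special);
    return ops;
}

// Checks that only the operation that did not opt out is killed.
inline void runExample() {
    std::vector<ToyOperation> ops = makeExampleOps();
    int killedCount = stepDown(ops);
    assert(killedCount == 1);
    assert(ops[0].killed);
    assert(!ops[1].killed);
}
```

In this model a thread author who forgets to configure anything gets the safe behavior, and the unkillable cases stand out in code review because each one needs an explicit call.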
| Comments |
| Comment by Githook User [ 19/May/23 ] |
|
Author: {'name': 'Josef Ahmad', 'email': 'josef.ahmad@mongodb.com', 'username': 'josefahmad'}Message: (cherry picked from commit https://github.com/10gen/mongo/commit/846ed8250b4c3322e52c2e51c4a6992c6ce7ba34) This cherry-pick also marks the stepup async task as killable by stepdown, Conflicts: |
| Comment by Githook User [ 26/Apr/23 ] |
|
Author: {'name': 'Jiawei Yang', 'email': 'jiawei.yang@mongodb.com', 'username': 'YoungYang0820'}Message: |
| Comment by Githook User [ 25/Apr/23 ] |
|
Author: {'name': 'Sviatlana Zuiko', 'email': 'sviatlana.zuiko@mongodb.com', 'username': 'szuiko'}Message: Revert " This reverts commit c35bad3b048e8d885bf0b7517aacd2349ea81d14. |
| Comment by Githook User [ 25/Apr/23 ] |
|
Author: {'name': 'Jiawei Yang', 'email': 'jiawei.yang@mongodb.com', 'username': 'YoungYang0820'}Message: |
| Comment by Jiawei Yang [ 13/Apr/23 ] |
|
This was reverted so that 7.0.0-rc0 can ship safely, and it will be recommitted after the rc0 branch cut. |
| Comment by Jiawei Yang [ 05/Apr/23 ] |
|
Hi yujin.kang@mongodb.com, thanks for asking. This is planned to be done soon after the rc0 branch cut. |
| Comment by Githook User [ 30/Mar/23 ] |
|
Author: {'name': 'Jiawei Yang', 'email': 'jiawei.yang@mongodb.com', 'username': 'YoungYang0820'}Message: Revert " This reverts commit 9f2867c9da77e2d64df3852f7d4578f10e6f0817. Revert " This reverts commit 26266d5b736f90961a328399dea5d299cd407ab2. |
| Comment by Githook User [ 13/Mar/23 ] |
|
Author: {'name': 'Jiawei Yang', 'email': 'jiawei.yang@mongodb.com', 'username': 'YoungYang0820'}Message: |