[SERVER-39165] Add waitForFailpoint command Created: 23/Jan/19 Updated: 29/Oct/23 Resolved: 25/Oct/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1, 4.2.2 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Matthew Saltz (Inactive) | Assignee: | Cheahuychou Mao |
| Resolution: | Fixed | Votes: | 3 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||
| Backport Requested: |
v4.2
|
||||||||||||||||||||||||||||||||||||
| Sprint: | Sharding 2019-02-11, Sharding 2019-10-21, Sharding 2019-11-04 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Linked BF Score: | 50 | ||||||||||||||||||||||||||||||||||||
| Description |
|
Right now if we want to wait for a failpoint to be hit from a jstest, we need to parse log output to wait until the failpoint outputs some log message. It'd be much easier if we had a waitForFailpoint command that could wait until a failpoint got hit. This would require some "barrier"-esque functionality on the failpoint, which could potentially be useful for its own reasons. |
| Comments |
| Comment by Githook User [ 31/Oct/19 ] | ||||
|
Author: {'username': 'cheahuychou', 'email': 'cheahuychou.mao@mongodb.com', 'name': 'Cheahuychou Mao'}Message: (cherry picked from commit f79498e9aa27765fc363fcb85a65f47af6e20ee1) | ||||
| Comment by Githook User [ 25/Oct/19 ] | ||||
|
Author: {'username': 'cheahuychou', 'email': 'cheahuychou.mao@mongodb.com', 'name': 'Cheahuychou Mao'}Message: | ||||
| Comment by Githook User [ 25/Oct/19 ] | ||||
|
Author: {'username': 'cheahuychou', 'email': 'cheahuychou.mao@mongodb.com', 'name': 'Cheahuychou Mao'}Message: | ||||
| Comment by Githook User [ 25/Oct/19 ] | ||||
|
Author: {'username': 'cheahuychou', 'email': 'cheahuychou.mao@mongodb.com', 'name': 'Cheahuychou Mao'}Message: | ||||
| Comment by Siyuan Zhou [ 11/Oct/19 ] | ||||
|
I feel configuringFailPoint always resetting to 0 is more intuitive since it's more like a "semaphore" after the resetting. I cannot think of a case of the first concern where waiting on the same fail point more than once without resetting it is necessary. Here are the three use cases lingzhi.deng and I discussed if we extend the design in SERVER-42308 and expose the counter:
My point is whether we can figure out an interface that doesn't conflict with SERVER-42308, but I also agree, without an audit of existing complex use cases of fail points, making design choices is more of a guesswork. This ticket is definitely a step forward anyway. | ||||
| Comment by Mira Carey [ 09/Oct/19 ] | ||||
|
There are a couple of things returning a count helps avoid:
I'm also interested in making the minimum diff necessary here. There was a lot of complexity in that signaling design, and not an obvious way forward. I think adding this count, and leaving it's orchestration more or less to the client, is least likely to conflict with whatever you end up pulling together. And if we end up deciding later on that reusing this counter (but with zeroing) is the right thing to do, I'd rather pay that cost then, rather than engineer for the possibility now (because I'm guessing the cost won't be all that high). | ||||
| Comment by Lingzhi Deng [ 08/Oct/19 ] | ||||
|
Is there any particular reason we want to expose the count to client? Or is there any particular reason / use cases that we don't want reset the count on configureFailPoint? The reason I asked is because I still hope SERVER-42308 can be (re)implemented on top of your waitForFailPoint design based on the new atomic counter. | ||||
| Comment by Mira Carey [ 07/Oct/19 ] | ||||
|
Laying out current thoughts on design here:
That would then give a flow like:
| ||||
| Comment by Mira Carey [ 02/Aug/19 ] | ||||
|
re-opening, as SERVER-42308 looks to be a little more complicated to implement. We may still do this first (or in addition to that ticket) | ||||
| Comment by Mira Carey [ 24/Jul/19 ] | ||||
|
SERVER-42308 looks like a slightly more comprehensive ticket that subsumes this one. Given that, I'm going to close this out as a dup and try and pick up the related functionality over in that ticket |