[SERVER-48083] Support rolling back a collMod oplog entry that modifies an index when using rollback via refetch Created: 11/May/20 Updated: 22/Jun/20 Resolved: 22/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Mihai Andrei | Assignee: | Mihai Andrei |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | qexec-team | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Sprint: | Query 2020-06-29 | ||||||||
| Participants: | |||||||||
| Description |
|
Currently, attempting to roll back a collMod oplog entry that targets an ‘index’ field will trigger an fassert during rollback via refetch. This is true for collMod commands that modify an index’s hidden field as well as those which modify an index’s TTL option. Since changes to an index’s TTL option are not considered eligible for rollback via refetch, neither are changes to an index’s hidden option considered eligible at present. |
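For illustration, the two kinds of index-targeting collMod commands described above can be written as plain documents. The Python sketch below is not server code: the collection names, key patterns, and the helpers `as_oplog_entry` and `targets_index` are hypothetical, and the oplog entry shape (`op: "c"` with the command stored under `o`) is a simplifying assumption.

```python
# Hypothetical command documents; collection names and key patterns are made up.
hide_index = {
    "collMod": "orders",
    "index": {"keyPattern": {"customerId": 1}, "hidden": True},
}
change_ttl = {
    "collMod": "sessions",
    "index": {"keyPattern": {"createdAt": 1}, "expireAfterSeconds": 3600},
}
# A collMod that does not touch an index, for contrast.
change_validation = {"collMod": "orders", "validationLevel": "moderate"}


def as_oplog_entry(db, cmd):
    """Wrap a command in an assumed, simplified command oplog entry shape."""
    return {"op": "c", "ns": f"{db}.$cmd", "o": cmd}


def targets_index(oplog_entry):
    """Return True if the entry is a collMod command carrying an 'index' field."""
    if oplog_entry.get("op") != "c":
        return False
    cmd = oplog_entry.get("o", {})
    return "collMod" in cmd and "index" in cmd


if __name__ == "__main__":
    for cmd in (hide_index, change_ttl, change_validation):
        print(cmd["collMod"], targets_index(as_oplog_entry("shop", cmd)))
```

Under these assumptions, both the hidden change and the TTL change are flagged by `targets_index`, which matches the description: either one would hit the fassert during rollback via refetch, while the validation-only collMod would not be flagged.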
| Comments |
| Comment by Mihai Andrei [ 22/Jun/20 ] |
|
Filed |
| Comment by Mihai Andrei [ 16/Jun/20 ] |
|
That sounds reasonable; I think the place to note this would be within the documentation for enableMajorityReadConcern. If it's alright with everyone, I'll close this ticket as won't fix and file a docs ticket to update the linked section in the docs. |
| Comment by Daniel Gottlieb (Inactive) [ 16/Jun/20 ] |
|
If we choose to go ahead with advertising hidden indexes in our 4.4 docs, we should add a note to our docs that replication rollback fails to undo the collMod operation and that the affected nodes can only recover with a resync. |
| Comment by Mihai Andrei [ 16/Jun/20 ] |
|
daniel.gottlieb Yes, hidden indexes are documented for 4.4, per DOCSP-8660. As to whether there is a workaround for this that doesn’t involve resyncing, I’m not sure. |
| Comment by Daniel Gottlieb (Inactive) [ 16/Jun/20 ] |
|
Are hidden indexes documented for 4.4? If a user hits the fassert trying to roll back the operation, is there a workaround for getting the node back into the replica set without requiring a resync? |
| Comment by Mihai Andrei [ 16/Jun/20 ] |
|
william.schultz With regard to your interpretation of how hidden indexes are used, I think your intuition is correct: a user might configure an index's hidden field during some time window, after which they keep it hidden or disable it indefinitely. Unless anyone feels strongly about doing this work or it comes to pass that eMRC=false won't be removed in 4.6, I'm going to close this ticket as 'Won't fix'. |
| Comment by William Schultz (Inactive) [ 11/Jun/20 ] |
|
mihai.andrei With regard to implementation, it turns out we actually added initial support for making collMod oplog entries "reversible" in To address your questions about priority/value of implementing this feature: it seems likely that enableMajorityReadConcern:false will be going away in 4.6 (PM-1769), but I don't think it has been firmly decided yet. milkie might have the better understanding of that decision. So, we would then only be worried about supporting rollback of these hidden index changes on 4.4 clusters that use enableMajorityReadConcern:false. All things considered, I would predict that rollback of a hidden index change operation on an EMRC=false node is relatively rare, but determining whether we need/want to support that might be more of a question for some product people, if they, for example, have a better sense of how many users will also be using the hidden indexes feature. With my current understanding, I would probably think that it is reasonable to not add support for it in rollbackViaRefetch. On the other hand, if the implementation is relatively simple (and doesn't require refetching any extra documents from the sync source, for example), then it might be worth doing. To also address milkie's point:
If I understand correctly, a hidden index might be built but not enabled yet, so that users can optionally "turn it on" to see how it affects performance. This doesn't seem like a very frequent operation i.e. you might turn it on and test the index performance for a bit, and then possibly make a decision to turn it off or leave it on permanently. I don't really see why users would be continually turning it off and on. I suppose if they have many indexes and/or many collections they might be testing out various combinations of hidden/non-hidden indexes, which could contribute to a higher rate of hidden index modifications over a certain period of time. My intuition, however, would be that these changes would be relatively localized in time i.e. there might be some experimentation window where users are learning about the perf characteristics of certain indexes, and then end up leaving the hidden settings fixed. Of course, this intuition could be misaligned with how users will actually use the feature. mihai.andrei you might be able to make a better judgement on it than me. |
| Comment by Mihai Andrei [ 10/Jun/20 ] |
|
If I understand correctly, rollback via refetch only gets used when enableMajorityReadConcern is set to false, forceRollbackViaRefetch is set to true, or the ephemeralForTest storage engine is used. Given this, the decision to add support for rolling back collMod entries which modify an index’s ‘hidden’ field (and not the TTL option, since this isn't supported as of
For what it’s worth, I don’t think this fix is terribly complicated, since, if I understand rollback via refetch correctly, undoing a collMod on an index's hidden field consists of inverting its current value (as well as doing any necessary work to remove redundant operations like flipping the hidden value of an index that is going to be dropped, for example). That said, I would like to get a sense of how important it is to support this before committing to implementing it. Intuitively, I would think that if we don't support rolling back a collMod for TTL options, we wouldn't do so for any index options (including hidden), but milkie makes a good point that modifying an index's TTL option would probably happen less frequently than modifying its hidden option. |
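The undo step described in this comment can be sketched roughly as follows. This is an illustrative Python model, not the rollbackViaRefetch implementation: the function `plan_hidden_undo`, the `(kind, index_name)` tuples, and the `current_hidden` map are all hypothetical. It also assumes that every rolled-back collMod entry actually changed the hidden value, so an odd number of entries for an index means its current value must be inverted, and that an index created within the rolled-back range will be dropped by rollback, making any flip on it redundant.

```python
from collections import Counter


def plan_hidden_undo(rolled_back_ops, current_hidden):
    """Return {index_name: hidden_value_to_restore} for a set of rolled-back ops.

    rolled_back_ops: list of (kind, index_name) tuples, where kind is
        "createIndex" or "collModHidden" (hypothetical, simplified oplog view).
    current_hidden: dict mapping index_name to its current 'hidden' value.
    """
    # Indexes created in the rolled-back range are assumed to be dropped,
    # so flipping their hidden flag would be redundant work.
    to_drop = {name for kind, name in rolled_back_ops if kind == "createIndex"}

    # Count hidden changes per index; each is assumed to have flipped the value.
    flips = Counter(name for kind, name in rolled_back_ops if kind == "collModHidden")

    plan = {}
    for name, count in flips.items():
        if name in to_drop:
            continue
        if count % 2 == 1:
            # An odd number of flips: the original value is the inverse of the current one.
            plan[name] = not current_hidden[name]
    return plan


if __name__ == "__main__":
    ops = [
        ("createIndex", "b_1"),
        ("collModHidden", "b_1"),  # redundant: b_1 will be dropped
        ("collModHidden", "a_1"),  # single flip: must be inverted
        ("collModHidden", "c_1"),
        ("collModHidden", "c_1"),  # two flips cancel out
    ]
    current = {"a_1": True, "b_1": True, "c_1": False}
    print(plan_hidden_undo(ops, current))  # prints {'a_1': False}
```

Nothing in this model needs data from the sync source, which is consistent with the expectation that the fix would not be terribly complicated; whether that holds in the real rollback code is not confirmed here.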
| Comment by Eric Milkie [ 12/May/20 ] |
|
The problem that Mihai is describing is related to the code that was added as part of the Hidden Indexes project. We never implemented rollback for index TTL values, so the new code added by Hidden Indexes didn't get that support either. This ticket is to consider adding that support, presumably because adding/removing the hidden field is going to be more common than changing an index's TTL time, and thus the danger of hitting this fassert is greater. |
| Comment by Craig Homa [ 12/May/20 ] |
|
It isn't clear to Query that this falls under their responsibilities and we are not sure why it is currently not feasible to roll back a collMod oplog entry under these circumstances. Could you shed some light on what work would be needed to permit this? CC milkie and judah.schvimer |