[SERVER-43329] renameCollectionWithinDBForApplyOps should not use temp collections on primary Created: 13/Sep/19 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Assigned Teams: | Replication |
| Operating System: | ALL |
| Participants: | |
| Linked BF Score: | 25 |
| Description |
|
In the case of renaming a collection when the target exists but has a UUID which does not match the drop target UUID of the op being applied, we rename the old target out of the way. This is necessary for idempotence in oplog application, but we should not do it when using applyOps on the primary (i.e., when opCtx->writesAreReplicated() is true). Instead, we should return NamespaceExists, as the ordinary rename command would. Using opCtx->writesAreReplicated() for the distinction is consistent with how we treat a "renameOpTime" in applyOps. |
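The proposed check can be sketched as follows. This is a toy Python simulation, not the real C++ server code; the function and parameter names are hypothetical, and `catalog` is a stand-in (namespace -> collection UUID) for the real collection catalog.

```python
# Toy sketch of the proposed decision in renameCollectionWithinDBForApplyOps.
# All names are illustrative; `catalog` maps namespace -> collection UUID.

def rename_for_apply_ops(catalog, source, target, drop_target_uuid,
                         writes_are_replicated):
    existing_uuid = catalog.get(target)
    if existing_uuid is not None:
        if existing_uuid == drop_target_uuid:
            catalog.pop(target)  # target matches the drop target: drop it
        elif writes_are_replicated:
            # applyOps on a primary: behave like the ordinary rename
            # command and refuse, instead of renaming the target away.
            return "NamespaceExists"
        else:
            # Oplog application: rename the unexpected target out of the
            # way to a temporary name, preserving idempotence.
            catalog[target + ".tmp"] = catalog.pop(target)
    catalog[target] = catalog.pop(source)
    return "OK"
```

With `writes_are_replicated=True` and a mismatched target UUID, the rename is rejected and the catalog is left untouched; with `False`, the existing target is moved aside and the rename proceeds.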
| Comments |
| Comment by David Golden [ 26/Sep/19 ] |
|
I agree it's a legitimate bug, but I don't know how important it is to fix compared to other priorities, so I have no opinion on whether or not to close. If it's scheduled, we just need to schedule a mongomirror ticket to account for it, as we do with other applyOps changes. |
| Comment by Judah Schvimer [ 26/Sep/19 ] |
|
david.golden, am I correct that you would like us to close this "Won't Fix"? |
| Comment by David Golden [ 25/Sep/19 ] |
|
Solution #2 will cause problems for mongomirror, as it won't know how to handle a failure at step 6.

Though on reflection, I'm not sure the scenario is actually realistic. Oplog tailing and cloning on node 2 would happen between steps 1 and 2, so the collection list would only include A with UUID1; besides, seeing any rename in the oplog during initial sync is an immediate failure anyway (because we can't query by UUID yet).

However, during tailing, assuming mongomirror is not preserving UUIDs, IIUC any rename would run into this problem, because renaming onto an existing target (presumably with dropTarget true) still wouldn't have the right target UUID for the drop. So in this case, the server erroring would mean mongomirror would need to catch the error, send a remove (by name) to the destination, then resend the rename op.

So if this change is made, we'll need a mongomirror ticket in the project's 4.4 (or whatever version) epic. |
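The catch-drop-resend workaround described above can be sketched as a toy Python simulation. This is not mongomirror code; `replay_rename`, `apply_rename`, and the op/catalog shapes are all hypothetical stand-ins.

```python
# Toy sketch of the mongomirror workaround: on NamespaceExists, drop the
# conflicting target by name and resend the rename op. `dest` is a toy
# destination catalog mapping namespace -> UUID; all names are illustrative.

def apply_rename(dest, op):
    existing = dest.get(op["to"])
    if existing is not None and existing != op["dropTargetUUID"]:
        # Simulates the proposed server behavior on a primary.
        return "NamespaceExists"
    dest.pop(op["to"], None)          # drop the matching target, if any
    dest[op["to"]] = dest.pop(op["ns"])
    return "OK"

def replay_rename(dest, op):
    status = apply_rename(dest, op)
    if status == "NamespaceExists":
        # Destination UUIDs don't match the source's (mongomirror does not
        # preserve UUIDs), so remove the target by name and retry.
        dest.pop(op["to"])
        status = apply_rename(dest, op)
    return status
```

The first attempt fails because the destination's target collection has a different UUID; after dropping it by name, the resent rename succeeds.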
| Comment by Matthew Russotto [ 24/Sep/19 ] |
|
Yes, when we apply a rename operation that includes a UUID, the source collection name is ignored. Will solution #2 cause a problem with mongomirror? |
| Comment by David Golden [ 24/Sep/19 ] |
|
Based on this scenario, I'm fine with solution #1 (do nothing). |
| Comment by David Golden [ 24/Sep/19 ] |
|
When you say "which renames the temporary to another name" – you recognize this via the UUID (ignoring the name mismatch)? |
| Comment by Matthew Russotto [ 24/Sep/19 ] |
|
The situation occurs when applying a rename operation (using applyOps) which renames a collection, and the target collection exists but either is not expected to, or does not have the expected UUID. This can happen if the oplog is being replayed during recovery, or in a scenario where data has been copied and then the oplog applied (as in initial sync and mongomirror).

Since collection names must be unique, we handle this by renaming the unexpected collection to some random temporary name. We expect that in any valid scenario, we will soon get an oplog entry which renames the temporary to another name, or deletes it. If we do not get such an oplog entry, and this oplog entry gets replicated, we will end up with temporary collections with different names on the primary and secondary, which results in a dbHash mismatch.

The options on the table are:

1) Do nothing. This means users can cause dbHash mismatches using applyOps, but that's already true.

2) Prevent applying such oplog entries on a running primary. In this case, a scenario that causes a rename conflict will fail. Example: On Node 1: … On Node 2: … With the current code this works; with the proposed solution, if node 2 is primary, this would fail at step 6 with NamespaceExists. |
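The dbHash divergence described above can be illustrated with a toy Python simulation. This is not server code: the temporary names are made deterministic per node for the demo (the server picks random ones), and `db_hash` is a crude stand-in that only hashes the collection-name list, which is enough to show the mismatch.

```python
import hashlib

def apply_conflicting_rename(catalog, node_seed):
    """Simulate applying a rename op (db.A -> db.B) whose target exists
    with the wrong UUID: the target is moved to a temporary name chosen
    independently on each node (random in the real server)."""
    tmp = "db.tmp." + node_seed
    catalog[tmp] = catalog.pop("db.B")
    catalog["db.B"] = catalog.pop("db.A")

def db_hash(catalog):
    # Toy stand-in for dbHash: hash the sorted collection names.
    return hashlib.md5(",".join(sorted(catalog)).encode()).hexdigest()

primary = {"db.A": "u1", "db.B": "u9"}
secondary = {"db.A": "u1", "db.B": "u9"}
apply_conflicting_rename(primary, "5f3a")    # each node picks its own
apply_conflicting_rename(secondary, "9c41")  # temporary name
```

If no later oplog entry renames or drops the temporaries, the two nodes keep differently named temp collections and their hashes diverge.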
| Comment by David Golden [ 23/Sep/19 ] |
|
It's a little hard to follow. Can you give me some concrete examples of the different options, please? |
| Comment by Judah Schvimer [ 23/Sep/19 ] |
|
david.golden, we think this is something that mongomirror would not be affected by, since future operations should fix everything up. Do you have any concerns about us not prioritizing this? We've generally been punting on issues of applyOps not validating its inputs. |