[SERVER-30371] Separate renameCollection across DB commands into individual oplog entries Created: 27/Jul/17 Updated: 30/Oct/23 Resolved: 06/Sep/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | Backlog |
| Fix Version/s: | 3.5.13 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Allison Chang | Assignee: | Benety Goh |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Repl 2017-08-21, Repl 2017-09-11 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
Currently, a renameCollection across databases causes a problem during rollback when we try to refetch a document from the renamed collection. Even if we query by UUID, the rename copies the collection into the other database, and that copy is assigned a new UUID. So although the document still exists, just under a different namespace and UUID, we cannot refetch it during rollback. This leads to data corruption between the sync source and the rolling-back node. A fix would be to log a cross-database renameCollection as a set of create, insert, and delete oplog entries instead of a single oplog entry. Then, even if we cannot refetch a document by UUID during rollback, once the node has transitioned out of ROLLBACK and back into SECONDARY state it can apply the insertions and maintain consistency. |
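As a point of reference, here is a minimal sketch (assumptions: a replica-set member on localhost:27017, the PyMongo driver, and made-up database/collection names) that performs a cross-database rename through the renameCollection admin command and then prints the newest oplog entries, so you can see whether the rename was logged as one renameCollection command entry or as the separate create/insert/drop entries proposed here:

```python
# Sketch: observe how a cross-database rename is represented in the oplog.
# Assumes a replica-set member on localhost:27017 and the PyMongo driver;
# database and collection names are made up for the example.
from pymongo import MongoClient

client = MongoClient("localhost", 27017)

# Seed a small source collection and make sure the target database is empty.
client.drop_database("srcdb")
client.drop_database("dstdb")
client["srcdb"]["coll"].insert_many([{"_id": i} for i in range(3)])

# renameCollection across databases is run against the admin database.
client.admin.command({"renameCollection": "srcdb.coll", "to": "dstdb.coll"})

# Print the most recent oplog entries (newest first). Before this change the
# rename shows up as a single command entry; with the change it should appear
# as a sequence of create/insert/drop entries.
for entry in client.local["oplog.rs"].find().sort("$natural", -1).limit(10):
    print(entry["op"], entry["ns"], entry.get("o"))
```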
| Comments |
| Comment by Benety Goh [ 06/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 06/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 06/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 06/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 06/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: This is done after creating the temporary collection and indexes and before |
| Comment by Githook User [ 06/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 06/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 05/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: Revert " This reverts commit 42bfda31eae322ac190e0c8cd831ca73f6e78f18. |
| Comment by Githook User [ 05/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 05/Sep/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 31/Aug/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 30/Aug/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 29/Aug/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Githook User [ 28/Aug/17 ] |
|
Author: Benety Goh (benety@mongodb.com)
Message: |
| Comment by Spencer Brody (Inactive) [ 27/Jul/17 ] |
|
Spoke with geert.bosch just now and we agreed that #2 doesn't actually work, since the sync source can perform a rename across databases after the sync target has already found the common point, and the sync target would have no way to detect or handle that. For #3 to work we'd have to persistently record somewhere that we are in rollback and encountered a rename, so that if we crashed we'd know during startup recovery to refetch any collection for which we see a cross-db renameCollection entry. That makes startup recovery rely even more on having a sync source available, which is something we've been trying to move away from. After speaking with redbeard0531 we came up with another approach, described in |
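For illustration only, a purely hypothetical sketch of the bookkeeping #3 would require; the marker collection name, document shape, and helper functions below are invented for this example, and the server would maintain such state internally rather than through a driver:

```python
# Purely hypothetical illustration of the persistent marker #3 would need.
# The collection name and document shape are made up for this sketch; the
# server would keep such state internally, not through a driver.
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
markers = client.local["rollbackCrossDbRenames"]  # hypothetical, unreplicated

def note_cross_db_rename_during_rollback(source_ns, target_ns):
    # Called while in ROLLBACK when a cross-db renameCollection entry is seen.
    markers.insert_one({"source": source_ns, "target": target_ns})

def collections_needing_refetch_at_startup():
    # During startup recovery, any recorded namespaces would have to be
    # refetched in full from a sync source before the node is consistent.
    return [doc["target"] for doc in markers.find()]
```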
| Comment by Geert Bosch [ 27/Jul/17 ] |
|
To clarify, the situation above will happen for nodes in the minority that need to roll back if the majority performed a rename across databases since the common point in history. I see three solutions:
For the second solution, we'd track which UUIDs a collection that was renamed across DBs has had while we scan the oplog of our sync source for the common point, so we can roll back updates to that collection by fetching the document from any of these UUIDs. For the third solution, we ignore updates to collections that were renamed across databases, just as we drop updates to any other collection that was dropped by the majority. However, during recovery oplog application (non-steady state) we would then need to refetch collections that were renamed across databases, as they will otherwise not be restored to a consistent state. This refetch pushes the minValid time further forward, making recovery slower. |
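To make the second solution concrete, here is a rough, hypothetical sketch (PyMongo; the helper name, the common-point parameter, and the assumption that the oplog entry's "ui" field carries the renamed collection's UUID are illustrative) of collecting, while scanning the sync source's oplog back to the common point, every namespace/UUID a cross-database-renamed collection has had, so a rollback refetch could try each of them:

```python
# Hypothetical sketch of the UUID tracking described for the second solution:
# while scanning the sync source's oplog back to the common point, remember
# every namespace/UUID a cross-database-renamed collection has had, so a
# rollback refetch can try each of them. Field names follow the oplog command
# entry format (op: "c", ns: "<db>.$cmd", o: {renameCollection, to}); the
# assumption that "ui" holds the renamed collection's UUID, and the
# common_point timestamp parameter, are illustrative only.
from collections import defaultdict
from pymongo import MongoClient

def collect_rename_history(sync_source_uri, common_point):
    oplog = MongoClient(sync_source_uri).local["oplog.rs"]
    # target namespace on the sync source -> set of (namespace, uuid) it has had
    history = defaultdict(set)
    # Scan the entries after the common point, newest first.
    for entry in oplog.find({"ts": {"$gt": common_point}}).sort("$natural", -1):
        if entry.get("op") != "c":
            continue
        cmd = entry.get("o", {})
        if "renameCollection" not in cmd:
            continue
        source_ns, target_ns = cmd["renameCollection"], cmd["to"]
        if source_ns.split(".", 1)[0] == target_ns.split(".", 1)[0]:
            continue  # same-database renames keep the UUID; not the problem case
        history[target_ns].add((source_ns, entry.get("ui")))
    return history
```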
| Comment by Andy Schwerin [ 27/Jul/17 ] |
|
This is only for renames across databases? |
| Comment by Allison Chang [ 27/Jul/17 ] |