[SERVER-30371] Separate renameCollection across DB commands into individual oplog entries Created: 27/Jul/17  Updated: 30/Oct/23  Resolved: 06/Sep/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: Backlog
Fix Version/s: 3.5.13

Type: Task Priority: Major - P3
Reporter: Allison Chang Assignee: Benety Goh
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-30900 remove collMod writeConcern argument ... Closed
depends on SERVER-30908 ReplSetTest.checkOplogs() returns err... Closed
is depended on by SERVER-30798 Disallow running applyOps with a rena... Backlog
is depended on by SERVER-30381 Remove handing for rolling back dropS... Closed
is depended on by SERVER-30382 Create new oplog entry for handling c... Closed
Related
related to SERVER-30383 Preserve collection UUID in renameCol... Closed
related to SERVER-39587 Include the final collection name in ... Closed
is related to SERVER-28285 renameCollection should only generate... Closed
is related to SERVER-30948 Add protection against releasing lock... Closed
is related to SERVER-30212 Use two phase drop for renameCollecti... Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2017-08-21, Repl 2017-09-11
Participants:
Linked BF Score: 0

 Description   

Currently, renaming a collection across databases causes a problem during rollback when we try to refetch a document from the renamed collection. Even if we query by UUID, the rename copies the collection into another database, which changes the collection's UUID. Thus, although the document still exists, just under a different namespace and UUID, we cannot refetch it during rollback. This leads to data corruption between the sync source and the node that is rolling back.

A fix for this would be to log a cross-database renameCollection as a set of create, insert and delete oplog entries instead of a single oplog entry. Then, even if we cannot refetch the document by UUID during rollback, we can apply the insertions once the node has transitioned out of the rollback state and into the secondary state, and so maintain consistency.
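As a rough illustration of the proposed expansion, the Python sketch below turns one cross-database rename into a sequence of individual entries. It is not the server's implementation: the function name expand_cross_db_rename is hypothetical, the entry layout only loosely mimics the oplog's op/ns/ui/o fields, and a drop of the source collection is assumed to stand in for the "delete" step.

```python
import uuid
from typing import Any, Dict, List

OplogEntry = Dict[str, Any]


def expand_cross_db_rename(source_ns: str,
                           target_ns: str,
                           documents: List[Dict[str, Any]]) -> List[OplogEntry]:
    """Return illustrative entries equivalent to a cross-database rename.

    Hypothetical helper: the entry shapes are simplified assumptions.
    """
    source_db, source_coll = source_ns.split(".", 1)
    target_db, target_coll = target_ns.split(".", 1)
    if source_db == target_db:
        raise ValueError("same-database renames are not expanded in this sketch")

    # The copy gets a fresh UUID, which is why a refetch by the old UUID fails.
    target_uuid = str(uuid.uuid4())

    entries: List[OplogEntry] = [
        {"op": "c", "ns": f"{target_db}.$cmd", "ui": target_uuid,
         "o": {"create": target_coll}},
    ]
    # One insert entry per copied document.
    for doc in documents:
        entries.append({"op": "i", "ns": target_ns, "ui": target_uuid, "o": doc})
    # Removing the source is logged last, as its own entry (assumed to be a drop).
    entries.append({"op": "c", "ns": f"{source_db}.$cmd",
                    "o": {"drop": source_coll}})
    return entries
```

Because each insert is an ordinary entry, a node that skipped the refetch during rollback would still receive every document when it applies these entries as a secondary.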



 Comments   
Comment by Benety Goh [ 06/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 remove dropSource from renameCollection oplog entry format
Branch: master
https://github.com/mongodb/mongo/commit/c5b7cbe971635a5fb71cd3d628189ee328284df3

Comment by Githook User [ 06/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 apply_ops_idempotency.js waits for drop-pending collections in all test databases to be removed
Branch: master
https://github.com/mongodb/mongo/commit/e3f3b1357bb2838140f6e9a42ff704237d68afbe

Comment by Githook User [ 06/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 repl9.js waits for source collection to be dropped
Branch: master
https://github.com/mongodb/mongo/commit/d9058667133b60f21bbd887ca03a87948b070656

Comment by Githook User [ 06/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 add namespace to update error message
Branch: master
https://github.com/mongodb/mongo/commit/49621e733bfbdd42a57ae5b63bf018dcd8bd07a4

Comment by Githook User [ 06/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 downgrade global write lock when renaming across databases

This is done after creating the temporary collection and indexes and before
copying the documents from the source collection.
Branch: master
https://github.com/mongodb/mongo/commit/fa2f40a44ea649b801bfa3ba2bbeb0d36020629c

Comment by Githook User [ 06/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 renaming collection across databases logs individual oplog entries
Branch: master
https://github.com/mongodb/mongo/commit/f85b99b90e754efa8d806d6124c902447ddc7481

Comment by Githook User [ 06/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 added js tests for renaming a collection across databases
Branch: master
https://github.com/mongodb/mongo/commit/27b73dd0616a7e6c4956989f753b3ade4291cdd6

Comment by Githook User [ 05/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: Revert "SERVER-30371 renaming collection across databases logs individual oplog entries"

This reverts commit 42bfda31eae322ac190e0c8cd831ca73f6e78f18.
Branch: master
https://github.com/mongodb/mongo/commit/5b0bf41d2305f684386332ee96d6a96a77f70e7f

Comment by Githook User [ 05/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 renaming collection across databases logs individual oplog entries
Branch: master
https://github.com/mongodb/mongo/commit/42bfda31eae322ac190e0c8cd831ca73f6e78f18

Comment by Githook User [ 05/Sep/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 add tests for downgrading global lock from MODE_X to MODE_IX
Branch: master
https://github.com/mongodb/mongo/commit/4db291425761c1b557f10b643242ed08eb542df7

Comment by Githook User [ 31/Aug/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 renameCollection() across databases returns InvalidLength if source collection's indexes are too long for temporary collection
Branch: master
https://github.com/mongodb/mongo/commit/2d568c4ddbe9065d92a5f0443d0c65c8f3a62a87

Comment by Githook User [ 30/Aug/17 ]

Author:

{'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 rename across database does not make target collection temporary if source collection was not temporary
Branch: master
https://github.com/mongodb/mongo/commit/88ef24561ef69ac7756b80256a86515180b830a3

Comment by Githook User [ 29/Aug/17 ]

Author:

{'email': 'benety@mongodb.com', 'username': 'benety', 'name': 'Benety Goh'}

Message: SERVER-30371 disable UUID on temporary collection when renaming across databases if source collection does not contain a UUID
Branch: master
https://github.com/mongodb/mongo/commit/d810265d7b99e2137ab6c50c38a43d8dc3c6d0a4

Comment by Githook User [ 28/Aug/17 ]

Author:

{'username': 'benety', 'name': 'Benety Goh', 'email': 'benety@mongodb.com'}

Message: SERVER-30371 UUIDCatalog::onCreateCollection() always replaces existing entry for uuid
Branch: master
https://github.com/mongodb/mongo/commit/c7f224509f5e8f5dc5d138f02c507bd84a52a274

Comment by Spencer Brody (Inactive) [ 27/Jul/17 ]

Spoke with geert.bosch just now and we agreed that #2 doesn't actually work, since the sync source can do a rename across databases after the sync target already found the common point, and the sync target would have no way to detect or handle that.

For #3 to work we'd have to record persistently somewhere that we were in rollback and encountered a rename, so that if we crashed we'd know during startup recovery that we need to refetch any collections that we see a cross-db renameCollection entry for. This makes startup recovery rely even more on having a sync source available, something we've been trying to do less.

After speaking with redbeard0531 we came up with another approach, described in SERVER-30383, which I think is the cleanest solution. If that's not viable, then I think we need to go with the first solution of expanding cross-db renameCollections into their component writes and replicating them that way.

Comment by Geert Bosch [ 27/Jul/17 ]

To clarify, the situation above will happen for nodes in the minority that need to roll back if the majority performed a rename across databases since the common point in history.

I see three solutions:

Solution | Pros | Cons
Expand rename across databases | Simplest | Causes massive oplog traffic; hard to revert later when no longer needed
Find set of UUID aliases from oplog | Avoids oplog/network traffic in steady state; preserves behavior / atomicity | More complex rollback
Refetch for rename during recovery syncApply | Avoids oplog/network traffic in steady state; preserves behavior / atomicity | Needs renamed coll refetch during recovery; slower recovery

For the second solution, while scanning the oplog of our sync source for the common point, we'd track which UUIDs a collection that was renamed across DBs has had, so we can roll back updates to that collection by fetching the document from any of these UUIDs.
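A minimal sketch of that bookkeeping, in Python, might look as follows. It assumes each cross-database rename can be recognized in the scanned entries and exposes both the old and the new UUID; the fields kind, oldUUID and newUUID and the function collect_uuid_aliases are hypothetical and do not reflect the real oplog format.

```python
from typing import Any, Dict, Iterable, Set


def collect_uuid_aliases(oplog: Iterable[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Map each UUID to every UUID the same logical collection has had.

    Hypothetical helper: 'kind', 'oldUUID' and 'newUUID' are assumed fields.
    """
    groups: Dict[str, Set[str]] = {}
    for entry in oplog:
        if entry.get("kind") != "crossDbRename":
            continue  # only cross-database renames create a new alias
        old_uuid, new_uuid = entry["oldUUID"], entry["newUUID"]
        # Merge the alias sets known so far for both sides of the rename.
        merged = groups.get(old_uuid, {old_uuid}) | groups.get(new_uuid, {new_uuid})
        for alias in merged:
            groups[alias] = merged
    return groups
```

During rollback, a refetch for a document in the collection with UUID u would then try every UUID in groups.get(u, {u}).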

For the third solution, we ignore updates to collections that were renamed across databases, just as we drop updates to any other collection that was dropped by the majority. However, during recovery oplog application (non-steady state), we now need to refetch collections that were renamed across databases, as these will otherwise not restore to a consistent state. This refetch will push the minValid time further forward, making recovery slower.

Comment by Andy Schwerin [ 27/Jul/17 ]

This is only for renames across databases?

Comment by Allison Chang [ 27/Jul/17 ]

spencer schwerin judah.schvimer geert.bosch benety.goh

Generated at Thu Feb 08 04:23:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.