  Core Server / SERVER-39587

Include the final collection name in each oplog entry for commands using temporary collections

    • Type: Improvement
    • Resolution: Won't Do
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Storage
    • Labels:
    • Storage Execution

      Cross-database renameCollection, mapReduce, and aggregate with $out all write to a temporary collection that is renamed to its final name once all the data has been inserted.

      For example, as of SERVER-30371, a collection rename across databases generates these oplog entries: 

      • a create on the final database for a temporary collection name
      • inserts to the temporary collection for all the documents
      • a renameCollection from the temp collection to the final name
      • a drop for the original collection name 

      Looking at the create and insert operations, you cannot tell what the collection's final name will be.
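
      The sequence above can be sketched roughly as follows. This is only an illustration: the entry shapes are abbreviated, the namespaces src.coll and dst.coll are made up, and "tmp_example" is a placeholder rather than the server's actual temporary name format. The helper target_namespace is hypothetical and simply reports the namespace each entry reveals on its own.

```python
# Simplified, illustrative oplog entries for a cross-database rename of
# src.coll to dst.coll. Field layout is abbreviated, and "tmp_example" is a
# placeholder, NOT the server's real temporary collection name format.
oplog = [
    {"op": "c", "ns": "dst.$cmd", "o": {"create": "tmp_example"}},
    {"op": "i", "ns": "dst.tmp_example", "o": {"_id": 1}},
    {"op": "i", "ns": "dst.tmp_example", "o": {"_id": 2}},
    {"op": "c", "ns": "dst.$cmd",
     "o": {"renameCollection": "dst.tmp_example", "to": "dst.coll"}},
    {"op": "c", "ns": "src.$cmd", "o": {"drop": "coll"}},
]


def target_namespace(entry):
    """Return the namespace an entry refers to, using only that entry."""
    db = entry["ns"].split(".", 1)[0]
    o = entry["o"]
    if entry["op"] == "i":
        return entry["ns"]
    if "create" in o:
        return f"{db}.{o['create']}"
    if "renameCollection" in o:
        return o["to"]
    if "drop" in o:
        return f"{db}.{o['drop']}"
    return entry["ns"]


targets = [target_namespace(e) for e in oplog]
for entry, ns in zip(oplog, targets):
    print(entry["op"], ns)
```

      In this sketch, only the renameCollection entry mentions dst.coll; the create and the inserts refer to the temporary name alone.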

      The oplog entries look similar for mapReduce and for aggregation with $out.

      It would be helpful to mongomirror if the oplog entries included the final namespace.

      The reason is that MGOMIRROR-37 introduces the ability to migrate/sync only a subset of databases and/or collections.

      Looking at the create and insert oplog entries on the temporary collection, we cannot tell whether the final namespace matches the user-provided filter, and therefore whether we should apply the ops.

      For now, we can handle this by always applying entries for temporary collections on any database that is fully or partially included in the filter, and then dropping the temporary namespace if we reach the renameCollection op and the final name turns out not to be one we are mirroring. But that ends up being a lot of extra work for mongomirror. It also requires us to hard-code the format of temporary collection names, which seems risky if there is any chance the format could change in future server versions.

      If the create and insert entries had a field indicating the final namespace, that would

      a) tell us that the op is definitely on a temp collection (this would continue to work even if the temp name format changed), and

      b) allow us to determine whether we care about the final namespace and should apply the op or skip it
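
      As a rough sketch of how a consumer could use such a field: the field name "finalNs" below is purely hypothetical (this ticket does not propose a specific name), the filter is modeled as a plain set of namespaces rather than mongomirror's actual filter format, and only insert entries are handled, for brevity.

```python
# Hypothetical field name; the ticket does not specify what it would be called.
FINAL_NS_FIELD = "finalNs"


def should_apply(entry, included_namespaces):
    """Decide whether to apply an insert entry given a namespace filter.

    If the (hypothetical) final-namespace field is present, the op is known
    to be on a temporary collection, and the decision is made against the
    final namespace. Otherwise the entry's own namespace is used.
    """
    final_ns = entry.get(FINAL_NS_FIELD)
    if final_ns is not None:
        return final_ns in included_namespaces
    return entry["ns"] in included_namespaces


# Illustrative entries; "tmp_example" is a placeholder temp collection name.
temp_insert = {
    "op": "i",
    "ns": "dst.tmp_example",
    "o": {"_id": 1},
    FINAL_NS_FIELD: "dst.coll",
}
plain_insert = {"op": "i", "ns": "dst.other", "o": {"_id": 1}}

print(should_apply(temp_insert, {"dst.coll"}))
print(should_apply(temp_insert, {"dst.other"}))
print(should_apply(plain_insert, {"dst.other"}))
```

      With a field like this, the decision needs no knowledge of the temporary name format and no later cleanup of namespaces that turn out to be excluded.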

            backlog-server-execution Backlog - Storage Execution Team
            kaitlin.mahar@mongodb.com Kaitlin Mahar