[SERVER-36832] Allow $out to different database
Created: 23/Aug/18  Updated: 06/Dec/22  Resolved: 17/Mar/20

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Kyle Suarez
Assignee: Backlog - Query Team (Inactive)
Resolution: Duplicate
Votes: 0
Labels: open_todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-46110 expose $_internalOutToDifferentDB fun... Closed
depends on SERVER-13201 Allow new Aggregation $merge stage to... Closed
Duplicate
duplicates SERVER-46110 expose $_internalOutToDifferentDB fun... Closed
Assigned Teams:
Query
Participants:

 Description   

MongoDB 4.2 added the $merge stage, which supports writing its output to a different database. Adding the same support to $out is a little trickier: in a sharded environment, the target collection could live on a different shard.
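
For comparison, here is a minimal sketch of a $merge stage that targets another database through the `into: {db, coll}` form. The helper name `mergeIntoForeignDb` and the database and collection names are hypothetical, chosen only for illustration; the function just builds the pipeline document so that it can be passed to `aggregate()` in the shell.

```javascript
// Hypothetical helper: builds a pipeline whose final stage writes to a
// collection in another database using $merge (available since 4.2).
function mergeIntoForeignDb(dbName, collName) {
    return [{
        $merge: {
            into: {db: dbName, coll: collName},
            whenMatched: "replace",    // overwrite documents with a matching _id
            whenNotMatched: "insert",  // add documents that are not there yet
        }
    }];
}

// Illustrative names only; they do not come from this ticket.
const mergePipeline = mergeIntoForeignDb("reporting", "daily_totals");
// In the shell: db.source.aggregate(mergePipeline);
```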

Original Description

In SERVER-13201, we discovered that a $out to a different database with mode "replaceCollection" will not work properly in a sharded cluster when the primary shard of the source database differs from the primary shard of the output database. If they are different, DocumentSourceOutReplaceCollection will perform the rename operation on the primary shard of the source, rather than the primary shard of the output.
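
The failing condition described above can be checked with a small sketch like the following. It assumes the input is an array of documents shaped like the entries of `config.databases` (`_id` is the database name, `primary` is the primary shard). The helper name `primaryShardsDiffer` and the sample shard names are hypothetical.

```javascript
// Hypothetical helper. `configDatabases` is an array of documents shaped like
// the entries in config.databases, i.e. {_id: <db name>, primary: <shard id>}.
function primaryShardsDiffer(configDatabases, sourceDbName, outputDbName) {
    const primaryOf = (name) => {
        const entry = configDatabases.find((d) => d._id === name);
        // No entry means the database does not exist in the cluster yet.
        return entry ? entry.primary : null;
    };
    const sourcePrimary = primaryOf(sourceDbName);
    const outputPrimary = primaryOf(outputDbName);
    if (sourcePrimary === null || outputPrimary === null) {
        return null;  // cannot tell: at least one database is unknown
    }
    return sourcePrimary !== outputPrimary;
}

// In the shell, connected to a mongos, the input could be obtained with:
//   db.getSiblingDB("config").databases.find().toArray()
const sample = [
    {_id: "source_db", primary: "shard0"},
    {_id: "foreign_db", primary: "shard1"},
];
const differ = primaryShardsDiffer(sample, "source_db", "foreign_db");  // true
```

When this returns `true`, the rename would run on the source database's primary shard, which is the case this ticket describes as broken.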

We will prohibit $out with mode "replaceCollection" to a foreign database in SERVER-13201. This ticket will then track the work to get the metadata operations working properly in a sharded cluster.



 Comments   
Comment by Ted Tuckman [ 17/Mar/20 ]

This was done as part of closing the gap between map reduce and aggregation in SERVER-46110, so closing this as duplicate. This is allowed as of 4.4.
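
For reference, a minimal sketch of the $out form that accepts a target database in 4.4, `{$out: {db, coll}}`. The helper name `outToForeignDb` and the example names are hypothetical; the function only builds the pipeline document.

```javascript
// Hypothetical helper: builds a pipeline whose final stage replaces the
// target collection in another database using the 4.4 $out document form.
function outToForeignDb(dbName, collName) {
    return [{$out: {db: dbName, coll: collName}}];
}

// Illustrative names only; they do not come from this ticket.
const outPipeline = outToForeignDb("reporting", "daily_totals");
// In the shell: db.source.aggregate(outPipeline);
```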

Comment by Charlie Swanson [ 04/Jun/19 ]

nicholas.zolnierz I adjusted the title/description so I think this makes more sense now?

Comment by Asya Kamsky [ 11/Sep/18 ]

kyle.suarez it’s not very important - in fact it’s okay to drop that.

Comment by Kyle Suarez [ 11/Sep/18 ]

After much discussion with esha.maharishi, nicholas.zolnierz, charlie.swanson and david.storch, we've decided that the code that would be written to support this would be duplicating work planned for the "All collections are sharded" Sharding Team project.

asya, how important is it that we support "replaceCollection" mode to a foreign database? Important enough that it must land in 4.2? If so, we believe the correct move is to then prioritize the dependencies in the Sharding Team project appropriately and then revisit this ticket when that time comes.

Comment by Kyle Suarez [ 05/Sep/18 ]

When we do this, friendly reminder to update the authz tests in jstests/aggregation/sources/out/bypass_doc_validation.js and jstests/auth/lib/commands_lib.js as well.

Comment by Kyle Suarez [ 27/Aug/18 ]

My tests for the positive version in SERVER-13201 that we could reuse for this ticket when it's working:

jstests/aggregation/sources/out/mode_replace_collection.js

    //
    // Tests for $out to a database that differs from the aggregation database.
    //
    const foreignDb = db.getSiblingDB("mode_replace_collection_foreign");
    const foreignTargetColl = foreignDb.mode_replace_collection_out;
    const pipelineDifferentOutputDb = [{
        $out: {
            to: foreignTargetColl.getName(),
            db: foreignDb.getName(),
            mode: "replaceCollection",
        }
    }];
 
    foreignDb.dropDatabase();
    coll.drop();
    assert.commandWorked(coll.insert({_id: 0}));
 
    if (!FixtureHelpers.isMongos(db)) {
        // Test that $out implicitly creates a new database when the output collection's database
        // doesn't exist.
        assert.doesNotThrow(() => coll.aggregate(pipelineDifferentOutputDb));
        assert.eq(foreignTargetColl.find().itcount(), 1);
 
        // Change the contents of the source collection and test that running the same aggregation
        // will blow away the contents of the old collection and replace them with new documents.
        coll.drop();
        assert.commandWorked(coll.insert({_id: 1}));
        assert.doesNotThrow(() => coll.aggregate(pipelineDifferentOutputDb));
        assert.eq(foreignTargetColl.find().toArray(), [{_id: 1}]);
    } else {
        // Implicit database creation is prohibited in a cluster.
        let error = assert.throws(() => coll.aggregate(pipelineDifferentOutputDb));
        assert.commandFailedWithCode(error, ErrorCodes.NamespaceNotFound);
    }
