Core Server / SERVER-23274

Collections created with the $out aggregation pipeline in MongoDB 3.2 get dropped on replica set election

    • Fully Compatible
    • ALL

      example:

      SOURCE (collection)
      { "_id" : 1, "srcindex" : 1, "N" : 1 }
      { "_id" : 2, "srcindex" : 2, "N" : 2 }
      { "_id" : 3, "srcindex" : 3, "N" : 3 }
      

      command:

      db.SOURCE.aggregate([
        { $group: { "_id" : { "B" : '$B', "CCG" : '$srcindex' }, "N" : { "$sum" : "$N" } } },
        { $out: "byChapter" }
      ]);
      

      gives:

      { "_id" : { "CCG" : 3 }, "N" : 3 }
      { "_id" : { "CCG" : 2 }, "N" : 2 }
      { "_id" : { "CCG" : 1 }, "N" : 1 }
      

      All replica set members give the same result, and if I add another document to the collection it persists across all members.

      If I then create a new collection with a simple

      db.another.insert({})
      

      that is also present.
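
      To confirm that both collections replicated, each member can be checked directly (a sketch, assuming a direct shell connection to the secondaries):

      // on a secondary, in the database the pipeline ran against:
      rs.slaveOk()            // allow reads on the secondary
      db.getCollectionNames() // expect both "byChapter" and "another" in the list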

      now:

      rs.stepDown()
      

      The machines switch roles, my primary steps down, and I get this in the log:

      2016-03-21T19:39:54.876+0000 I COMMAND  [conn349] Attempting to step down in response to replSetStepDown command
      2016-03-21T19:39:54.876+0000 I REPL     [ReplicationExecutor] transition to SECONDARY
      2016-03-21T19:39:54.876+0000 I NETWORK  [conn353] end connection ?.?.?.? (18 connections now open)
      ...
      2016-03-21T19:39:54.878+0000 I NETWORK  [conn349] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:58627] 
      2016-03-21T19:39:54.885+0000 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:58645 #373 (5 connections now open)
      2016-03-21T19:39:55.180+0000 I REPL     [ReplicationExecutor] replSetElect voting yea for R3:27017 (2)
      2016-03-21T19:39:55.897+0000 I NETWORK  [conn366] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [?.?.?.?:59354] 
      2016-03-21T19:39:55.897+0000 I NETWORK  [conn369] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [?.?.?.?:63795] 
      2016-03-21T19:39:56.525+0000 I REPL     [ReplicationExecutor] Member R3:27017 is now in state PRIMARY
      2016-03-21T19:39:57.491+0000 I REPL     [ReplicationExecutor] syncing from: R3:27017
      2016-03-21T19:39:57.498+0000 I NETWORK  [SyncSourceFeedback] Socket say send() errno:10038 An operation was attempted on something that is not a socket. ?.?.?.?:27017
      2016-03-21T19:39:57.498+0000 I REPL     [SyncSourceFeedback] SyncSourceFeedback error sending update: socket exception [SEND_ERROR] for ?.?.?.?:27017
      2016-03-21T19:39:57.499+0000 I REPL     [SyncSourceFeedback] updateUpstream failed: Location9001: socket exception [SEND_ERROR] for ?.?.?.?:27017, will retry
      2016-03-21T19:39:57.504+0000 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to R3:27017
      2016-03-21T19:39:57.510+0000 I COMMAND  [repl writer worker 15] CMD: drop data_1601.byChapter
      2016-03-21T19:39:57.513+0000 I REPL     [ReplicationExecutor] could not find member to sync from
      
      ----
      

      So the collection created with $out is dropped, but the collection created with the plain insert is not.


      Issue Status as of Apr 14, 2016

      ISSUE SUMMARY
      On MongoDB 3.2, collections created using the $out operator in the aggregation pipeline are incorrectly marked as temporary collections.

      In a replica set, when an election takes place, all temporary collections are removed from the dataset.
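
      (One way to see whether an existing $out target carries the temporary flag is to inspect its collection options via listCollections; this is only a sketch, and it assumes the temp flag is surfaced in the catalog options, which the summary above does not state explicitly:)

      // inspect the catalog options of the $out target; a temporary collection
      // would be expected to show "temp" : true in its options
      db.runCommand( { listCollections: 1, filter: { name: "byChapter" } } )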

      USER IMPACT
      All collections created with MongoDB 3.2 via the $out aggregation stage are dropped if an election takes place on the replica set.

      These collections must be re-created by re-running the aggregation pipeline originally used to create them. To prevent them from being dropped again, please see the WORKAROUNDS section below or upgrade to MongoDB 3.2.5.

      WORKAROUNDS
      After an aggregation command has successfully finished and created a collection (e.g.: agg_out), users can rename the collection with the renameCollection command to avoid running into this issue:

      use admin
      db.runCommand( { renameCollection: "dbname.created_with_$out", to: "dbname.some_other_name" } )
      

      Upon renaming the collection, its temporary flag is cleared, so a future replica set election will not drop the collection. Note that it's easy to restore the required name by executing another renameCollection command.
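
      For example, restoring the original name afterwards is just another renameCollection call (using the same example names as above):

      use admin
      db.runCommand( { renameCollection: "dbname.some_other_name", to: "dbname.created_with_$out" } )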

      AFFECTED VERSIONS
      Only collections created in MongoDB 3.2 via the $out operator from the aggregation pipeline are affected by this issue.

      Collections created using earlier versions of MongoDB that are now hosted on a MongoDB 3.2 replica set are not affected by this issue.

      FIX VERSION
      The fix is included in the 3.2.5 production release.

      Original description

      Any collection created using an aggregate operation with $out will be dropped when the replica set primary steps down.

      I thought it was to do with $lookup, but after removing that stage I find that it happens with all aggregations.

            Assignee: Benjamin Murphy (benjamin.murphy)
            Reporter: Paul Reed (paul.reed)
            Votes: 0
            Watchers: 21
