[SERVER-23274] Collections created with the $out aggregation pipeline in MongoDB 3.2 get dropped on replica set election Created: 21/Mar/16 Updated: 28/Aug/18 Resolved: 24/Mar/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 3.1.9 |
| Fix Version/s: | 3.2.5, 3.3.4 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Paul Reed | Assignee: | Benjamin Murphy |
| Resolution: | Done | Votes: | 0 |
| Labels: | code-and-test |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Completed: | |
| Steps To Reproduce: | example:
command:
gives:
All replica set members give the same result; if I add another item to the collection, it persists across all members. If I then create a new collection with a simple
that is also present. Now:
The machines switch around, and my primary steps down. I get this logging:
So the aggregated out collection is dropped, but the inserted one is not. |
| Sprint: | Query 12 (04/04/16) |
| Participants: | |
| Case: | (copied to CRM) |
| Linked BF Score: | 0 |
| Description |
|
Issue Status as of Apr 14, 2016

ISSUE SUMMARY
In a replica set, when an election takes place, all temporary collections are removed from the dataset.

USER IMPACT
These collections must be re-created by re-running the aggregation pipeline used to create them originally. To prevent them from being dropped again, please see the WORKAROUNDS section below or upgrade to MongoDB 3.2.5.

WORKAROUNDS
Upon renaming the collection, its temporary flag is cleared, so a future replica set election will not drop the collection. Note that it's easy to restore the required name by executing another renameCollection command.

AFFECTED VERSIONS
Collections created using earlier versions of MongoDB that are now hosted on a MongoDB 3.2 replica set are not affected by this issue.

FIX VERSION

Original description
Any collection created using an aggregate operation will be dropped when the replica set primary steps down. I thought it had to do with $lookup, but after removing that stage, I found that it affects all aggregations. |
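The rename workaround described above can be sketched in Python. This is a minimal sketch, not code from the ticket: it assumes a PyMongo-style client whose `admin.command(...)` method sends the `renameCollection` admin command, and the helper name `clear_temp_flag_by_rename` and the `_renamed` scratch suffix are hypothetical. `dropTarget` is kept at false, because the comments on this ticket report that setting it to true triggers the same bug.

```python
def clear_temp_flag_by_rename(client, db_name, coll_name, suffix="_renamed"):
    """Rename a $out-created collection to a scratch name and back again.

    Assumption: `client` behaves like pymongo.MongoClient, i.e.
    client.admin.command(name, value, **kwargs) runs an admin command.
    The helper name and the scratch suffix are illustrative only.
    """
    source = f"{db_name}.{coll_name}"
    scratch = f"{db_name}.{coll_name}{suffix}"

    # First rename: per the ticket, renaming clears the temporary flag.
    # The scratch namespace must not already exist, since dropTarget is false.
    client.admin.command("renameCollection", source, to=scratch, dropTarget=False)

    # Second rename: restore the required name. dropTarget stays false,
    # because dropTarget=true is reported to re-trigger the bug.
    client.admin.command("renameCollection", scratch, to=source, dropTarget=False)

    return source
```

With a real deployment this would be called as `clear_temp_flag_by_rename(client, "mydb", "agg_out")`, where `client` is assumed to be a `pymongo.MongoClient` connected to the primary and the database and collection names are placeholders. Upgrading to 3.2.5 removes the need for it.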
| Comments |
| Comment by Adam Schwartz [ 04/Jul/16 ] |
|
emoshe, I am sorry to hear this bug caused you significant data loss. We will work with you via a support ticket to minimize the impact and help you recover the data if possible. Please understand that we review every server bug and assess whether it warrants a Critical Advisory. In this case, we decided not to issue an advisory. We fixed the bug, wrote a detailed summary (which describes impact and workarounds), added a note in the release announcement, and notified our support team. We will re-assess our prioritization and alert processes in light of your feedback. We appreciate the many JIRA tickets you have opened and contributed to over the past couple of years. Your feedback helps us improve and become more responsive to customer needs. |
| Comment by Elad Moshe [ 03/Jul/16 ] |
|
Unfortunately, we were affected by this bug, which caused us significant data loss. |
| Comment by Ramon Fernandez Marina [ 14/Apr/16 ] |
|
paul.reed, this is to let you know that we've just released the next version in the 3.2 series, 3.2.5, containing a fix for this bug; it's available for download here. Please note that published releases can't be modified, so it is not possible to fix this issue in versions 3.2.4 down to 3.2.0 (3.0.x and 2.6.x versions are not affected) – users affected by this bug should upgrade to 3.2.5 as soon as possible. Thanks again for reporting the issue. Cheers, |
| Comment by Paul Reed [ 25/Mar/16 ] |
|
This seems a pretty big issue not to be backported to 3.2.4 and earlier. At the very least, highlight the issue as known and dangerous! Paul |
| Comment by Ramon Fernandez Marina [ 25/Mar/16 ] |
|
paul.reed, 3.2.4 was already released, so as Benjamin pointed out the first stable release to include a fix will be 3.2.5, currently scheduled for mid-April. As for your last question above, this is a manifestation of the same issue: setting dropTarget to true triggers the bug, so to work around it you'll need to issue the renameCollection with dropTarget set to false (or omitted, since false is the default). Regards, |
| Comment by Benjamin Murphy [ 25/Mar/16 ] |
|
paul.reed, it will be part of 3.2.5, as well as 3.3.4, which is a development release. Both are in the works! In the meantime, the workaround you identified will serve to prevent this from happening to a collection created with $out. |
| Comment by Paul Reed [ 24/Mar/16 ] |
|
Will this fix now be included in 3.2.4? |
| Comment by Githook User [ 24/Mar/16 ] |
|
Author: Benjamin Murphy (benjaminmurphy, benjamin_murphy@me.com) Message: [cherry-picked from commit a19406fdedac2bff515a0b162c8d496b11f4e455] |
| Comment by Githook User [ 24/Mar/16 ] |
|
Author: Benjamin Murphy (benjaminmurphy, benjamin_murphy@me.com) Message: |
| Comment by Paul Reed [ 24/Mar/16 ] |
|
I am seeing another oddity with this. When I aggregate using the C# driver:
collection.Aggregate().Group($"{{ _id:{{ {idclause} }} }}").Project($"{{ {project} }}").Out(outCollection + "_AGGFIX");
collection.Database.RenameCollection(outCollection + "_AGGFIX", outCollection, new RenameCollectionOptions() { DropTarget = true });
and then step down, the outCollection gets dropped. If I run this instead:
collection.Aggregate().Group($"{{ _id:{{ {idclause} }} }}").Project($"{{ {project} }}").Out(outCollection + "_AGGFIX");
In both cases the outCollection does not exist prior to the operation. Is this the same issue? Same fix? |
| Comment by Paul Reed [ 24/Mar/16 ] |
|
No problem. Is there a scenario, with say certain machine rotations, which would have cleared the erroneous drop? I'm wondering why neither I nor anyone else had spotted this earlier. By the way, there is nothing quite as horrid as watching logs go by which proceed to drop a 30-hour aggregation process in a matter of seconds. |
| Comment by Ramon Fernandez Marina [ 23/Mar/16 ] |
|
paul.reed, this is to let you know that we've identified the source of the problem and a fix is in code review now. As you already found out, renaming the collection created by the aggregation pipeline is a suitable workaround to prevent it from being dropped. Thanks for reporting this issue. |
| Comment by Paul Reed [ 21/Mar/16 ] |
|
Also: renaming the collection prior to stepDown will prevent the erroneous drop. |
| Comment by Ramon Fernandez Marina [ 21/Mar/16 ] |
|
Thanks for your report paul.reed, we can reproduce this behavior and we're investigating. |