[SERVER-36737] Aggregating on a collection that is rebuilt using $out will sometimes raise an exception
Created: 17/Aug/18 | Updated: 27/Oct/23 | Resolved: 20/Aug/18
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.0.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Mark [X] | Assignee: | Nick Brewer |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Operating System: | ALL |
| Steps To Reproduce: | Run this python code:
|
| Participants: |
| Description |
|
Consider two collections, COL1 and COL2, where COL1 holds many documents. One aggregation runs over COL1 and ends with an $out stage that writes to COL2. In parallel, a second aggregation runs over COL2 (over and over again, for the purpose of this demonstration). At the moment COL2 is rebuilt, that is, when the aggregation over COL1 with $out finishes, the aggregation over COL2 fails.
The exception raised in the code I have given is: OperationFailure: Error in $cursor stage :: caused by :: all indexes on collection dropped
I tested this with a MongoDB 4.0.0 instance running in a Docker container on a Windows 10 machine. |
| Comments |
| Comment by Nick Brewer [ 20/Aug/18 ] |
|
Segal, I have run the script, and the output it provides actually makes the issue much clearer:
From the placement of the "written to db" and "done writing" messages, we can see that the error in the COL2 aggregation occurs before the collection has been completely overwritten. This means you're attempting to aggregate a collection while its documents and indexes are being dropped. In that case, the aggregation is expected to fail. As you mentioned, adding retry logic is a good way to work around this issue. -Nick |
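For readers landing on this ticket, the retry workaround mentioned above could look roughly like the following. This is only a minimal sketch, not code from the reporter's script: the helper names `retry_on_error` and `is_dropped_during_out`, the attempt count, and the delay are all assumptions chosen for illustration. The helper is driver-agnostic and takes the exception class to catch as a parameter.

```python
import time


def retry_on_error(operation, retry_on, should_retry=lambda exc: True,
                   attempts=5, delay_seconds=0.1):
    """Call operation() and retry when a matching exception is raised.

    Hypothetical helper: name, defaults, and signature are illustrative only.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            # Give up on the last attempt, or when the error is not the
            # transient one we are prepared to retry.
            if attempt == attempts or not should_retry(exc):
                raise
            time.sleep(delay_seconds)


def is_dropped_during_out(exc):
    # Matches the error text reported in this ticket.
    return "all indexes on collection dropped" in str(exc)
```

With PyMongo, this might be called as `retry_on_error(lambda: list(db.COL2.aggregate(pipeline)), OperationFailure, is_dropped_during_out)`, where `OperationFailure` comes from `pymongo.errors` and `db` and `pipeline` are placeholders for the application's own database handle and pipeline. Matching on the message text keeps unrelated failures from being retried silently.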
| Comment by Mark [X] [ 17/Aug/18 ] |
|
I suggest you try running the example code I've given and see for yourself. |
| Comment by Mark [X] [ 17/Aug/18 ] |
|
More emphasis:
This is the aggregation that fails and raises an exception. |
| Comment by Mark [X] [ 17/Aug/18 ] |
|
@Nick thanks for the quick reply!
Our use case is almost exactly as given in the example and has been thought out with great care. I know and understand that $out will remember the indexes and options. That is expected and needed by our solution, so that is okay. However, unless I am missing something, no index or option "changes" in the example I have given. The example has no indexes and no non-default options on either collection, so this issue (as far as I can tell) has nothing to do with those.
Second, you said that if any index or option changes between the start and end of $out, it will fail, implying that the $out aggregation will fail. However, that is not the case: the $out aggregation does not fail; it's the second aggregation that does! I expect that if our application runs an aggregation on COL2, that aggregation will not raise such an exception, regardless of whether any other application runs an $out aggregation into COL2.
Given that the $out aggregation COL1->COL2 runs once in a while, and that this exception is raised in a small time window right after the $out aggregation, my workaround was to add a retry mechanism that tries again if such an exception is raised. |
| Comment by Nick Brewer [ 17/Aug/18 ] |
|
Segal, I'm curious what use case you're trying to accomplish here. Given that you're populating a collection and then immediately overwriting it, $out will remember the indexes and collection options from the original collection; if any index or option changes between the start and end of $out, it will fail. This behavior is expected. Could you clarify what you're expecting this script to accomplish? -Nick |
| Comment by Mark [X] [ 17/Aug/18 ] |
|
Actually, this was found in production, so it also happens on a Linux host. |