[SERVER-36983] Views are unnecessary slow, even for very simple views Created: 04/Sep/18 Updated: 14/Oct/18 Resolved: 05/Sep/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Mark [X] | Assignee: | Kyle Suarez |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | read-only-views | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
Please see the following python code, and the explanation that follows:
Execution result on my machine
The code above puts 100k documents in an empty collection. This collections assumed to have no indexes and nothing else special. Then a view created on this collection, and the view's pipeline is to project only two of the many fields that documents in that collection have. Then, both the view and the collection are queried and it is found out that the raw collection outperforms the view about 100x faster. From my simple tests, I have also found out that: Querying the collection for a b.val that is found (and having limit=1) is fast. Querying the collection for a b.val that is not found isn't fast - it takes about 150ms. **The above results are expected: there is no index on this field. The unexpected results are for the view: *Querying the *view__ for any b.val, existing or not, while also having limit=1, still takes about 150ms! It's as if the document was found in the view and the query continues instead of stopping. |
| Comments |
| Comment by Kyle Suarez [ 05/Sep/18 ] | |||||||||||||||||
|
After being informed by redbeard0531 that I've done the profiling with a debug (i.e. non-production-worthy) build, I have recompiled the server and redone my benchmark. I used the following configuration:
I uploaded two new flame graphs and one more screenshot, and I think the picture is even clearer than before with the debug allocations out of the way. About 39% of the time spent in GetMoreCmd::Invocation::run() for a find on a view is spent in either Document::toBson() or Document::fromBsonWithMetadata(). | |||||||||||||||||
| Comment by Kyle Suarez [ 05/Sep/18 ] | |||||||||||||||||
|
Hello Segal, In short, I believe the slowness you're seeing is not limited just to views but to the aggregation framework in general ( MongoDB has two execution engines that answer queries. The first one, which we internally call the "query system", is the execution engine that's responsible for CRUD commands (insert, find, update, delete). It processes documents using the same binary format (BSON) as they are stored on disk. This format is small and compact, but can be expensive to manipulate. The second execution engine is the "aggregation system", which is used exclusively for the aggregate command and is built on top of the query system. Because aggregation pipelines tend to manipulate and reshape documents, instead of handling data as BSON, it transforms them into a hash table-like object that is internally called a "Document". This data structure makes it easy for the aggregation system to transform data while in the pipeline, and when it is ready to be returned to the user, it is converted back to BSON. I am fairly convinced that the slowness you are seeing is due to this conversion process from BSON to a "Document" and back to BSON. In general, the conversion is beneficial for most pipelines that apply transformations, but is expensive and technically unnecessary for pipelines which only use stages like $match or $project (which is a common pattern for read-only views). To test my theory, I modified your example to remove some variables. I populated a collection with 50000 documents and created a view on that collection with no aggregation stage transformations:
With perf(1), I profiled an empty-query find on the base collection, which will use the query system. You can see a full flame graph in coll-find.svg. Zooming in on GetMoreCmd::Invocation::run(), you can see that the majority of time is spent in CollectionScan::doWork(), which is what we want to see. I did the same thing, but with an empty query find on the view (see view-find.svg). All operations on read-only views are transformed into equivalent aggregations, so this will use the aggregation system. As you can see, there is a significant amount of time spent in Document::toBson() and Document::~Document() (in this experiment, roughly 34% of the time spent in GetMoreCmd::Invocation::run(). For both my experiment view and your $project view, the BSON-only query system is capable of answering the query, and could avoid the conversion from BSON to Document altogether. Because of this, I am going to close this ticket as a duplicate of the following two tickets:
I believe that, taken together, these two optimizations would greatly improve the performance of view queries for your use case. I'm sorry for the extremely long post and I hope this makes sense. Please let me know if you have any questions. Regards, |