[SERVER-58181] changestream fullDocument lookup projections introduce overheads in degenerate cases Created: 30/Jun/21  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Oren Ovadia Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Execution
Participants:

 Description   

We are interested in projecting a subset of fields from the post-image document in a changestream.

Testing (see also CLOUDP-91896) showed that performance improves only when the projected output is smaller than some fraction of the total document size; that fraction depends on network delay, among other factors.

When most fields in the document are projected, the result can be a 30% slowdown.

Is it possible to optimize changestream projection so that the overhead in degenerate cases is minimized? Otherwise it is hard to determine whether or not Search can use this optimization for a given collection.

Note that in this case the projection only applies to fields nested under `fullDocument.*`, so it should be possible to push the projection down to the query system (either through a query optimization or by exposing new APIs).
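
For concreteness, the kind of pipeline in question might look like the sketch below. The helper name `build_post_image_pipeline` and the example field names are hypothetical, and the set of top-level event fields kept is an assumption about what a consumer needs; only the `$project` stage shape and the `fullDocument.*` paths come from the description above.

```python
from typing import Any, Dict, List


def build_post_image_pipeline(fields: List[str]) -> List[Dict[str, Any]]:
    """Build a change stream pipeline that keeps only selected post-image fields.

    `fields` are paths relative to the post-image document,
    e.g. "title" or "meta.lang" (hypothetical names).
    """
    projection: Dict[str, Any] = {
        # Assumed: event fields a consumer still needs to route the change.
        # The event `_id` (resume token) is kept by $project by default.
        "operationType": 1,
        "ns": 1,
        "documentKey": 1,
    }
    for path in fields:
        # Every requested field lives under fullDocument.*
        projection[f"fullDocument.{path}"] = 1
    return [{"$project": projection}]


pipeline = build_post_image_pipeline(["title", "meta.lang"])
```

With PyMongo, such a pipeline would typically be passed as `collection.watch(pipeline, full_document="updateLookup")`, so the projection runs after the post-image lookup. The degenerate case is when `fields` covers most of the document.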



 Comments   
Comment by Bernard Gorman [ 16/Jul/21 ]

I'm going to put this on the backlog for now. Even if we were to push the projection down so that we perform it during the lookup instead of immediately afterwards, the performance benefit would at best be negligible - we would be doing the same amount of work, just at location A instead of location B. If we were going to add this functionality into our optimizer, we would also want it to be a more general-purpose optimization, which would require the ability to break apart a $project based on dependency analysis, which we currently cannot do.
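
As a rough illustration of what breaking apart a `$project` would involve in the simplest case, the hypothetical sketch below splits an inclusion-only spec into the part that depends solely on `fullDocument.*` (a candidate for pushdown into the lookup) and the remainder. The function `split_projection` is invented for illustration and does not reflect server code; it ignores computed expressions, exclusions, and renames, which are what make general dependency analysis hard.

```python
from typing import Any, Dict, Tuple


def split_projection(
    spec: Dict[str, Any], prefix: str = "fullDocument."
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a simple inclusion projection by its dependency on `prefix`.

    Returns (pushdown, remainder):
      pushdown  - paths relative to the looked-up document
      remainder - paths that refer to the rest of the change event
    """
    pushdown: Dict[str, Any] = {}
    remainder: Dict[str, Any] = {}
    for path, value in spec.items():
        if path.startswith(prefix):
            # Strip the prefix so the path applies to the document itself.
            pushdown[path[len(prefix):]] = value
        else:
            remainder[path] = value
    return pushdown, remainder


pushdown, remainder = split_projection(
    {"operationType": 1, "fullDocument.title": 1, "fullDocument.meta.lang": 1}
)
```

Here `pushdown` would hold `title` and `meta.lang`, and `remainder` would hold `operationType`.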
