[SERVER-60176] Delta-updates should only validate the diff for storage Created: 23/Sep/21  Updated: 11/Dec/23  Resolved: 08/Dec/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.3.0, 5.0.22

Type: Improvement Priority: Major - P3
Reporter: Henrik Edin Assignee: Nikita Lapkov (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PDF File Report.pdf    
Issue Links:
Backports
Depends
Problem/Incident
Related
related to SERVER-61587 Research potential improvements in do... Closed
is related to SERVER-60156 Add a way to bypass storageValid() fo... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.0
Sprint: QE 2021-10-18, QE 2021-11-01, QE 2021-11-15, QE 2021-11-29, QE 2021-12-13
Participants:
Linked BF Score: 20

 Description   

Currently, the delta update applies the diff to create the full document that needs to be stored. The full document is then validated for storage in ObjectReplaceExecutor::applyReplacementUpdate. This means that data that is already stored is validated again.

We should be able to validate only the new data inside the docDiff.
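
To illustrate the idea, here is a minimal Python sketch, not the server's C++ implementation. Everything in it is an assumption made for illustration: the diff layout (`set`, `delete`, `sub` keys), the `MAX_DEPTH` limit, and the helper names are hypothetical, and the "validation" is a toy stand-in that only walks nodes and checks nesting depth. It contrasts validating the whole post-image with validating only the values carried by the diff.

```python
MAX_DEPTH = 8  # hypothetical nesting limit, not the server's real value


def count_validated_nodes(value, depth=1):
    """Toy storage validation: visit every node under `value`, enforce a
    depth limit, and return how many nodes were visited."""
    if depth > MAX_DEPTH:
        raise ValueError("document nested too deeply")
    visited = 1
    if isinstance(value, dict):
        for child in value.values():
            visited += count_validated_nodes(child, depth + 1)
    elif isinstance(value, list):
        for child in value:
            visited += count_validated_nodes(child, depth + 1)
    return visited


def apply_diff(doc, diff):
    """Apply a hypothetical diff to `doc` and return the post-image.
    'delete' lists fields to drop, 'set' maps fields to new values,
    'sub' maps fields to nested diffs for existing subdocuments."""
    result = dict(doc)
    for field in diff.get("delete", []):
        result.pop(field, None)
    for field, value in diff.get("set", {}).items():
        result[field] = value
    for field, sub_diff in diff.get("sub", {}).items():
        result[field] = apply_diff(result.get(field, {}), sub_diff)
    return result


def validate_diff_only(diff, depth=1):
    """Validate only the new values inside the diff. `depth` is the depth
    of the (sub)document the diff targets, so limits are still enforced
    relative to the full document."""
    visited = 0
    for value in diff.get("set", {}).values():
        visited += count_validated_nodes(value, depth + 1)
    for sub_diff in diff.get("sub", {}).values():
        visited += validate_diff_only(sub_diff, depth + 1)
    return visited


# A document with a large array, updated by appending one small field.
doc = {"_id": 1, "data": {"a": list(range(1000))}}
diff = {"sub": {"data": {"set": {"b": 42}}}}

updated = apply_diff(doc, diff)
full_cost = count_validated_nodes(updated)  # walks all 1005 nodes
diff_cost = validate_diff_only(diff)        # walks only the 1 new node
```

In this sketch the cost of validating the full post-image grows with the size of the stored document, while the diff-only check grows with the size of the change.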



 Comments   
Comment by Githook User [ 02/Oct/23 ]

Author:

{'name': 'Yuhong Zhang', 'email': 'danielzhangyh@gmail.com', 'username': 'YuhongZhang98'}

Message: SERVER-60156 SERVER-60176 Skip storageValid() for time-series updates and when applying the oplog on secondary

(cherry picked from commit 7591af2652cd1d00f9255033dc5ec3c3d82deca3)
(cherry picked from commit 21c2ad8764c396fc6afae226bbcea8c780e6bdee)

Co-authored-by: Nikita Lapkov <nikita.lapkov@mongodb.com>
Branch: v5.0
https://github.com/mongodb/mongo/commit/9be121627729aa754711f3ff1757ff789da300b5

Comment by Kyle Suarez [ 15/Dec/21 ]

After further correction from acm, this ticket is not contained on the v5.2 branch, so updating the fix version to "5.3.0".

Comment by Kyle Suarez [ 15/Dec/21 ]

Updating the fix version here from "5.3 Required" to "5.2.0-rc0":

kyle:~/code/mongo ♥ git describe 21c2ad8764c396fc6afae226bbcea8c780e6bdee
r5.2.0-alpha-918-g21c2ad8764

Comment by Githook User [ 08/Dec/21 ]

Author:

{'name': 'Nikita Lapkov', 'email': 'nikita.lapkov@mongodb.com', 'username': 'laplab'}

Message: SERVER-60176 Disable document validation for storage when applying the oplog on secondary
Branch: master
https://github.com/mongodb/mongo/commit/21c2ad8764c396fc6afae226bbcea8c780e6bdee

Comment by Geert Bosch [ 29/Sep/21 ]

To be clear, in some cases more than 99% of total execution time for updates is spent in this validation code. This should affect any updates to documents with large arrays or similarly complex structure.

Comment by Henrik Edin [ 28/Sep/21 ]

kyle.suarez I linked SERVER-60156 as a related ticket. This came up during profiling of timeseries inserts where we append fields to an existing object. SERVER-60156 will take care of the immediate need for timeseries, so this is more of a general performance improvement idea. But secondaries using the DeltaExecutor can potentially be impacted if the existing documents are large. We saw quite a major performance impact for timeseries.

Comment by Kyle Suarez [ 28/Sep/21 ]

henrik.edin, what was the motivation for filing this ticket – was it a general performance improvement idea, or is there a different project that might depend on this? The Query Execution team is inclined to consider it, but on the backlog, unless there is a reason we should investigate this more urgently.
