[SERVER-76390] Refine UpdateIndexData class so that it has index to field path mapping Created: 21/Apr/23  Updated: 30/Jan/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Arun Banala Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-70984 Decouple the concepts "in-place updat... Closed
Assigned Teams:
Query Execution
Backport Requested:
v7.3
Sprint: QI 2023-05-15, QI 2023-05-29, QI 2023-06-12, QI 2023-06-26, QI 2023-07-10
Participants:

 Description   

After SERVER-70984, we will stop using the UpdateIndexData data store in CollectionRoutingInfo. We have this information replicated on IndexCatalogEntryImpl class. We should consider keeping a single UpdateIndexData variable in the CollectionQueryInfo, but extend it so that it can store the FieldRef, std::string and allPathIndexed information for each separate index. This way it is also available for columnar indexes to obtain the separate paths without polluting the storage code



 Comments   
Comment by Irina Yatsenko (Inactive) [ 11/Jul/23 ]

alberto.massari@mongodb.com I did quick local profiles back in April to evaluate whether the ticket should have been prioritized for 7.0. I haven't saved the data from those runs except for the following excerpts. I don't know how much of this is still applicable now.

With just the primary index
     3.29%     0.40%  mongod              [.] mongo::IndexCatalogImpl::updateRecord
            |          
            |--2.89%--mongo::IndexCatalogImpl::updateRecord
            |          |--2.59%--mongo::doc_diff::anyIndexesMightBeAffected
            |          |          |--1.29%--mongo::doc_diff::(anonymous namespace)::anyIndexesMightBeAffected
            |          |          |          |--0.37%--mongo::FieldRef::appendPart
            |          |          |           --0.12%--mongo::doc_diff::DocumentDiffReader::nextUpdate
            |          |          |          
            |          |          |--0.49%--mongo::doc_diff::DocumentDiffReader::DocumentDiffReader
            |          |          |--0.27%--__strlen_avx2
            |          |           --0.17%--std::vector<unsigned long, std::allocator<unsigned long> >::_M_default_append
            |          |          
            |           --0.14%--tc_deletearray_sized
 
With 8 secondary indexes
     4.68%     0.38%  mongod              [.] mongo::IndexCatalogImpl::updateRecord
            |--4.30%--mongo::IndexCatalogImpl::updateRecord
            |          |--3.87%--mongo::doc_diff::anyIndexesMightBeAffected
            |          |          |--2.69%--mongo::doc_diff::(anonymous namespace)::anyIndexesMightBeAffected
            |          |          |          |--0.86%--mongo::UpdateIndexData::mightBeIndexed
            |          |          |          |           --0.25%--mongo::FieldRef::getPart
            |          |          |          |          
            |          |          |          |--0.25%--mongo::FieldRef::appendPart
            |          |          |          |--0.21%--mongo::doc_diff::DocumentDiffReader::nextUpdate
            |          |          |          |--0.20%--__memcmp_avx2_movbe
            |          |          |           --0.11%--mongo::doc_diff::DocumentDiffReader::nextSubDiff
            |          |          |          
            |          |          |--0.45%--mongo::doc_diff::DocumentDiffReader::DocumentDiffReader
            |          |           --0.12%--__strlen_avx2
            |          |          
            |           --0.14%--std::vector<unsigned long, std::allocator<unsigned long> >::_M_default_append

My understanding is that the ticket tracks a technical debt Arun wanted to address only with a side-effect of a potential perf improvement (this is why I did the profiles in April). Please follow up with arun.banala@mongodb.com if more clarifications are needed.

Comment by Kyle Suarez [ 11/Jul/23 ]

alberto.massari@mongodb.com, ian.boros@mongodb.com, should we consider scheduling this improvement as part of one of the block processing projects? Or is it low-priority enough to either backlog or close? CC rushan.chen@mongodb.com

Comment by Irina Yatsenko (Inactive) [ 27/Apr/23 ]

I've profiled updating every record in a collection with 1e4 simple documents and 8 secondary indexes: IndexCatalogImpl::updateRecord() takes ~5% (including its callees). Most of this time (~4.5%) is spent inside doc_diff::anyIndexesMightBeAffected() . When updating 1e5 records (again, updating all documents in a larger collection), the relative weight of updateRecord() and anyIndexesMightBeAffected() goes down a little.

With no secondary indexes IndexCatalogImpl::updateRecord() takes ~3.5%.
So it doesn't look like optimizing UpdateIndexData on this path could bring tangible perf benefit.

Generated at Thu Feb 08 06:32:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.