[SERVER-80164] Improve speed of computing shape hash for MatchExpression Created: 16/Aug/23  Updated: 12/Oct/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Charlie Swanson Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: former-pm-2885
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-79736 Hash C++ data structures directly rat... Closed
Assigned Teams:
Service Arch
Participants:

 Description   

We have two similar ideas here worth exploring:

First, similar to this idea: https://github.com/mongodb/mongo/commit/525b08e5016fd1d194943215269cc027c5b8c57b

We could write a custom hasher that hash-combines "shapified" literals rather than the actual literal values.
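To make the first idea concrete, here is a minimal sketch, not the server's implementation. Everything in it is hypothetical and simplified for illustration: `Node`, `Literal`, `LiteralType`, `hashCombine`, and `shapeHash` are made-up names, and the tree is a toy stand-in for MatchExpression. The point is only that the hasher walks the tree directly and mixes in a literal's type tag instead of its value, so two predicates that differ only in their constants hash the same.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical literal: a type tag plus the raw value (kept as text for brevity).
enum class LiteralType { kNumber, kString, kBool };
struct Literal {
    LiteralType type;
    std::string value;
};

// Hypothetical, simplified predicate tree; NOT the real MatchExpression.
struct Node {
    enum class Kind { kEq, kGt, kLt, kAnd, kOr };
    Kind kind;
    std::string path;            // used by comparison leaves
    Literal literal;             // used by comparison leaves
    std::vector<Node> children;  // used by kAnd / kOr
};

// Boost-style hash combine (assumed mixing function, chosen for brevity).
inline void hashCombine(std::size_t& seed, std::size_t value) {
    seed ^= value + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

// Hashes the *shape* of the tree: operators, paths, and literal types,
// but never the literal values themselves.
std::size_t shapeHash(const Node& node) {
    std::size_t seed = 0;
    hashCombine(seed, static_cast<std::size_t>(node.kind));
    if (node.kind == Node::Kind::kAnd || node.kind == Node::Kind::kOr) {
        hashCombine(seed, node.children.size());
        for (const Node& child : node.children) {
            hashCombine(seed, shapeHash(child));
        }
    } else {
        hashCombine(seed, std::hash<std::string>()(node.path));
        // "Shapified" literal: only the type tag contributes to the hash.
        hashCombine(seed, static_cast<std::size_t>(node.literal.type));
    }
    return seed;
}

// Convenience constructors for the toy tree.
Node cmp(Node::Kind kind, const std::string& path, const Literal& lit) {
    Node n{};
    n.kind = kind;
    n.path = path;
    n.literal = lit;
    return n;
}

Node logical(Node::Kind kind, const std::vector<Node>& children) {
    Node n{};
    n.kind = kind;
    n.children = children;
    return n;
}
```

Under these assumptions, the toy equivalents of {age: {$eq: 30}} and {age: {$eq: 57}} produce the same hash, while changing the literal's type changes it. No intermediate serialized shape is built along the way. The duplication concern is visible even here: the hasher has to repeat per-node knowledge of what counts as part of the shape.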

Second, we could take the idea william.qian@mongodb.com wrote in this comment and perform some of the hashing as we go. I think this would likely avoid some of the tech-debt code duplication I'm imagining in the first idea, but I'm not sure how much it would improve perf if we are still building a BSON object, which I don't think we actually need. If the idea is to hook it into the parser itself, then it could definitely be faster, but I'm not sure about the complexity without looking further.



 Comments   
Comment by Charlie Swanson [ 04/Oct/23 ]

sebastien.mendez@mongodb.com, denis.grebennicov@mongodb.com, and jess.balint@mongodb.com - I am removing this idea from the PM-2885 epic since we don't plan to work on it before closing the epic out. I will flag it for scheduling with a proposal to send it to the backlog. Should we link it as related to your projects? I think each of you considered doing something like this. It would probably be a good idea, but I am worried about the complexity and code duplication.

Comment by Charlie Swanson [ 16/Aug/23 ]

cc denis.grebennicov@mongodb.com and sebastien.mendez@mongodb.com - we were talking about the first idea, if I'm remembering correctly.

Comment by Charlie Swanson [ 16/Aug/23 ]

I'm tentatively linking this idea to M3 (WRITING-14659) since I think this would add more tech debt than it is worth, but I am open to reconsidering this, especially if SERVER-79736 doesn't recover as much perf as we want it to. If that ticket does work out, I'd consider either closing this as "won't do" or moving it into another epic like PM-412 which may need it more than we do.
