[SERVER-79736] Hash C++ data structures directly rather than building BSONObj Created: 04/Aug/23  Updated: 19/Dec/23  Resolved: 19/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Charlie Swanson Assignee: Charlie Swanson
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-80151 Make a sub-directory for query stats ... Closed
depends on SERVER-80201 Move QueryStatsEntry to its own file Closed
is depended on by SERVER-76330 Use representative query shape in que... Closed
is depended on by SERVER-85055 Tracking: M1 Performance improvement ... Closed
Problem/Incident
Related
related to SERVER-81144 Compute QueryShapeHash from QueryShap... Closed
related to SERVER-84298 Fix up memory accounting in query sta... Closed
related to SERVER-84299 Move absl::HashOf() off the hot path ... Closed
related to SERVER-84300 Avoid reading the clock to compute $$NOW Closed
is related to SERVER-79018 Implement MatchExpression hasher Closed
is related to SERVER-80164 Improve speed of computing shape hash... Open
Backwards Compatibility: Fully Compatible
Sprint: QO 2023-08-21
Participants:
Linked BF Score: 5

 Description   

To compute the query stats key hash, we build an intermediate BSON object and hash that. This is showing up in our flame graphs, and we can probably avoid some of the intermediate state and associated overhead and memory allocations by just directly supporting a custom hash operator of some of the same things.

To limit the complexity of the patch, this ticket will track the work for all the components of the commands (think readConcern, comment, etc) but not the MatchExpression tree and associated ASTs.



 Comments   
Comment by Githook User [ 19/Sep/23 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-79736 Add custom types for query shapes and the like

A big re-organization of the code, mostly in the name of performance to allow an absl::HashOf the components rather than needing to append them all to a big BSONObj to hash.
Branch: master
https://github.com/mongodb/mongo/commit/1c3cb63e7b3fc67c3d15752f071100e802ed1eb9

Comment by Charlie Swanson [ 16/Aug/23 ]

william.qian@mongodb.com I didn't seriously consider that idea for this ticket - DM me if you'd still like to pitch for it. I think it might make sense in the context of the match expression so I split out SERVER-80164 to track that effort/idea stream

Comment by William Qian [ 08/Aug/23 ]

Along similar veins, I was also thinking that maybe we could build the hash as we built the BSON document. For example, if we had this document:

{
  a: {
    b:
    - c: d
    - e: [{f: g}, {h: {j: k}}]
  }
}

we could hash the elements as we traversed the tree, with the parents' hashes simply being a composition of the child hashes.

hash({
  a: hash({
    b: hash([
      hash({c: hash(d)}),
      hash({
        e: hash([
            hash({f: hash(g)}),
            hash({h: hash({
              j: hash(k)})})])})
    ])
  })
})

We could also use some kind of prefix code to build the tree, since we only really care about the shape and not the actual values, so the representation space isn't quite as large.

Not sure if this idea has been explored or is worth exploring as an alternative.

Generated at Thu Feb 08 06:41:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.