[SERVER-67019] Rename StringSet and StringDataSet to clarify they are unordered Created: 03/Jun/22  Updated: 16/Aug/22  Resolved: 16/Aug/22

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Charlie Swanson Assignee: Charlie Swanson
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

I was just working on a patch and incorrectly assumed that we had no data structure that could support an unordered set of unowned strings (StringData) other than this one: StringData::ComparatorInterface::StringDataUnorderedSet.

It turns out that `StringDataSet` is the same thing, but without the collator interface virtual method overhead. In my context I never needed to support collation, so this would have been the better choice.

To improve visibility and discoverability, I would suggest `UnorderedStringSet` and `UnorderedStringDataSet`.



 Comments   
Comment by Charlie Swanson [ 16/Aug/22 ]

I'm going to close this as "Won't Do" since I didn't anticipate any pushback on the naming. Mathias and I are on different pages, so it seems many team members might be as well. My thinking was that everyone has gotten used to the heavily verbose unordered_blah in C++ standards, so we should just get on board with the verbosity and not have any confusion.

Comment by Mathias Stearn [ 17/Jun/22 ]

"Hash" is certainly better than "Unordered", but I personally would just leave it off. StringSet is our goto type if you want a set of string, and StringMap<V> is our goto type if you want a map from strings to V. Nice and simple. I don't think we need to describe every detail about the type in its name, especially very common types like StringMap. If you want to know about how it is implemented, that is a good time to go look at the header (maybe we could use some more comments here? Although a typedef to absl::flat_hash_map seems clear enough that it is a hash table I guess). I recognize that the other types in the family are much newer and therefore less widely used, but I think it makes sense to keep the naming consistent there.

Comment by Charlie Swanson [ 10/Jun/22 ]

mathias@mongodb.com thanks for the commentary. I can understand the criticisms you present, the names would get very long. I am somewhat confused by your comment though. The last paragraph seems to imply to me that you do not want to bother renaming these at all? But the first part indicates maybe you're still in favor of a clarification here, but you think the word "unordered" is too long? So maybe would propose something like String(Data)?HashSet and String(Data)?HashMap?

Comment by Mathias Stearn [ 09/Jun/22 ]

Personally, I'd rather we not make the names of String(Data)?(Set|Map) worse just to match the naming of the stdlib. The stdlib's use of unordered is the result of a lot of historical bad decisions, including that some implementations already had something with the name std::hash_map as a non-portable extension. Note that even absl which backs our stdx::unordered_foo types uses (node|flat)hash(map|set) rather than unordered, even though it aims for std compatability.

I think the best way to think of StringMap<V> and friends is that they should be the default type you use when you want to establish a mapping between a string or StringData and some V. You should consider using it first and only use another type if you need different guarantees (eg, collation-aware comparisons, node-based invalidation rules, or a well-defined iteration order).

Comment by Sebastien Mendez [ 07/Jun/22 ]

charlie.swanson@mongodb.com: Would you like to do this? Meanwhile, I'm putting it in Backlog

Comment by Charlie Swanson [ 03/Jun/22 ]

meta note: I'm not sure who owns this code now... Definitely could have the wrong backlog. I am also happy to do this if I could get someone considered to be a maintainer to volunteer to review.

Generated at Thu Feb 08 06:07:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.